Kaplan-Meier Survival Plot – with at risk table

Credit for the bulk of this code goes to Abhijit Dasgupta and the commenters on the original post here from earlier this year. I have made a few changes to the functionality which I think warrant sharing.

A brief intro: this function takes the output of a survival analysis fitted in R with ‘survfit’ from the ‘survival’ package and plots a survival curve, with the option to include a table of the numbers ‘at risk’ below the plot.
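As a rough illustration of the idea (this is not the function from the post, only a minimal sketch), the numbers at risk can be pulled from a ‘survfit’ object with summary() and written under the plot with mtext(). The ‘lung’ data set, the time points, the margin sizes and the label offset are all assumptions chosen for the example and will need tuning for other data.

```r
library(survival)

# 'lung' ships with the survival package; used here only as example data.
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Time points at which to report the numbers at risk (arbitrary choice).
risk_times <- seq(0, 1000, by = 250)
risk <- summary(fit, times = risk_times, extend = TRUE)

# One row per stratum, one column per time point.
risk_table <- tapply(risk$n.risk, list(risk$strata, risk$time), sum)

# Enlarge the bottom margin so the table fits under the x-axis label.
op <- par(mar = c(5 + nrow(risk_table) + 1, 6, 2, 2))
plot(fit, lty = seq_len(nrow(risk_table)),
     xlab = "Days", ylab = "Survival probability")
mtext("Number at risk", side = 1, line = 5, adj = 0)

# Position for the row labels, a little left of the first time point.
label_at <- min(risk_times) - 0.08 * diff(range(risk_times))
for (i in seq_len(nrow(risk_table))) {
  # Counts are aligned with their time points on the x-axis.
  mtext(risk_table[i, ], side = 1, line = 5 + i, at = risk_times)
  mtext(rownames(risk_table)[i], side = 1, line = 5 + i,
        at = label_at, adj = 1)
}
par(op)
```

Because the counts are placed with ‘at = risk_times’, they stay lined up with the x-axis even if the plot is resized.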


Latent class mixed models – with graphics

UPDATE – May 2014

The site I was originally using to store the script and data has since closed, and I’ve had a few requests for it. So, the code is now available on github here: https://raw.githubusercontent.com/nzcoops/blog_code/master/2011-10-02_lcmm_post_upload

And the .Rdata file can be downloaded below.
NOTE: you will need to change the extension from .doc to .Rdata.



Apologies that the spacing and alignment of the output aren’t displaying well. I am working on fixing that!


Generally, when we have a set of data, we have known groupings, be that three treatment groups, two sex groups, four ethnicity groups etc. There is also the possibility of unobserved groupings within your data. Some examples (I’m clutching at straws here) are vegetarians and non-vegetarians, those who regularly exercise and those who don’t, or those who have a family history of a certain condition and those who don’t (assuming those data were not collected). Latent class mixed models are one approach to looking into this.


Graphing – margins, titles, mtext, workspace

This is a great post, and very true: not enough of R’s graphics are displayed online well enough to really see how to achieve what the often ambiguous ‘help’ information suggests.


I find “mtext("lol", outer=TRUE)” particularly useful (it requires “oma=c(2,2,2,2)” or similar to be set through par()).
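A minimal sketch of that combination, with made-up data and titles: ‘oma’ reserves an outer margin around the whole figure, and ‘outer = TRUE’ makes mtext() write into it rather than into the margin of the last panel.

```r
# 2 x 2 grid of panels with 2 lines of outer margin on every side.
op <- par(mfrow = c(2, 2), oma = c(2, 2, 2, 2), mar = c(4, 4, 2, 1))
oma_used <- par("oma")  # kept only so the setting can be inspected later

for (i in 1:4) {
  plot(rnorm(20), main = paste("Panel", i))
}

# outer = TRUE places the text in the outer margin, spanning all panels.
mtext("Overall title", side = 3, outer = TRUE, line = 0.5, cex = 1.2)
mtext("Shared footnote", side = 1, outer = TRUE, line = 0.5)

par(op)  # restore the previous graphics settings
```

Without the ‘oma’ setting the outer margin has zero width, so the outer text would fall off the edge of the device.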

http://addictedtor.free.fr/graphiques/ This site goes some of the way there, but I’ve found getting at the actual code rather clumsy.

Child health metrics

In the analysis of child health data, z-scores or percentile groupings are generally used, as children’s growth is not linear.

The CDC (Centers for Disease Control and Prevention) has released tables of data for calculating these z-scores and percentiles, and here are some scripts for R to calculate these in your sample.

Update – Nov 2013
The site I was hosting the code on has closed; the scripts are now available on my GitHub.

Note: these are reasonably straightforward, but please check the code. Age can be a ‘tricky one’, given rounding, cut points etc., so you’re encouraged to double-check your results.
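For orientation, the CDC growth chart tables are built on the LMS method, where each age and sex has a Box-Cox power (L), a median (M) and a coefficient of variation (S). The sketch below shows that calculation only; it is not the script from my GitHub, the function name ‘lms_zscore’ is made up for this example, and the L, M and S values are invented rather than taken from the CDC tables.

```r
# z-score from LMS parameters:
#   z = ((x / M)^L - 1) / (L * S)   when L is not 0
#   z = log(x / M) / S              when L is 0
lms_zscore <- function(x, L, M, S) {
  ifelse(abs(L) < 1e-8,
         log(x / M) / S,
         ((x / M)^L - 1) / (L * S))
}

# Invented reference values, for illustration only.
z <- lms_zscore(x = 16.5, L = -1.5, M = 15.5, S = 0.08)

# Convert the z-score to a percentile via the standard normal distribution.
percentile <- pnorm(z) * 100
```

In practice L, M and S would be looked up (or interpolated) by age and sex before calling the function, which is where the rounding and cut-point issues mentioned above come in.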

Mixed Models – Part 1

Very brief. I have been exploring mixed models in R using nlme::lme. I am looking forward to understanding them more; I’ve no doubt they’re going to be used more and more in years to come.

Here are some very rough scripts for diagnostics when running simple two-level models, i.e. models with one grouping variable.

Update – Nov 2013
The site I was hosting the code on has closed; the scripts are now available on my GitHub.

To use simply run:
where ‘my.model’ is the output of an ‘lme’.

This code is derived largely from code prepared by Andrew Robinson, whose guide icebreakeR is freely available and a highly recommended read for both R beginneRs and experienced R users looking to dabble in mixed models for the first time (http://www.ms.unimelb.edu.au/~andrewpr/r-users/). I take no credit for this code.
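To give a flavour of what such diagnostics look like, here is a separate minimal sketch (not the scripts referred to above) for a random-intercept model. The ‘Orthodont’ data set from nlme and the choice of four plots are assumptions made for the example.

```r
library(nlme)

# Example two-level model: repeated measurements grouped by Subject.
my.model <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

res <- resid(my.model, type = "pearson")   # standardised within-group residuals
fit_vals <- fitted(my.model)               # fitted values at the Subject level
re <- ranef(my.model)[["(Intercept)"]]     # predicted random intercepts

op <- par(mfrow = c(2, 2))

# 1. Residuals vs fitted: look for trends or a funnel shape.
plot(fit_vals, res, xlab = "Fitted", ylab = "Pearson residuals")
abline(h = 0, lty = 2)

# 2. Normality of the within-group residuals.
qqnorm(res, main = "Residuals")
qqline(res)

# 3. Normality of the random intercepts.
qqnorm(re, main = "Random intercepts")
qqline(re)

# 4. Observed vs fitted: points should scatter around the 1:1 line.
plot(fit_vals, Orthodont$distance, xlab = "Fitted", ylab = "Observed")
abline(0, 1, lty = 2)

par(op)
```

With more than one random effect, the third panel would need repeating for each column of ranef().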

Automatic descriptive statistic tables

Here is some code I worked on a while back to make generating descriptive tables quicker. This was driven by constantly building data frames and calculating means or counts by sex or genotype, then having to completely restructure the code whenever a new level was introduced for a factor.

Update – Nov 2013
The site I was hosting the code on has closed; the scripts are now available on my GitHub.

Gallery with 3 examples of tables generated with one line of code, using the HTML export option.
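The underlying idea can be sketched in a few lines of base R. This is not the code on my GitHub: ‘describe_by’ is a hypothetical helper written for this example, it handles numeric variables only, and it leaves out the HTML export. The point is that the columns come from whatever levels the grouping factor has, so a new level needs no restructuring.

```r
# Mean (SD) of each variable in 'vars', one column per level of 'by'.
describe_by <- function(data, vars, by) {
  groups <- split(data[vars], data[[by]], drop = TRUE)
  cells <- lapply(groups, function(g) {
    vapply(g, function(x) {
      sprintf("%.1f (%.1f)", mean(x, na.rm = TRUE), sd(x, na.rm = TRUE))
    }, character(1))
  })
  tab <- do.call(cbind, cells)
  # Column headers carry the level and its sample size.
  n <- vapply(groups, nrow, integer(1))
  colnames(tab) <- paste0(by, " = ", names(groups), " (n = ", n, ")")
  tab
}

# Example with a built-in data set: one line of code per table.
tab <- describe_by(mtcars, vars = c("mpg", "hp"), by = "cyl")
print(tab, quote = FALSE)
```

Counts and percentages for categorical variables would need a second branch inside the inner function.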