Kaplan-Meier Survival plot – with at risk table, by sub groups

This is a follow-on from the previous post, with updated code.

The ggplot(…) line of the code used a ‘groups’ argument that used to work but no longer does with the updated version of R/ggplot2 (I don’t know the full ins and outs of this, sorry). The code is now fixed and available below.

Most up-to-date code available here: http://pastebin.com/FjkWnCWm

Code originally included in this post: http://pastebin.com/yRmuZtPG

34 thoughts on “Kaplan-Meier Survival plot – with at risk table, by sub groups”

  1. Pingback: Kaplan-Meier Survival Plot – with at risk table « Matt's Stats n stuff

    • Antonio, I’d be happy to help when I have time. But you’ve given me absolutely nothing to go off… Error message? Are you using your data or just the example from this and my last post?

      • Hi, I am using my data. I have recently updated R. I now have R 2.15.0 (2012-03-30) for Mac, and when I run the code I get this error message:
        Error in ggkm(fit, timeby = 6, ystratalabs = c("Calcium salts", "Sevelamer"), :
        could not find function "rbind.fill"
        Thanks for all you can tell me.

      • Hi Antonio. Hmm, I can only assume it’s saying that there is a package missing. Can you try


        And then re-run?


      • Okay. Well I can see from your strata labels that you are using it on your own data. It would be good if you could use the example I used in the post to see if that runs. Then we would know if it was the function on your system that isn’t working or just an anomaly between your data and the function.

      • The error “could not find function "rbind.fill"” can be fixed by adding the following lines under “libraries” in the ggkm function:

        # libraries #

        require(reshape) #rbind.fill function is in this package.
        require(plyr) #additional dependency.
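Note: as a rough illustration of what the missing function does, here is a minimal sketch. It assumes the plyr package is installed; in current releases rbind.fill() is exported by plyr. The data frames and their column names are hypothetical and are not taken from ggkm.

```r
library(plyr)  # rbind.fill() is exported by plyr

# Two hypothetical data frames whose columns only partly overlap.
at_risk_a <- data.frame(time = c(0, 6), n.risk = c(20, 15))
at_risk_b <- data.frame(time = 12, n.risk = 9, strata = "Sevelamer")

# rbind.fill() stacks the rows and fills columns missing from an input with NA.
combined <- rbind.fill(at_risk_a, at_risk_b)
print(combined)
```

If the call fails with “could not find function”, the package is either not installed or not loaded in the session where ggkm runs.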


  2. Hey,

    Your function works fine in R 2.14.0 (2011-10-31) now. Unfortunately, it won’t print a p-value anymore, even if I set pval = TRUE

    Also, you might consider using pastebin (http://pastebin.com/) to upload your code; it makes things a bit easier imo.

    Thanks for the help!

  3. Thanks Petra, I have removed the stray p, a hangover from fixing the error. The p-value side of it still works for me; however, when doing the subgroups the p-value that is used isn’t appropriate, and I’m not sure when I’ll get to look at this. Have migrated to pastebin.


  4. Hello Matt,

    first of all, great job with your ggkm function. I was searching for a nice function to plot (good looking) survivor curves and your function just fits in.

    I have a question about setting the ystratalabs when not given to the function.

    if(is.null(ystratalabs)) ystratalabs <- as.character(levels(summary(sfit)$strata)[subs1])

    If one of the strata has no events (only censored observations), it does not show up in "summary(sfit)". As a consequence, ystratalabs has the wrong length. I got an error when trying the function:

    Error in `levels<-.factor`(`*tmp*`, value = "group=G2") :
    number of levels differs

    Maybe I did something wrong? Is it possible to set "ystratalabs" to "names(sfit$strata)", or is this the wrong order?

    Best regards,

    P.S.: Here is a minimal example producing the error above:

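    library(survival)  # provides Surv() and survfit()
    # G1 (the first 20 rows) has only censored observations, i.e. no events
    surv.table <- data.frame(time=runif(40, min=0, max=12),
                             status=c(rep(0,20), sample(c(0,1), 20, replace=TRUE)),
                             group=rep(c("G1", "G2"), each=20))
    sfit <- survfit(Surv(time, status)~group, data=surv.table)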

    • Thanks for the post. Sorry for the delay, I didn’t get an email about this until a pingback came through, oddly.

      I’ve amended the code at the line you mentioned, just adding a sub command to strip out the ‘group=’ from the front. Give it a go. I appreciate a quick reproducible example too! Good luck.
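Note: the amended ggkm line itself is not reproduced in this thread, so the following is only a sketch of the idea under stated assumptions. It takes the labels from names(sfit$strata), as the commenter suggested, and strips the ‘group=’ prefix with sub(). The data and the regular expression are hypothetical.

```r
library(survival)

# Hypothetical data: G1 has only censored rows (status 0), G2 has events.
surv.table <- data.frame(
  time   = c(2, 4, 6, 8, 1, 3, 5, 7),
  status = c(0, 0, 0, 0, 1, 0, 1, 1),
  group  = rep(c("G1", "G2"), each = 4)
)
sfit <- survfit(Surv(time, status) ~ group, data = surv.table)

# names(sfit$strata) has one entry per stratum, whether or not it has events.
strata_names <- names(sfit$strata)

# Strip everything up to and including the first "=" to leave the bare label.
ystratalabs <- sub("^[^=]*=", "", strata_names)
print(ystratalabs)
```

The entries of names(sfit$strata) follow the order of the grouping variable's levels, so the labels line up with the strata as survfit stores them.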

      • Hello Matt,

        now I have to apologize for the late answer. I wasn’t around for some weeks. I tested your new code and it works. Thank you very much! Excellent work!


  5. I got an error message as below. Any idea?! Thank you!

    > ggkm(fit1)
    Error: ggplot2 doesn’t know how to deal with data of class function

  6. Pingback: Forest plots in R (ggplot) with side table | Matt's Stats n stuff

  7. Hey Matt,

    Can you help me, please? I’m still new to R and I want to change the font characteristics in the Kaplan-Meier plot using ggkm (font type, size, etc.). Where can I do this?


  8. Hi Matt

    This is a terrific-looking Kaplan-Meier plot.

    One question: could you advise how to color the survival functions, in other words, have each survival function appear in a different color? I’ve been trying quite a bit without success.

    Thank you


  9. Hi! Thanks for the wonderful plots. Maybe I haven’t seen them, but I can’t find a way to add censoring symbols to the curves. Also, how does one get rid of the grid? Sorry if the question is stupid, I did my RTFM but couldn’t come up with a solution!


  10. I am also interested in adding censoring tick-marks to the curves. Haven’t been able to figure out how to add them in. Would welcome any tips!

  11. Hi there,

    I’ve found the ggkm and ggkmTable functions to be awesome and super easy to use. Thanks for creating them!

    In the basic plot of a survfit object (“plot(sfit)”), one can specify “fun='event'” in order to get a “reverse” Kaplan-Meier plot where the probability of the event starts at 0 on the far left side of the plot, rather than 1 as in a standard KM plot. This is useful when modeling things like “time to remission.” Is there a way to do that with the ggkm function?
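Note: whether ggkm has an option for this is not stated anywhere in the thread. The base-graphics behaviour described above, and the transformation behind it (1 minus the survival estimate), can be sketched as follows. The data frame and its column names are hypothetical.

```r
library(survival)

# Hypothetical time-to-remission data (status 1 = remission observed).
remission <- data.frame(
  time   = c(3, 5, 8, 12, 2, 6, 9, 14),
  status = c(1, 1, 0, 1, 1, 0, 1, 1),
  arm    = rep(c("A", "B"), each = 4)
)
sfit <- survfit(Surv(time, status) ~ arm, data = remission)

# Base graphics: fun = "event" plots 1 - S(t), so the curves rise from 0.
plot(sfit, fun = "event", xlab = "Time", ylab = "Probability of event")

# The same quantity as a data frame, e.g. as input to a ggplot2-based plot.
event_df <- data.frame(
  time   = sfit$time,
  event  = 1 - sfit$surv,
  strata = rep(names(sfit$strata), times = sfit$strata)
)
print(event_df)
```

A function that builds its plotting data from sfit$surv could, in principle, apply the same 1 - surv step before plotting.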


  12. I’d love to use ggkm, but when doing so I get a long list of inscrutable errors.

    > data(colon)
    > fit ggkm(fit, timeby=500)
    The following `from` values were not present in `x`: col, color, pch, cex, lty, lwd, srt, adj, bg, fg, min, max
    The following `from` values were not present in `x`: col, color, pch, cex, lty, lwd, srt, adj, bg, fg, min, max … and then repeated ad infinitum.

    Traceback gives this not very useful information:
    Error in plyr:::split_indices(seq_len(nrow(data)), scale_id, n) :
    unused argument (n)
    13 plyr:::split_indices(seq_len(nrow(data)), scale_id, n)
    12 scale_apply(layer_data, x_vars, scale_train, SCALE_X, panel$x_scales)
    11 train_position(panel, data, scale_x(), scale_y())
    10 ggplot_build(x)
    9 ggplot_gtable(ggplot_build(x))
    8 with(x$layout, paste(name, t, l, sep = "-"))
    7 gtable_gList(x)
    6 gtable_gTree(ggplot_gtable(ggplot_build(x)))
    5 ggplotGrob(grobs[[ii.table]])
    4 arrangeGrob(..., as.table = as.table, clip = clip, main = main,
    sub = sub, left = left, legend = legend)
    3 grid.draw(arrangeGrob(..., as.table = as.table, clip = clip,
    main = main, sub = sub, left = left, legend = legend))
    2 grid.arrange(p, blank.pic, data.table, clip = FALSE, nrow = 3,
    ncol = 1, heights = unit(c(2, 0.1, 0.25), c("null", "null",
    1 ggkm(fit, timeby = 500)


    Al Z.

  13. Hi,
    Thanks for the great code, but I do have some problems running it.
    I am using RStudio Version 0.99.896 – © 2009-2016 RStudio, Inc. on a MacBook.

    I ran the script and tried to enter the command, but got an error message stating:
    "could not find function "opts""

    and unfortunately no plot.

    My command looks like this:

    fit <- survfit(Surv(perisafe_v1$days_to_first_event,perisafe_v1$has_first_event)~perisafe_v1$asa, data=perisafe_v1)
    ggkm(fit, timeby=100, ystratalabs=c("ASA1","ASA2","ASA3", "ASA4"))

    The name of the dataset is "perisafe_v1"
    The time variable is: "perisafe_v1$days_to_first_event"
    The status variable is: "perisafe_v1$has_first_event"
    The factor is: "perisafe_v1$asa"

    I am using it on my own data set, and since I can't find the data set you are referring to in the script, I can't see if it is a problem with my system.

    Do you have any idea how to solve the problem?

    • Hey, I don’t have time to fully troubleshoot this one, sorry.
      But I can suggest having a look through the code for ‘opts’ and replacing it with ‘theme’. Can you try that and then check the errors? For background, ggplot2 used to use ‘opts’ for customising a bunch of plot features; this was replaced by ‘theme’ for consistency about a year ago. Hopefully that’s enough to get you going.
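Note: a minimal sketch of the opts-to-theme change, assuming a reasonably recent ggplot2 is installed. The data and the specific settings are hypothetical and are not taken from ggkm. One caveat: a title that used to be set through opts(title = ...) is not a theme setting, and is set with ggtitle() or labs(title = ...) instead.

```r
library(ggplot2)

# Hypothetical step-curve data standing in for a survival curve.
km <- data.frame(time = c(0, 2, 5, 9), surv = c(1, 0.8, 0.6, 0.4))

p <- ggplot(km, aes(x = time, y = surv)) +
  geom_step() +
  # Old style (no longer available): opts(legend.position = "bottom")
  theme(legend.position = "bottom") +
  # Old style: opts(title = "..."); titles now go through ggtitle() or labs().
  ggtitle("Survival (illustrative data)")
```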

      • Thanks,
        I just replaced all “opts” with theme.
        When I ran the command:
        ggkm(fit, timeby=100, ystratalabs=c("ASA1","ASA2","ASA3", "ASA4"))

        I got the error:

        Error in theme(axis.title.x = theme_text(vjust = 0.5)) :
        could not find function "theme_text"

        Naturally, I understand if you don’t have the time; I just wanted to ask.

        If you can recommend another code for a similar Kaplan Meier plot, I would also be interested in that instead.

        Thank you.
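Note: the thread ends without a reply to this error. In the ggplot2 release that replaced opts() with theme(), the theme_*() element functions were also renamed to element_*(), so theme_text() became element_text() and theme_blank() became element_blank(). A minimal sketch of the failing call rewritten that way, with hypothetical data; the remaining theme_* calls inside ggkm would presumably need the same change.

```r
library(ggplot2)

# Hypothetical step-curve data standing in for a survival curve.
km <- data.frame(time = c(0, 2, 5, 9), surv = c(1, 0.8, 0.6, 0.4))

p <- ggplot(km, aes(x = time, y = surv)) +
  geom_step() +
  # Was: theme(axis.title.x = theme_text(vjust = 0.5))
  theme(
    axis.title.x = element_text(vjust = 0.5),
    panel.grid   = element_blank()  # was theme_blank(); removes the grid lines
  )
```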

