R - Creating Repeated Start and End Dates


I have a data set with many variables. The ones of interest are: id, episode, start, end, and assessment date. An example data set is shown here:

 id episode     start         end  assessmentdate
  1       1  1/1/2012  12/21/2012        1/1/2012
  1       1  1/1/2010  12/21/2012      12/12/2012
  1       1  1/1/2010  12/21/2012      12/21/2012
  1       2  1/1/2013           .        1/2/2013
  1       2  1/1/2013           .        2/2/2013
  1       2  1/1/2013           .        3/2/2013
  2       1  1/1/2012           .        4/1/2012
  2       1  1/1/2010           .       5/12/2012
  2       1  1/1/2010           .       6/21/2012
  2       2  1/1/2013           .        7/2/2013
  2       2  1/1/2013           .        8/2/2013
  2       2  1/1/2013           .        9/2/2013

I have start dates for everyone, but not end dates. I want to identify the end date of each episode for each patient, across 10,000 patients. The end date should be the last assessment date within each episode number, and I want it to appear on every row between the first and last assessment dates of that episode.

I have been reading a bit about splitting the data set into many smaller parts based on id and episode, but I feel there should be a simpler way to do this. I'm new to R, coming from SAS, and this is not something that would give me trouble in SAS.

I would appreciate any input you may have regarding this data preparation.

You can find the maximum assessment date for each episode using ddply() from the plyr library:

df <- data.frame(id=1, episode=c(1,1,1,2,2,2),
                 assessmentdate=as.Date(c("2012-01-01", "2012-12-12", "2012-12-21",
                                          "2013-01-02", "2013-02-02", "2013-03-02")))

library(plyr)

# add the latest assessment date within each episode as the end date on every row
df <- ddply(df, .(episode), transform, end=max(assessmentdate))
df

which gives you:

  id episode assessmentdate        end
1  1       1     2012-01-01 2012-12-21
2  1       1     2012-12-12 2012-12-21
3  1       1     2012-12-21 2012-12-21
4  1       2     2013-01-02 2013-03-02
5  1       2     2013-02-02 2013-03-02
6  1       2     2013-03-02 2013-03-02

If you want this per patient, you can use ddply() with .(id) (assuming that id identifies patients) or something like that.

It's also possible with by(), but it becomes a bit more complicated because by() splits the data into list elements identified by the values of the grouping variable.
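For comparison, here is a minimal sketch of what the by() route might look like on the same toy data. The names pieces and df_by are only illustrative, and this is one possible way to do it rather than the only one; the point is that the per-episode pieces have to be stacked back together by hand.

```r
# Toy data for one patient, same values as in the ddply() example
df <- data.frame(
  id = 1,
  episode = c(1, 1, 1, 2, 2, 2),
  assessmentdate = as.Date(c("2012-01-01", "2012-12-12", "2012-12-21",
                             "2013-01-02", "2013-02-02", "2013-03-02"))
)

# by() applies the function to the rows of each episode and returns
# a list-like object with one data frame per episode
pieces <- by(df, df$episode, function(d) {
  d$end <- max(d$assessmentdate)  # latest assessment date in this episode
  d
})

# Stack the per-episode data frames back into a single data frame
df_by <- do.call(rbind, pieces)
df_by
```

Compared with ddply(), the extra do.call(rbind, ...) step is the "bit more complicated" part, and the row names of the result carry the group labels.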

Edit: Also, if episode is not unique over the entire data frame, i.e. it repeats for each patient, group by both variables, i.e. ddply(df, .(id, episode), ...).
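As a rough illustration of grouping by both variables, the sketch below uses the two patients from the question. It swaps in base R's ave() instead of ddply(), so it runs without plyr; the name df2 is illustrative, and the dates are assumed to be in month/day/year order.

```r
# Assessment dates for both patients from the question,
# assuming month/day/year format
df2 <- data.frame(
  id = rep(c(1, 2), each = 6),
  episode = rep(c(1, 1, 1, 2, 2, 2), times = 2),
  assessmentdate = as.Date(
    c("1/1/2012", "12/12/2012", "12/21/2012",
      "1/2/2013", "2/2/2013", "3/2/2013",
      "4/1/2012", "5/12/2012", "6/21/2012",
      "7/2/2013", "8/2/2013", "9/2/2013"),
    format = "%m/%d/%Y"
  )
)

# ave() computes max() within each id/episode combination and
# repeats the result on every row of that group
df2$end <- ave(df2$assessmentdate, df2$id, df2$episode, FUN = max)
df2
```

Grouping by episode alone would mix the two patients, so episode 1 of both would get the same end date. Grouping by id and episode keeps them apart.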

