When I join two tables and compute the difference in days between two dates using difftime (neither date column has any missing values), some rows come out as NA while others are fine. I noticed two things:
- The NAs seem to be clustered by patient ID, so I am not sure what is going on.
- If I run difftime separately, outside the data.table call, it produces every difference with no NA.
I expected no NA in the result, since neither date variable contains NA in the data table.
library(data.table)
## note: rdate() is not part of base R or data.table; it is assumed to be defined or loaded elsewhere

## generate data 1: one row per patient
pat_characteristics <- function(n_group){
  patid <- seq_len(n_group)
  birth_date <- rdate(n_group, min = as.Date('01/01/1940', "%d/%m/%Y"),
                      max = as.Date('31/12/2000', "%d/%m/%Y"))
  sex <- sample(c("F","M"), n_group, replace = TRUE)
  edu <- sample(c("AA","B","M","PHD","MD"), n_group, replace = TRUE)
  f_born <- sample(c("US-born","f-born", "Others"), n_group, replace = TRUE)
  state <- sample(state.abb, n_group, replace = TRUE)
  pat1 <- data.table(patid, birth_date, sex, edu, f_born, state)
  return(pat1)
}

## generate data 2: n_each_group encounters per patient
encounter <- function(n_group, n_each_group){
  patid <- rep(seq_len(n_group), each = n_each_group)
  enc_date <- rdate(n_group * n_each_group, min = as.Date('01/01/2014', "%d/%m/%Y"),
                    max = as.Date('31/12/2020', "%d/%m/%Y"))
  ICD_code <- sample(1:100, n_group * n_each_group, replace = TRUE)
  pat2 <- data.table(patid, enc_date, ICD_code)
  return(pat2)
}

set.seed(1)
pat1 <- pat_characteristics(10)
enc1 <- encounter(10, 5)

## join encounters to patients born after 1970, then compute age at encounter
d <- enc1[pat1[birth_date > as.Date("01/01/1970", "%d/%m/%Y")], on = "patid"
          ][sex == "F", age := enc_date - birth_date]

## the same difference computed separately from the data.table call
difftime(d$enc_date, d$birth_date)
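
One way to narrow this down is to tabulate where the NAs fall. The sketch below uses hypothetical toy data rather than the generated tables, and the names `toy`, `na_by_patient` and `full_diff` are illustrative only. It relies on the fact that `:=` combined with a row filter in `i` assigns only to the matching rows, so the new column is left as NA everywhere else. If `sex` is constant within a patient, that alone would make the NAs cluster by `patid`.

```r
library(data.table)

# Hypothetical toy data standing in for the joined table `d`
toy <- data.table(
  patid      = c(1L, 1L, 2L, 2L),
  sex        = c("F", "F", "M", "M"),
  birth_date = as.Date(c("1980-05-01", "1980-05-01", "1975-03-15", "1975-03-15")),
  enc_date   = as.Date(c("2015-01-10", "2016-06-20", "2014-02-02", "2019-09-09"))
)

# Same pattern as in the question: := is applied only to rows where sex == "F"
toy[sex == "F", age := enc_date - birth_date]

# Count missing ages per patient and sex to see how the NAs cluster
na_by_patient <- toy[, .(n_na = sum(is.na(age))), by = .(patid, sex)]
print(na_by_patient)

# difftime on the full columns, with no row filter involved
full_diff <- difftime(toy$enc_date, toy$birth_date, units = "days")
print(anyNA(full_diff))
```

Running the same tabulation on `d` (for example `d[, .(n_na = sum(is.na(age))), by = .(patid, sex)]`) should show whether the missing values line up with the rows excluded by `sex == "F"`.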