Hello fellow statistics lovers!
I’m working on a survival analysis in R and need to handle missing values in my dataset. I’m considering the CoxMI function from the SurvMI package for multiple imputation, but I’m unsure how to transform my data with uc_data_transform, particularly what to supply for the probabilities parameter. My dataset has explanatory variables, a time-to-event variable, and an event indicator for each individual. When I computed Kaplan-Meier estimates, the number of observations used differed from the number in the original dataset.
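To show the kind of count mismatch I mean, here is a minimal sketch that uses the lung data shipped with the survival package as a stand-in for my own data (the dataset and its column names are assumptions, not my real variables). It compares the number of rows in the data frame with the number the Kaplan-Meier fit actually uses. In this stand-in, the gap comes from rows with a missing value in a model variable, which are dropped by default.

```r
library(survival)

# Stand-in data: survival::lung, NOT my real dataset.
dat <- survival::lung

# Kaplan-Meier curves stratified by a covariate that contains missing values.
km <- survfit(Surv(time, status) ~ ph.ecog, data = dat)

n_total    <- nrow(dat)   # rows in the original data
n_used     <- sum(km$n)   # rows the fit actually used, summed over strata
n_complete <- sum(complete.cases(dat[, c("time", "status", "ph.ecog")]))

# Missing values per column, to see where rows might be lost.
na_counts <- colSums(is.na(dat))

print(c(total = n_total, used = n_used, complete = n_complete))
print(na_counts)
```

If the used count matches the complete-case count, the discrepancy is probably just default row deletion for missing values rather than a problem in the data.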
Additionally, I’m confused about the concept of ‘long data’ and why each time point needs to be in long format. Currently, my data frame is in the usual layout: one row per individual, with the variables in columns.
In short, I’d like help with four things: how to transform the data for imputation, why the number of observations differs from what I expected, what ‘long data’ means here, and how to pool the imputed datasets for the subsequent analysis. Any explanations or pointers to relevant resources would be much appreciated.
Previously, I tried imputation with the mice function from the mice package, which seemed to work. However, I struggled to pool the results once I had multiple datasets with different imputations, although I expected that step to work. My ultimate goal is to use the imputed data for a Cox model analysis.
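For reference, this is roughly the mice workflow I have in mind, written as a hedged sketch rather than my actual code. It again uses survival::lung as a stand-in, and the covariates in the model formula are assumptions chosen only because some of them contain missing values. As far as I understand, the intended pattern is to fit the model within each imputed dataset using with() and then combine the fits using pool(), rather than pooling the datasets themselves.

```r
library(survival)
library(mice)

# Stand-in data with missing covariates: survival::lung, NOT my real dataset.
dat <- survival::lung[, c("time", "status", "age", "sex", "ph.karno", "wt.loss")]

# Create m = 5 imputed datasets (mice's default for numeric columns is
# predictive mean matching).
imp <- mice(dat, m = 5, seed = 123, printFlag = FALSE)

# Fit the same Cox model in each imputed dataset.
fits <- with(imp, coxph(Surv(time, status) ~ age + sex + ph.karno + wt.loss))

# Combine the per-dataset estimates using Rubin's rules.
pooled <- pool(fits)
pooled_summary <- summary(pooled)
print(pooled_summary)
```

The estimate column is on the log hazard scale, so exp(pooled_summary$estimate) should give the pooled hazard ratios.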