I am working on a scientific medical paper that uses the HADS score to assess patients' anxiety and depression.
The score consists of 14 items divided into two subscales (HADS-D, HADS-A) of 7 items each, with possible values of 0-3 points per item. I have missing data and want to replace them. According to the score manual, I have to drop an observation if more than one item is missing in a subscale. If only one item is missing per subscale, I can replace the missing item with the mean of the six items that are present.
I have the HADS items for each observation stored in the following variables:
- subscale HADS-A (adding up to the subscale total hads_anx_score). Variables: hads_tense_rec, hads_glad_rec, hads_omen_rec, hads_laugh_rec, hads_trouble_rec, hads_happy_rec, hads_relax_rec
- subscale HADS-D (adding up to the subscale total hads_depr_score). Variables: hads_limited_rec, hads_scary_rec, hads_looks_rec, hads_restless_rec, hads_future_rec, hads_panic_rec, hads_enjoy_rec
I broke the code down into the following steps:
1. Initialize the subscale scores: create variables for the subscales HADS-D and HADS-A.
2. Identify missing values: for each item, I created a new variable prefixed with is_missing_ to flag whether the item is missing.
3. Count missing items: I used egen with rowtotal to count the number of missing items in each subscale.
4. Drop observations: I dropped any observation where more than one item was missing in either subscale.
5. Replace the missing items in each subscale: if an item is missing, it is replaced with the mean of the other six items in the subscale.
6. Calculate total scores: sum the item scores in each subscale to get the final scores.
PROBLEM: Somehow my code does not replace the missing items in each subscale with the loop I created in Step 5; it leaves the missing data in place (== .).
* STEP 1: Initialize the HADS-A and HADS-D subscale scores
gen hads_anx_score = .
gen hads_depr_score = .
* STEP 2: Loop over each item and flag missing values
foreach var in hads_tense_rec hads_glad_rec hads_omen_rec hads_laugh_rec hads_trouble_rec hads_happy_rec hads_relax_rec hads_limited_rec hads_scary_rec hads_looks_rec hads_restless_rec hads_future_rec hads_panic_rec hads_enjoy_rec {
    gen is_missing_`var' = missing(`var')
}
* STEP 3: Calculate the number of missing items per subscale
egen missing_hads_anx = rowtotal(is_missing_hads_tense_rec is_missing_hads_glad_rec is_missing_hads_omen_rec is_missing_hads_laugh_rec is_missing_hads_trouble_rec is_missing_hads_happy_rec is_missing_hads_relax_rec)
egen missing_hads_depr = rowtotal(is_missing_hads_limited_rec is_missing_hads_scary_rec is_missing_hads_looks_rec is_missing_hads_restless_rec is_missing_hads_future_rec is_missing_hads_panic_rec is_missing_hads_enjoy_rec)
* STEP 4: Drop observations with more than one missing item in any subscale
drop if missing_hads_anx > 1 | missing_hads_depr > 1
* STEP 5: Replace single missing items with the mean of the present six items
foreach var in hads_tense_rec hads_glad_rec hads_omen_rec hads_laugh_rec hads_trouble_rec hads_happy_rec hads_relax_rec {
    qui replace `var' = (hads_tense_rec + hads_glad_rec + hads_omen_rec + hads_laugh_rec + hads_trouble_rec + hads_happy_rec + hads_relax_rec - `var') / 6 if is_missing_`var' == 1 & missing_hads_anx == 1
}
foreach var in hads_limited_rec hads_scary_rec hads_looks_rec hads_restless_rec hads_future_rec hads_panic_rec hads_enjoy_rec {
    qui replace `var' = (hads_limited_rec + hads_scary_rec + hads_looks_rec + hads_restless_rec + hads_future_rec + hads_panic_rec + hads_enjoy_rec - `var') / 6 if is_missing_`var' == 1 & missing_hads_depr == 1
}
Now, when I run the **fifth step**, there are still missing values (for example hads_limited_rec == .).
Thanks for any suggestions!
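
One possible explanation for the behaviour described above: in Stata, an arithmetic expression that involves a missing value evaluates to missing. In Step 5 the sum of all seven items therefore is missing whenever one item is missing, and subtracting the missing item does not undo that, so `replace` writes a missing value back. The following is a minimal sketch of one way around this, assuming the variables `missing_hads_anx` and `missing_hads_depr` from Steps 3 and 4 already exist. It uses `egen` with `rowmean()`, which ignores missing values, so with exactly one missing item it returns the mean of the other six. The helper names `mean_anx`, `mean_depr`, `total_anx` and `total_depr` are made up for illustration, and the item grouping is taken unchanged from the question.

```stata
* Item lists per subscale, grouped exactly as in the question.
* Locals only live for one run, so execute this block in one go.
local anx  hads_tense_rec hads_glad_rec hads_omen_rec hads_laugh_rec hads_trouble_rec hads_happy_rec hads_relax_rec
local depr hads_limited_rec hads_scary_rec hads_looks_rec hads_restless_rec hads_future_rec hads_panic_rec hads_enjoy_rec

* rowmean() skips missing values: with one missing item this is the
* mean of the six observed items (hypothetical helper variables).
egen double mean_anx  = rowmean(`anx')
egen double mean_depr = rowmean(`depr')

* STEP 5 (sketch): fill the single missing item with that mean
foreach var of local anx {
    replace `var' = mean_anx if missing(`var') & missing_hads_anx == 1
}
foreach var of local depr {
    replace `var' = mean_depr if missing(`var') & missing_hads_depr == 1
}

* STEP 6 (sketch): the score variables already exist from Step 1,
* so compute the row totals separately and copy them in with replace.
egen double total_anx  = rowtotal(`anx')
egen double total_depr = rowtotal(`depr')
replace hads_anx_score  = total_anx
replace hads_depr_score = total_depr

drop mean_anx mean_depr total_anx total_depr
```

Whether the imputed mean should be rounded to a whole number before summing is not stated in the question, so the sketch leaves it unrounded.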