I am working on an analytics task for my job to match control units to treated units to monitor the effectiveness of a new marketing initiative. We decided to use a propensity score matching method but neither my partner nor I had previous experience with this.
We chose the MatchIt package in R, which takes a formula of the form
test ~ x + y + z + ...
but we are observing near-perfect separation: when we plot the propensity scores, the treated units all have scores very close to 1, and all matched and potential controls have scores very close to 0. Below is a jitter plot of the propensity scores for the units.
Jitter plot of the propensity scores for controls and test units. All control units have propensity scores very close to 0 while all test units have propensity scores very close to 1.
I cannot find any sources online that describe the same problem, so I am wondering whether it is more appropriate to either
- use a formula with fewer predictors to get more variation in the propensity scores, or
- leave the formula as is.
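To make the two options above concrete, here is a minimal sketch on simulated data (not our real data). The variable names `test`, `x`, `y`, `z` are hypothetical placeholders, and the assumption that a single covariate (`z`) almost determines treatment is ours for illustration only. The propensity model is fitted with base R `glm`; the MatchIt call at the end runs only if the package is installed.

```r
# Simulated data; all names (test, x, y, z) are hypothetical placeholders.
set.seed(1)
n <- 500
d <- data.frame(x = rnorm(n), y = rnorm(n))
d$test <- rbinom(n, 1, 0.3)

# Assumption for illustration: z is nearly a proxy for treatment assignment.
d$z <- d$test * 5 + rnorm(n, sd = 0.5)

# Option 2: keep the full formula. glm warns about fitted probabilities of
# 0 or 1 here, which is itself a sign of separation; we silence it for the demo.
ps_full <- suppressWarnings(
  glm(test ~ x + y + z, data = d, family = binomial())
)
d$ps_full <- fitted(ps_full)

# Option 1: drop the predictor that separates the groups.
ps_reduced <- glm(test ~ x + y, data = d, family = binomial())
d$ps_reduced <- fitted(ps_reduced)

# Range of estimated propensity scores within each group (0 = control, 1 = treated).
print(tapply(d$ps_full, d$test, range))
print(tapply(d$ps_reduced, d$test, range))

# The same full formula through MatchIt, if it is available.
if (requireNamespace("MatchIt", quietly = TRUE)) {
  m_out <- suppressWarnings(
    MatchIt::matchit(test ~ x + y + z, data = d,
                     method = "nearest", distance = "glm")
  )
  plot(m_out, type = "jitter", interactive = FALSE)
}
```

With the full formula the scores of the two groups do not overlap at all, which matches what we see in our plot; with the reduced formula they overlap. The sketch only reproduces the pattern; it does not tell us which formula is the appropriate one.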
Additionally, if anyone can recommend a resource that covers propensity score matching methods in more depth, we would really appreciate it.