I'm working on a customer scoring problem: predicting conversion and producing a conversion probability score (currently with an XGBoost classifier). There's a feature I want to introduce, but I'm having a hard time formulating its definition.
Specifically, I know that when an event A has happened recently (e.g., the customer phones our office), it is an indicator that the customer is interested in our product and might convert. To capture this, I created a recency feature that is basically (today - event date), in days.
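For concreteness, here is a minimal sketch of the feature as I currently compute it. The function name `recency_days` and the dates are made up for illustration, and `today` is fixed so the numbers are reproducible.

```python
from datetime import date

def recency_days(event_date: date, today: date) -> int:
    """Days elapsed between event A and the day the feature is computed."""
    return (today - event_date).days

# Hypothetical "today" and event A dates for three customers.
today = date(2024, 6, 1)
event_dates = [date(2024, 5, 28), date(2024, 3, 1), date(2023, 6, 1)]

recency = [recency_days(d, today) for d in event_dates]
print(recency)  # [4, 92, 366]
```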
The problem is that this does not work for older customer records. For example, a customer might have called us a year ago (event A triggered) and converted soon thereafter. Because the formula measures from today, the recency value for that record is large (roughly 365 days), even though the call was very recent at the time the customer converted. I want the model to learn that low recency values translate to a higher conversion probability.
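To make the mismatch concrete, here is a small worked example with made-up dates for one historical customer. The variable names are only illustrative.

```python
from datetime import date

# Made-up dates for one historical customer.
event_a_date = date(2023, 6, 1)      # customer phoned the office
conversion_date = date(2023, 6, 10)  # converted shortly after the call
today = date(2024, 6, 1)             # day the feature is computed

# Recency as currently defined: measured from today.
recency_as_defined = (today - event_a_date).days

# How long ago the call was at the moment the customer converted.
gap_at_conversion = (conversion_date - event_a_date).days

print(recency_as_defined, gap_at_conversion)  # 366 9
```

So this positive example ends up in the training data with a recency of 366 days, although the call was only 9 days old when the conversion happened.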
Are there any good ways to engineer the feature to capture this relationship?