
Tag Archive for regression, classification, categorical-data, feature-selection

How to approach classification with many categorical features

I’m new to ML and would like to know more about classification. I have a small dataset of n=600 scored samples and thousands of potential metrics (features), all binary (True or False). Basically, I would like to find out which of these thousands of metrics best predict the known score, so that I can use them on a dataset whose scores are unknown. I’m also thinking of summing the True values of the good features into a single numerical metric that would show at a glance which samples are most likely to have high scores (assuming that a True value in each feature correlates with a higher score).
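To make the idea concrete, here is a minimal sketch of one way it could look, not a definitive method. It assumes the score is numeric and ranks each True/False feature by its Pearson correlation with the score (for a 0/1 feature this is the point-biserial correlation). Because thousands of features tested against only 600 samples will produce some strong-looking correlations by chance alone, the ranking is done on a training split and the summed metric is checked on held-out samples. The data is synthetic, and every name here (`pearson`, `rank_features`, `count_score`, `top_k`, the 400/200 split) is made up for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; with 0/1 values in xs this is the point-biserial r."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no information
    return sxy / sqrt(sxx * syy)


def rank_features(X, y):
    """Return (feature_index, correlation) pairs, strongest positive first."""
    scored = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        scored.append((j, pearson(column, y)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def count_score(X, selected):
    """Sum the True (1) values of the selected features for each sample."""
    return [sum(row[j] for j in selected) for row in X]


# Synthetic stand-in for the real data (an assumption for this sketch):
# 600 samples, 300 binary features, of which only the first 10 affect the score.
rng = random.Random(0)
n_samples, n_features, n_informative = 600, 300, 10
X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [sum(row[:n_informative]) + rng.gauss(0, 1) for row in X]

# Rank features on one part of the data only, then evaluate on the rest.
train_X, test_X = X[:400], X[400:]
train_y, test_y = y[:400], y[400:]

ranking = rank_features(train_X, train_y)
top_k = 10
selected = [j for j, _ in ranking[:top_k]]

# How well does the summed metric track the score on samples it has not seen?
held_out_r = pearson(count_score(test_X, selected), test_y)
print("selected features:", sorted(selected))
print("held-out correlation of the summed metric:", round(held_out_r, 3))
```

If the held-out correlation is much lower than the correlation on the training split, the selected features are probably fitting noise. Sorting by the signed correlation matches the assumption in the question that True goes with a higher score; features where True goes with a lower score would need to be ranked by absolute value and subtracted from the sum instead.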