I have a dataset with a postal code column. The codes carry some significance, and I want to use them as a feature. I'm in the preprocessing phase and have not yet decided which algorithm I'll use.
I need suggestions on the best way to use the zip code column as a feature.
Thanks in advance!
I found that one-hot encoding is an option, but my data has a large number of unique zip codes (3513 different ones), so one-hot (or dummy) encoding is impractical because of the high dimensionality it introduces.
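To make the problem concrete, here is a minimal sketch of what one-hot encoding does to a column like mine. The zip codes are synthetic (randomly generated 5-digit strings, not my real data), and the names `zipcodes`, `index`, and `one_hot` are only for illustration. It uses plain Python so it runs without any extra libraries.

```python
import random

random.seed(0)

# Synthetic stand-in for my column: 3513 distinct 5-digit codes (made up).
unique_codes = [f"{n:05d}" for n in random.sample(range(100000), 3513)]
# Every code appears at least once, plus some random repeats.
zipcodes = unique_codes + random.choices(unique_codes, k=1487)

# One-hot encoding needs one output column per distinct code.
categories = sorted(set(zipcodes))
index = {code: i for i, code in enumerate(categories)}


def one_hot(code):
    """Encode a single zip code as a 0/1 vector with one slot per category."""
    row = [0] * len(index)
    row[index[code]] = 1
    return row


first_row = one_hot(zipcodes[0])
print(len(zipcodes), "rows ->", len(first_row), "columns per row")
print("non-zero entries in one row:", sum(first_row))
```

Each row turns into a vector of 3513 entries, only one of which is non-zero, which is the dimensionality I would like to avoid.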