Good day!
I have a dataset of vehicle configurations with the following fields:
- ID
- Brand
- Model
- Generation
- Engine Type
- Engine Capacity
- Number of cylinders
- Body type
- Gearbox type
- Engine code
- Year of manufacture
Task:
I need to measure how similar one configuration is to another.
My plan is to represent each entry in the dataset as a vector and compute the cosine similarity between the vectors.
However, I don't understand how to represent categorical values in numerical form, for example Body type: sedan, crossover, coupe, etc.
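To make the question concrete, here is a minimal sketch of the kind of encoding I am asking about. It assumes one-hot encoding for the categorical fields (one 0/1 slot per distinct value) and min-max scaling for the numeric ones. I am not sure this is the right approach. The sample records, the field names, and the helper names `build_encoder` and `cosine_similarity` are all made up for illustration and do not come from the real dataset.

```python
import math

# Hypothetical sample records; values are invented for illustration only.
configs = [
    {"brand": "Audi", "body_type": "sedan", "gearbox": "automatic",
     "engine_capacity": 2.0, "cylinders": 4},
    {"brand": "Audi", "body_type": "sedan", "gearbox": "manual",
     "engine_capacity": 2.0, "cylinders": 4},
    {"brand": "BMW", "body_type": "coupe", "gearbox": "manual",
     "engine_capacity": 3.0, "cylinders": 6},
]

# Assumed split of fields into categorical and numeric.
CATEGORICAL = ["brand", "body_type", "gearbox"]
NUMERIC = ["engine_capacity", "cylinders"]


def build_encoder(records):
    """Return a function that maps a record to a numeric vector."""
    # Distinct values of each categorical field, in a fixed (sorted) order.
    categories = {f: sorted({r[f] for r in records}) for f in CATEGORICAL}
    # Observed (min, max) of each numeric field, used for min-max scaling.
    ranges = {f: (min(r[f] for r in records), max(r[f] for r in records))
              for f in NUMERIC}

    def encode(record):
        vec = []
        # One-hot: one slot per known category value, 1.0 where it matches.
        for f in CATEGORICAL:
            vec.extend(1.0 if record[f] == v else 0.0 for v in categories[f])
        # Min-max scaling to [0, 1]; a constant field is encoded as 0.0.
        for f in NUMERIC:
            lo, hi = ranges[f]
            vec.append((record[f] - lo) / (hi - lo) if hi > lo else 0.0)
        return vec

    return encode


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Define similarity as 0.0 when either vector is all zeros.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


encode = build_encoder(configs)
vectors = [encode(c) for c in configs]

sim_01 = cosine_similarity(vectors[0], vectors[1])  # differ only in gearbox
sim_02 = cosine_similarity(vectors[0], vectors[2])  # differ in every field
print(sim_01, sim_02)
```

With this encoding the first two records, which differ only in gearbox, score higher than the first and third, which share nothing. One thing I am unsure about: a numeric field at its minimum is scaled to 0 and so contributes nothing to the cosine, and fields with many distinct values (such as Engine code) would produce very long, sparse vectors.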
Thanks for your help.