I am working on a project for my resume: a playlist generator app. The app will take a playlist as input and return a new playlist of songs that fit it. For each song, my data will include the name, the lyrics, and the attributes extracted from the Spotify for Developers API. The dataset will have around 150K entries.
I am considering a K-NN (nearest-neighbor) approach for this unsupervised task, or another unsupervised algorithm, but I don’t know how I would evaluate the model afterward. Manually creating labeled test data is not feasible for me, and even if it were, there is no ground truth for what the "right" output is for a given playlist.
So when I finish, how will I know if my model is doing its job?
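For context, here is a minimal sketch of the nearest-neighbor idea I have in mind. It is only an illustration under assumptions: the song IDs and feature values are made-up toy data, `recommend` and `cosine` are hypothetical helper names, and averaging the input playlist into one vector and ranking by cosine similarity is just one possible choice, not something I have settled on.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(features, playlist, k=2):
    """Return the k songs (not already in the playlist) that are
    closest to the playlist's mean feature vector."""
    dims = len(next(iter(features.values())))
    # Represent the whole input playlist by the average of its songs.
    centroid = [
        sum(features[song][d] for song in playlist) / len(playlist)
        for d in range(dims)
    ]
    candidates = [song for song in features if song not in playlist]
    candidates.sort(key=lambda s: cosine(features[s], centroid), reverse=True)
    return candidates[:k]

# Toy data (made up): [energy, acousticness, danceability] per song.
features = {
    "song_a": [0.9, 0.1, 0.8],
    "song_b": [0.8, 0.2, 0.9],
    "song_c": [0.1, 0.9, 0.2],
    "song_d": [0.2, 0.8, 0.1],
    "song_e": [0.85, 0.15, 0.7],
}

print(recommend(features, ["song_a", "song_b"], k=1))
```

With real data the vectors would also need to include something derived from the lyrics, and at 150K entries I would probably swap the brute-force sort for a proper nearest-neighbor index. My question is about how to judge the output of something like this.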