I have a large spreadsheet with over 18,000 comments that I have web-scraped from various social media platforms and forums.
I want to categorise these comments in the spreadsheet. For example, comments mentioning “delivery” should be placed in a category called “Delivery.” Additionally, I want to further categorise these comments into subcategories, such as “Delivery” -> “Delivery Issues” -> “Item Not Received”.
Since some comments may fall into multiple categories, I would like to duplicate those comments, assigning each duplicate to a different relevant category (e.g., a comment related to both “Delivery” and “Customer Service” should appear in both categories).
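To make the desired output concrete, here is a minimal sketch of a keyword-based version, using only the standard library. The rule table, the `comment` key, the level names, and the `categorise` helper are all made-up assumptions for illustration, not a real taxonomy or an established method. Each comment is copied once per matching rule, so a comment that mentions two topics ends up in both categories.

```python
# Hypothetical rule table: keyword -> (category, subcategory, detail).
# Keywords and labels are illustrative assumptions only.
RULES = {
    "not received": ("Delivery", "Delivery Issues", "Item Not Received"),
    "delivery": ("Delivery", "General", ""),
    "customer service": ("Customer Service", "General", ""),
}
LEVELS = ("category", "subcategory", "detail")
UNMATCHED = ("Uncategorised", "", "")


def categorise(rows, text_key="comment"):
    """Return one output row per (comment, matching rule) pair."""
    out = []
    for row in rows:
        # Plain case-insensitive substring matching; it will miss
        # misspellings and synonyms.
        text = str(row.get(text_key, "")).lower()
        hits = [labels for keyword, labels in RULES.items() if keyword in text]
        # A comment with several hits is duplicated, once per hit;
        # a comment with no hits is kept once as "Uncategorised".
        for labels in hits or [UNMATCHED]:
            out.append({**row, **dict(zip(LEVELS, labels))})
    return out
```

If the spreadsheet is loaded with pandas, something like `pd.read_excel(path).to_dict("records")` could supply `rows`, and `pd.DataFrame(categorise(rows))` would turn the result back into a table (the path and column name would need adjusting to the actual file).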
Is there a systematic way to achieve this using Python code, rather than manually going through all 18,000 comments?
Any suggestions would be appreciated.
Thanks
I am a bit stuck on how to do this. I have tried classifying the comments with machine learning, but the results on this text weren't very accurate.