Tag Archive for python, nlp, text-classification, data-preprocessing, recipe

Singularizing (multi-term) ingredients from cooking recipes

I am working on a recipe classification system and am struggling with preprocessing my data. The data comes from Food.com, and I need to make sure that all ingredients are in singular form to reduce the number of unique ingredients. However, I am not sure how to do that or which library to use. It is also tricky because some ingredient names are very long, e.g. “del monte crushed tomatoes with mild green chilies”, or include brand names, e.g. “baker’s angel flake sweetened coconut”.
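
To make the goal concrete, here is a minimal sketch of one possible approach: split each ingredient name into tokens, singularize each token, and join them back together. It uses only the standard library and a few naive English suffix rules. The function names (`singularize_word`, `singularize_ingredient`) and the exception tables (`IRREGULAR`, `UNCHANGED`) are hypothetical and chosen for illustration; they are not taken from any library. A dedicated package such as `inflect` (its `singular_noun` method) or a lemmatizer such as NLTK's `WordNetLemmatizer` would likely handle more cases, so treat this as a starting point rather than a finished solution.

```python
# Hypothetical exception tables; extend them for your own data.
IRREGULAR = {
    "chilies": "chili",
    "cookies": "cookie",
    "leaves": "leaf",
    "loaves": "loaf",
    "halves": "half",
}
# Words that end in "s" but are not plurals.
UNCHANGED = {"molasses"}


def singularize_word(word):
    """Return a lowercased, naively singularized form of one token."""
    lower = word.lower()
    if lower in IRREGULAR:
        return IRREGULAR[lower]
    # Leave very short tokens and known non-plurals alone.
    if lower in UNCHANGED or len(lower) <= 3:
        return lower
    # Possessives such as "baker's" are not plurals.
    if lower.endswith(("'s", "’s")):
        return lower
    if lower.endswith("ies"):
        return lower[:-3] + "y"      # berries -> berry
    if lower.endswith("oes"):
        return lower[:-2]            # tomatoes -> tomato
    if lower.endswith(("ches", "shes", "sses", "xes")):
        return lower[:-2]            # peaches -> peach
    if lower.endswith("s") and not lower.endswith(("ss", "us", "is")):
        return lower[:-1]            # eggs -> egg
    return lower


def singularize_ingredient(name):
    """Singularize every whitespace-separated token of an ingredient name."""
    return " ".join(singularize_word(token) for token in name.split())


print(singularize_ingredient("del monte crushed tomatoes with mild green chilies"))
print(singularize_ingredient("baker’s angel flake sweetened coconut"))
```

Singularizing every token, rather than only the last one, is a deliberate simplification: in long names the plural noun can appear in the middle ("tomatoes with ...") as well as at the end. The trade-off is that suffix rules will mangle some words, so checking the output against a sample of the real ingredient list is advisable.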