I hope you are doing well.
I am currently working with a dataset that contains three classes, which make up roughly 15%, 31%, and 52% of the samples.
I would like to know whether I need to balance the number of samples per class for my sentiment analysis project.
One thought I had is that the imbalance may be deliberate: positive sentences might be easier to classify, so the dataset contains fewer positive labels, while neutral comments are harder to identify, so the dataset should contain more of them.
For this reason, I am unsure whether I should equalize the number of samples per class or keep the original distribution.
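To make the two options concrete, here is a minimal sketch of what I mean, in plain Python. The labels are toy placeholders sized like the percentages above, not the real data, and the class names are made up. `class_shares` and `undersample_to_balance` are hypothetical helper names, and random undersampling is only one possible way to equalize the classes.

```python
import random
from collections import Counter


def class_shares(labels):
    """Return each class's share of the dataset as a fraction."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def undersample_to_balance(labels, seed=0):
    """Return sorted indices of a subset in which every class has as many
    samples as the rarest class (random undersampling)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, cls in enumerate(labels):
        by_class.setdefault(cls, []).append(idx)
    smallest = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        # Randomly keep `smallest` samples from each class.
        kept.extend(rng.sample(idxs, smallest))
    return sorted(kept)


# Toy labels (assumption): placeholder class names, counts mimic 15/31/52.
labels = ["class_a"] * 15 + ["class_b"] * 31 + ["class_c"] * 52

# Option 1: keep the original distribution.
original_shares = class_shares(labels)

# Option 2: equalize the classes by discarding samples from the larger ones.
balanced_idx = undersample_to_balance(labels)
balanced_labels = [labels[i] for i in balanced_idx]
balanced_counts = Counter(balanced_labels)
```

With the second option, every class is reduced to the size of the smallest one, so a large part of the data is thrown away. That is part of why I am not sure it is the right choice.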
My dataset is available at the following link: https://huggingface.co/datasets/Khedesh/MirasOpinion
Could you please advise which class distribution I should use when training on this dataset?