Small corpus, want to find associations. Word2Vec?
I’m a psychologist, and I’m diving into the field of AI. I could really use some help with a project. This semester, I discovered Word2Vec and was mesmerized by its ability to find associations between words, so I decided to try it on an artificial corpus of psychotherapy documents generated by ChatGPT. However, I have a practical issue with the size of the corpus: an analyst’s psychotherapy reports don’t exceed 10,000 words, and I know that cleaning the text typically cuts that by at least half.
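
To put a number on how much cleaning shrinks the text, here is a minimal sketch of how the token counts before and after cleaning could be measured. It uses only the Python standard library. The stop-word list, the function names, and the sample sentence are all made up for illustration, and "cleaning" is assumed here to mean lowercasing plus dropping stop words and one-letter tokens, which may differ from the actual pipeline.

```python
import re

# Hypothetical stop-word list, for illustration only; a real list would be much longer.
STOP_WORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "was",
    "that", "her", "his", "she", "he", "it", "with",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def clean(tokens):
    """Drop stop words and single-character tokens (assumed cleaning rule)."""
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


def corpus_stats(text):
    """Return token counts before and after cleaning, plus the vocabulary size."""
    raw = tokenize(text)
    kept = clean(raw)
    return {
        "raw_tokens": len(raw),
        "clean_tokens": len(kept),
        "vocabulary": len(set(kept)),
        "kept_ratio": len(kept) / len(raw) if raw else 0.0,
    }


# Made-up sample sentence, not taken from any real report.
sample = "The patient spoke of her mother and the anxiety that she felt in the session."
stats = corpus_stats(sample)
print(stats)
```

Running `corpus_stats` on the whole corpus instead of one sentence would show how many tokens and distinct words are actually left for Word2Vec to learn from.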