I’m analyzing a corpus of documents and I noticed that there are four instances of tokens that look identical but are recognized as different. Today I imported the dataset into another program and it highlighted what looked like an empty space before the word:
I then tried to copy/paste this from Gephi to Excel and a weird dot was displayed before the term:
I tried to copy/paste the text into text-compare.com to identify the character, and into my JupyterLab notebook to fix the words. Whenever I do that, the odd character disappears and is not picked up, so I’m unable to select the terms that I’m trying to correct. Any idea how to handle this?
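
Since the character seems to get lost on copy/paste, one thing that might work is inspecting the tokens directly in the notebook after loading them from the original file, without going through the clipboard. Below is a minimal sketch of that idea using Python's standard `unicodedata` module. The function names `describe_chars` and `strip_invisible` are hypothetical, and the guess that the culprit is a format or control character (such as a zero-width space or a byte order mark) or a non-breaking space is only an assumption:

```python
import unicodedata


def describe_chars(token):
    """Return (code point, Unicode name) for every character in the token."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in token
    ]


def strip_invisible(token):
    """Drop format (Cf) and control (Cc) characters, then trim whitespace.

    Assumption: the tokens are single words, so removing these
    categories does not destroy meaningful content.
    """
    kept = "".join(
        ch for ch in token
        if unicodedata.category(ch) not in {"Cf", "Cc"}
    )
    # A non-breaking space is not in Cf/Cc, so normalize it to a plain
    # space first; strip() then trims it from the edges.
    return kept.replace("\u00a0", " ").strip()


# Hypothetical example: a token with a zero-width space in front of it.
suspect = "\u200bword"
print(describe_chars(suspect))
print(strip_invisible(suspect) == "word")
```

Running `describe_chars` on one of the four odd tokens should at least show which code point is hiding in front of the word. Note that the `Cf` category also contains joiners that are meaningful in some scripts, so the cleanup step may be too aggressive for non-Latin text.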