In a Python notebook, I am trying to convert text from an MS Word document to plain text that is JSON compliant. Some of the text contains apostrophes, but Word stores them as the Unicode character U+2019 (Right Single Quotation Mark) rather than the ASCII apostrophe. Since the text is in a pandas DataFrame, I used str.replace, but it didn’t work.
Investigating further, ord("’") (with the U+2019 character pasted from Word) returned 8217, not 2019. Why does this happen? How can I replace U+2019 with the standard apostrophe (')?
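For reference, here is a minimal sketch of the two points in question, using only the standard library. It relies on the fact that the "2019" in U+2019 is hexadecimal while ord() reports the code point in decimal, so 0x2019 and 8217 are the same number. The names RIGHT_SINGLE_QUOTE, QUOTE_MAP, and normalize_quotes are hypothetical, chosen only for illustration, and the sample string is made up.

```python
# The "2019" in U+2019 is hexadecimal; ord() returns the same code point in decimal.
RIGHT_SINGLE_QUOTE = "\u2019"  # escape form of the character pasted from Word

code_point = ord(RIGHT_SINGLE_QUOTE)
print(code_point)       # 8217
print(hex(code_point))  # 0x2019

# Replacing the character in a plain Python string.
# str.replace returns a new string; the original is left unchanged.
sample = "It didn\u2019t work"
cleaned = sample.replace(RIGHT_SINGLE_QUOTE, "'")
print(cleaned)          # It didn't work

# Hypothetical helper: map the common Word "smart quotes" to ASCII in one pass.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def normalize_quotes(text: str) -> str:
    """Return text with curly quotes replaced by their ASCII equivalents."""
    return text.translate(QUOTE_MAP)
```

For a DataFrame, assuming a hypothetical string column named text, the equivalent would likely be df["text"] = df["text"].str.replace("\u2019", "'", regex=False). The result has to be assigned back because the call returns a new Series; a missing assignment is one possible reason a replacement appears not to work, though that is a guess without seeing the original code.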