I requested a data download from Instagram and I chose the JSON format. However, when I got the file and unzipped it, every non-ASCII character was represented as a Unicode escape sequence. E.g.:
"sender_name": "Le\u00c3\u00b3 Tak\u00c3\u00a1cs"
The correct text would be: "sender_name": "Leó Takács"
I tried parsing the JSON file with Python and correcting the errors, but instead of getting "ó" for "\u00c3\u00b3", I got "Ã³". Whatever I tried, each escape was decoded as a separate character. The same thing happens with emojis, so hardcoding a replacement for every problematic character would be a headache. I would prefer a programmatic solution, but at this point any idea is welcome, including third-party software.
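To make the behaviour concrete, here is a minimal sketch that reproduces it with the sample line above. It assumes (based only on that sample, not on any Instagram documentation) that the export writes each UTF-8 byte as its own `\u00XX` escape. Under that assumption, re-encoding the parsed string as Latin-1 and decoding it as UTF-8 should recover the intended text. The helper name `fix_mojibake` is hypothetical.

```python
import json

# Sample line from the export, wrapped in braces so it is valid JSON.
raw = r'{"sender_name": "Le\u00c3\u00b3 Tak\u00c3\u00a1cs"}'


def fix_mojibake(text: str) -> str:
    """Treat each code point (U+0000..U+00FF) as one raw byte, then decode as UTF-8.

    Assumption: the string holds UTF-8 bytes that were escaped one by one.
    Raises UnicodeEncodeError if the text contains code points above U+00FF.
    """
    return text.encode("latin-1").decode("utf-8")


parsed = json.loads(raw)
garbled = parsed["sender_name"]   # each escape decoded as a separate character
repaired = fix_mojibake(garbled)  # bytes C3 B3 and C3 A1 reinterpreted as UTF-8

print(garbled)
print(repaired)
```

If the assumption holds, the same round trip should also cover emojis, since they are just longer UTF-8 byte sequences. Strings that already contain correctly decoded characters outside Latin-1 would raise an error instead of being silently changed.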