I have 8000 JSON files, each containing multiple tweets and their metadata. The image shows the structure of one tweet, out of many, in one of the 8000 files.
I want to load these 8000 files (path: C:\Users\sienv\Documents\Twitter_Data\Twitter_data_airport_2023) in Python using VS Code and then preprocess them so that they can be used for frequency analysis and topic modeling.
Can you help me with the loading and preprocessing in an efficient way?
Thank you in advance!
I tried several times to load the data using the path and a loop over the folder, but it never worked: either my DataFrame stayed empty or the script kept running all night.
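For reference, here is a minimal sketch of the kind of loading and preprocessing described above. It is not a definitive solution and rests on assumptions: each file is taken to be a JSON array of tweet objects (or a single object), the tweet text is assumed to live under a key called `text`, and the helper names `iter_tweets`, `tokenize`, and `load_tokens` are made up for illustration. Rows are collected in a plain list rather than appended to a DataFrame inside the loop, since repeated appends are a common cause of very long run times.

```python
import json
import re
from collections import Counter
from pathlib import Path

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"\w+")  # splits on punctuation, so "don't" becomes "don", "t"


def iter_tweets(folder):
    """Yield one tweet dict at a time from every *.json file in folder."""
    for path in sorted(Path(folder).glob("*.json")):
        try:
            with path.open(encoding="utf-8") as fh:
                data = json.load(fh)
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            # Report and skip unreadable files instead of aborting the whole run.
            print(f"skipping {path.name}: {exc}")
            continue
        # Assumption: a file holds a list of tweets or a single tweet object.
        tweets = data if isinstance(data, list) else [data]
        for tweet in tweets:
            if isinstance(tweet, dict):
                yield tweet


def tokenize(text):
    """Lowercase the text, drop URLs and @mentions, and return word tokens."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return TOKEN_RE.findall(text)


def load_tokens(folder, text_key="text"):
    """Collect the raw text and tokens of every tweet into a list of dicts."""
    rows = []
    for tweet in iter_tweets(folder):
        text = tweet.get(text_key)  # text_key is an assumed field name
        if isinstance(text, str):
            rows.append({"text": text, "tokens": tokenize(text)})
    return rows
```

With something like this, `rows = load_tokens(folder)` (where `folder` is the data directory, written as a raw string such as `r"..."` so Windows backslashes are not treated as escapes) could be passed to `pandas.DataFrame(rows)` once, after the loop. A first frequency count would then be `Counter(tok for r in rows for tok in r["tokens"])`. Stop-word removal and lemmatization for topic modeling are left out here.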