I scraped a massive (10M+ entries) Twitter dataset using academictwitteR. As I prepare to do some network analysis, I’ve come up against an issue: when a tweet is a reply to another user, the dataset identifies that user only by user ID, not by username (see mock-up below). What I am trying to do across this dataset is a conditional replace, whereby the user ID in the “in response to” column is replaced by the corresponding username.
Current database
ID_column | Username | In_response_to
ID12345 | JohnA | NA
ID54321 | JaneB | ID12345
ID51243 | MarkE | ID54321
Desired outcome
ID_column | Username | In_response_to
ID12345 | JohnA | NA
ID54321 | JaneB | JohnA
ID51243 | MarkE | JaneB
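The replacement described above amounts to looking up each In_response_to value in ID_column and taking the Username from the matching row. Below is a minimal sketch of that lookup using base R's `match()`. It assumes the column names from the mock-up and a hypothetical data frame name `tweets`, and it has not been tried at the scale of the full dataset.

```r
# Mock-up data; `tweets` is a hypothetical name, columns follow the mock-up above
tweets <- data.frame(
  ID_column      = c("ID12345", "ID54321", "ID51243"),
  Username       = c("JohnA", "JaneB", "MarkE"),
  In_response_to = c(NA, "ID12345", "ID54321"),
  stringsAsFactors = FALSE
)

# Row position of each In_response_to ID within ID_column
# (NA where the tweet is not a reply, or the replied-to user is not in the data)
idx <- match(tweets$In_response_to, tweets$ID_column)

# Swap in the username where a match exists; otherwise keep the original value
tweets$In_response_to <- ifelse(
  is.na(idx),
  tweets$In_response_to,
  tweets$Username[idx]
)

print(tweets)
```

If the same user ID appears on many rows, `match()` uses the first matching row, which is fine as long as each ID maps to a single username. IDs with no match in ID_column are left unchanged rather than set to NA.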
I have looked extensively through SO and other forums for solutions, but I haven’t managed to find one. Being relatively new to R, I am sure the answer will be staring me in the face…