I am currently working on my master's thesis. I want to run a sentiment analysis on tweets directed at airlines about their sustainability initiatives. I wanted to use the X API, but Elon Musk thought it was necessary to charge $100 per month to access the data. Some guys from MIT found a way around this paywall:
- https://medium.com/@vladkens/how-to-still-scrape-millions-of-tweets-in-2023-using-twscrape-97f5d3881434
- https://github.com/vladkens/twscrape/tree/main
The thing is that I am not familiar with Python. I normally use R for all my data analyses, so I don't know what I have to do. I need a CSV file with the tweet text and the date so that I can run my analysis in R.
I think I've altered the code so that I only get the tweets and the date, but I can't seem to get any output. The script stops after logging into the accounts that I created for this. Earlier I did get some output, but unfortunately I wasn't able to export it to a CSV file.
import asyncio
from twscrape import API, gather
from twscrape.logger import set_log_level

async def main():
    api = API()  # or API("path-to.db") - default is `accounts.db`

    # ADD ACCOUNTS (for CLI usage see BELOW)
    await api.pool.add_account("pevers123", "2051Lb57", "[email protected]", "2051Lb57")
    await api.pool.add_account("maxevers123", "2051Lb57", "[email protected]", "2051Lb57")
    await api.pool.login_all()

    # API USAGE
    # search (latest tab)
    await gather(api.search("easyjet eco-friendly environment", limit=20))  # list[Tweet]
    # change search tab (product), can be: Top, Latest (default), Media
    await gather(api.search("easyjet eco-friendly environment", limit=20, kv={"product": "Top"}))

    # NOTE 1: gather is a helper function to receive all data as list, FOR can be used as well:
    async for tweet in api.search("easyjet eco-friendly environment"):
        print(tweet.user.username, tweet.rawContent, tweet.date)  # tweet is `Tweet` object

if __name__ == "__main__":
    asyncio.run(main())
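
For reference, the export step I am after could be sketched roughly as below. This is only a sketch: it assumes each tweet exposes `rawContent` and `date` as in the print loop above, and that `date` is a `datetime`. The helper name `write_tweets_csv`, the output file name, and the stand-in sample tweets are hypothetical. They are there so the snippet runs without an account. It uses only Python's standard `csv` module.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone
from types import SimpleNamespace


def write_tweets_csv(tweets, path):
    """Write one row per tweet (date, text) to a CSV file and return the row count."""
    count = 0
    # newline="" lets the csv module handle line endings; utf-8 keeps emoji intact
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "text"])  # header row, so R picks up column names
        for tweet in tweets:
            # rawContent and date are the attributes printed in the loop above
            writer.writerow([tweet.date.isoformat(), tweet.rawContent])
            count += 1
    return count


# Hypothetical stand-in objects; in the real script this would be the list
# collected from the search, e.g. tweets = await gather(api.search(...)).
sample = [
    SimpleNamespace(
        date=datetime(2023, 11, 1, 12, 0, tzinfo=timezone.utc),
        rawContent="easyJet, greener flights?",
    ),
    SimpleNamespace(
        date=datetime(2023, 11, 2, 8, 30, tzinfo=timezone.utc),
        rawContent="Line one\nline two",
    ),
]

# Written to a temporary folder here; any writable path would do.
out_path = os.path.join(tempfile.mkdtemp(), "tweets.csv")
rows_written = write_tweets_csv(sample, out_path)
print(rows_written, "rows written to", out_path)
```

The resulting file should then be readable in R with `read.csv()`. Commas and line breaks inside a tweet are quoted by the `csv` module, so each tweet stays in a single row.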