I have a dataframe of 400k OpenAlex author IDs, and I want to obtain the name and surname associated with each ID. Currently, I pass one author ID at a time to the Authors() class and take only display_name from the response.
However, this approach seems to be quite slow and inefficient.
Is there a faster way to obtain the names and surnames? As I understand it, the script in its current form has to make 400k separate queries. Unfortunately, the documentation wasn’t of much help. Link to the README.
Current implementation:
import pyalex
from pyalex import Authors

pyalex.config.email = "[email protected]"
pyalex.config.max_retries = 3
pyalex.config.retry_backoff_factor = 0.1

# One API request per author ID (400k requests in total)
for row in range(len(data["openalex_id"])):
    name = Authors()[data.loc[row, "openalex_id"]]["display_name"]
    data.loc[row, "Name_Surname"] = name
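One direction that might reduce the number of requests is to ask for many authors per call instead of one. The sketch below assumes that the OpenAlex filter syntax accepts several IDs joined with `|` (an OR filter) and that pyalex forwards `Authors().filter(openalex=...)` to it unchanged; the batch size of 50 is a conservative guess, not a documented limit I have verified. The helper names (`short_id`, `chunked`, `fetch_batch_with_pyalex`, `fetch_display_names`) are hypothetical and not part of pyalex.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def short_id(openalex_id: str) -> str:
    """Reduce 'https://openalex.org/A123' (or 'A123') to 'A123'."""
    return openalex_id.rstrip("/").rsplit("/", 1)[-1]


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_batch_with_pyalex(ids: List[str]) -> List[dict]:
    """Fetch one batch of authors in a single request.

    Assumption: the API treats 'A1|A2|A3' as an OR filter on the
    OpenAlex ID. per_page is raised so one page holds the whole batch.
    """
    from pyalex import Authors  # imported lazily; needs pyalex installed

    return Authors().filter(openalex="|".join(ids)).get(per_page=len(ids))


def fetch_display_names(
    ids: Iterable[str],
    fetch_batch: Callable[[List[str]], List[dict]] = fetch_batch_with_pyalex,
    batch_size: int = 50,
) -> Dict[str, str]:
    """Map short author IDs to display names, one request per batch."""
    # Deduplicate while keeping order, so repeated IDs cost nothing extra.
    unique_ids = list(dict.fromkeys(short_id(i) for i in ids))
    names: Dict[str, str] = {}
    for batch in chunked(unique_ids, batch_size):
        for author in fetch_batch(batch):
            names[short_id(author["id"])] = author["display_name"]
    return names
```

With the dataframe from the question, this could be used as `names = fetch_display_names(data["openalex_id"])` followed by `data["Name_Surname"] = data["openalex_id"].map(short_id).map(names)`. Under these assumptions, 400k IDs would need roughly 8,000 requests instead of 400,000. IDs that the API does not return would end up as NaN in the new column.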