I have a dataset that contains 750,000 rows. For each row, I want to reverse-geocode the latitude and longitude to get its postcode.
Problem:
The code runs quickly when I query around 100 rows, but querying around 10,000 rows takes hours.
Question(s):
Is there any way I can speed up the querying so the code executes faster?
OR
Is there a different approach I can take that gives quick and accurate results?
My Code:
from geopy.geocoders import Nominatim

postcodes = []
geolocator = Nominatim(user_agent="app_name")
for index, row in new_df.iterrows():
    # Reverse-geocode the pickup coordinates (one network request per row)
    row_location = geolocator.reverse(
        f"{row['Pickup_latitude']}, {row['Pickup_longitude']}", timeout=10
    )
    # Fall back to '00000' when the result has no postcode
    postcode = row_location.raw['address'].get('postcode', '00000')
    postcodes.append(postcode)
new_df['start_postcode'] = postcodes
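For context on the timing, here is a back-of-envelope sketch (assuming the public Nominatim endpoint's usage policy of roughly one request per second, which is my understanding of why the per-row loop scales so badly):

```python
rows = 10_000
seconds_per_request = 1.0  # assumed Nominatim public rate limit: ~1 request/second
hours = rows * seconds_per_request / 3600
print(f"{rows} rows take about {hours:.1f} hours")  # roughly 2.8 hours
```

At that rate the full 750,000-row dataset would take over a week, so the bottleneck appears to be the one-request-per-row network round trip rather than the pandas loop itself.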