I’m a complete beginner and new to web scraping.
I’m trying to scrape data from this website’s calendar, https://www.nycforfree.co/events, going as far back as February 2023, and save it to a CSV file.
I need the date, location, and name of each event.
Where do I start?
I’ve tried different variations of the code below, adjusting the class names etc. based on inspecting the HTML in the browser.
<code>import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://www.nycforfree.co/events/valentines-day'
response = requests.get(url)
response.raise_for_status()  # stop early on an HTTP error

soup = BeautifulSoup(response.content, 'html.parser')
# Class names below come from inspecting the page in the browser
calendar_data = soup.find_all('tr', class_='yui3-calendar-row')

calendar_entries = []
for entry in calendar_data:
    date = entry.find('span', class_='date').text
    event_name = entry.find('span', class_='event-name').text
    location = entry.find('span', class_='location').text
    calendar_entries.append([date, event_name, location])

df = pd.DataFrame(calendar_entries, columns=['Date', 'Event Name', 'Location'])
df.to_csv('calendar_data.csv', index=False)
</code>
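One thing I am unsure about is whether these selectors match anything in the HTML that `requests` actually receives, since `entry.find(...)` returns `None` when nothing matches and `.text` then raises an `AttributeError`. Below is a minimal sketch of how that could be checked without touching the network. The helper names `text_or_none` and `extract_entries` and the `SAMPLE_HTML` markup are hypothetical, made up for illustration, and are not taken from the site.

```python
from bs4 import BeautifulSoup


def text_or_none(parent, tag, class_name):
    """Return the stripped text of the first matching child, or None if absent."""
    node = parent.find(tag, class_=class_name)
    return node.get_text(strip=True) if node is not None else None


def extract_entries(html):
    """Collect [date, event_name, location] rows from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find_all("tr", class_="yui3-calendar-row")
    entries = []
    for row in rows:
        # A missing span becomes None instead of raising AttributeError.
        entries.append([
            text_or_none(row, "span", "date"),
            text_or_none(row, "span", "event-name"),
            text_or_none(row, "span", "location"),
        ])
    return entries


# Hypothetical sample markup for illustration only; not copied from the site.
SAMPLE_HTML = """
<table>
  <tr class="yui3-calendar-row">
    <td><span class="date">Feb 14</span><span class="event-name">Sample Event</span></td>
  </tr>
</table>
"""

if __name__ == "__main__":
    entries = extract_entries(SAMPLE_HTML)
    print(len(entries), "rows matched")
    print(entries)
```

Passing `response.text` to `extract_entries` in place of `SAMPLE_HTML` would show how many rows match on the real page. An empty list would suggest that the class names are not present in the downloaded HTML.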