I'm trying to scrape a few fields for each of the 300 accelerators listed on a webpage. I can scrape the fields of the first accelerator, but I can't grab them for the rest because there is no uniform per-accelerator container that I can iterate over.
Please note that not every field is present for every accelerator, so pairing values up by index would attach the wrong field, or a field belonging to another accelerator.
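To make the index problem concrete, here is a hypothetical illustration of what goes wrong (it assumes soup has already been parsed as in my code below, and that at least one accelerator omits the 'Equity Taken:' line):

names = soup.select("h3")                                     # one heading per accelerator
equities = soup.select("li:-soup-contains('Equity Taken:')")  # shorter list when any entry omits the field
# zip() pairs the two lists purely by position, so as soon as one
# accelerator lacks an 'Equity Taken:' line, every later equity value
# gets printed next to the wrong accelerator name.
for name, equity in zip(names, equities):
    print(name.get_text(strip=True), equity.get_text(strip=True))

My current attempt is below.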
import requests
from bs4 import BeautifulSoup
link = 'https://www.failory.com/startups/united-states-accelerators-incubators'
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'
}
res = requests.get(link, headers=headers)
soup = BeautifulSoup(res.text, "html.parser")

# select_one() searches the whole page, so each of these only ever
# returns the first match, i.e. the fields of the first accelerator.
accelerator_name = soup.select_one("h3").get_text(strip=True)
equity_taken = soup.select_one("li:-soup-contains('Equity Taken:')").get_text(strip=True)
accelerator_duration = soup.select_one("li:-soup-contains('Accelerator Duration')").get_text(strip=True)
website = soup.select_one("p a[href]").get("href")

print(accelerator_name, equity_taken, accelerator_duration, website)
How can I scrape the fields from all the accelerators?
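The closest direction I can think of is grouping the page by its h3 headings. Below is a minimal sketch of that idea; it assumes that every h3 is an accelerator name and that each accelerator's paragraphs and lists sit as siblings between its h3 and the next one, which I have not verified against the page structure, so treat the selectors and the sibling-walking as assumptions rather than a working solution.

import requests
from bs4 import BeautifulSoup

link = 'https://www.failory.com/startups/united-states-accelerators-incubators'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'
}

res = requests.get(link, headers=headers)
soup = BeautifulSoup(res.text, "html.parser")

for heading in soup.select("h3"):
    # Assumption: every h3 on the page is an accelerator name.
    name = heading.get_text(strip=True)
    equity = duration = website = None

    # Assumption: everything between this h3 and the next h3 belongs to
    # this accelerator, so the sibling walk stops at the next heading.
    for sib in heading.find_next_siblings(True):
        if sib.name == "h3":
            break
        if equity is None:
            li = sib.select_one("li:-soup-contains('Equity Taken:')")
            if li:
                equity = li.get_text(strip=True)
        if duration is None:
            li = sib.select_one("li:-soup-contains('Accelerator Duration')")
            if li:
                duration = li.get_text(strip=True)
        if website is None and sib.name == "p":
            a = sib.find("a", href=True)
            if a:
                website = a["href"]

    # Missing fields simply stay None instead of pulling in a value
    # from a neighbouring accelerator.
    print(name, equity, duration, website)

If the h3 headings are wrapped in their own containers rather than sharing a parent with the field lists, the sibling walk above would find nothing, which is part of what I'm unsure about.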