This is the link to the website I am trying to scrape data from: https://www.fotmob.com/leagues/47/stats/season/20720/players/goals/premier-league.
I want to select the section with class = ‘css-653rx1-StatsContainer eozqs6r5’ using beautifulsoup4.
Before you mention find() and find_all(): I have used both, but for some reason it’s like the section tag doesn’t exist. When I tried section = soup.find('section', class_='css-653rx1-StatsContainer eozqs6r5') it returned None. When I tried section = soup.find_all('section', class_='css-653rx1-StatsContainer eozqs6r5') it returned an empty list.
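To rule out the call syntax itself, here is a minimal sketch of the same lookup run against a small hand-written HTML string. The snippet is an assumption standing in for the page, not FotMob's real markup. As far as I can tell, both the exact class string and a CSS selector find the section when the tag is actually present in the parsed HTML:

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the page markup (an assumption, not the real page).
SAMPLE_HTML = """
<div class="css-1yndnk3-LeagueSeasonStatsContainerCSS eozqs6r1">
  <section class="css-653rx1-StatsContainer eozqs6r5">
    <div>stats table placeholder</div>
  </section>
</div>
"""

TARGET_CLASS = "css-653rx1-StatsContainer eozqs6r5"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Exact match on the full class attribute string, as in my code.
by_find = soup.find("section", class_=TARGET_CLASS)

# CSS selector: matches both class names regardless of their order.
by_select = soup.select_one("section.css-653rx1-StatsContainer.eozqs6r5")

print(by_find is not None)
print(by_select is not None)
```

So the lookup itself seems fine on static HTML, which makes me think the section is simply not in what requests receives.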
I then traversed the DOM and was able to select every div leading up to the section. Once I tried to access the section itself, it returned None.
This is my code:
import requests
from bs4 import BeautifulSoup
import pandas as pd
# URL of the webpage you want to scrape
url = 'https://www.fotmob.com/leagues/47/stats/season/20720/players/goals/premier-league'
# Send HTTP request to the URL
response = requests.get(url)
# Parse the HTML content of the page
soup = BeautifulSoup(response.content, 'html.parser')
# Remove <style> tags
for style in soup.find_all('style'):
    style.decompose()
# Remove <script> tags
for script in soup.find_all('script'):
    script.decompose()
outer_main = soup.find('main', class_='css-1cyagd9-PageContainerStyles e19hkjx10')
section = soup.find_all('section', class_='css-653rx1-StatsContainer eozqs6r5')
for sec in section:
    print("T")
#print(soup.prettify())
# Print the HTML of the outer <main> for debugging
if outer_main:
    print("Outer <main> found.")
else:
    print("The outer <main> tag with the specified class was not found.")
# Navigate through the HTML structure to find the target div
main_div = outer_main.find('main') if outer_main else None  # Adjust this if nested <main> exists
if main_div:
    print("Found inner <main>.")
else:
    print("Inner <main> not found.")
div1 = main_div.find('div', class_='css-xxmbx0-LeagueSeasonStatsColumn eozqs6r0') if main_div else None
if div1:
    print("Found div1 with class 'css-xxmbx0-LeagueSeasonStatsColumn eozqs6r0'.")
else:
    print("div1 with class 'css-xxmbx0-LeagueSeasonStatsColumn eozqs6r0' not found.")
div1 = div1.find('div', class_='css-1wb2t24-CardCSS e1mlfzv61') if div1 else None
if div1:
    print("Found div1 with class 'css-1wb2t24-CardCSS e1mlfzv61'.")
else:
    print("div1 with class 'css-1wb2t24-CardCSS e1mlfzv61' not found.")
div1 = div1.find('div', class_='css-1yndnk3-LeagueSeasonStatsContainerCSS eozqs6r1') if div1 else None
if div1:
    print("Found div1 with class 'css-1yndnk3-LeagueSeasonStatsContainerCSS eozqs6r1'.")
else:
    print("div1 with class 'css-1yndnk3-LeagueSeasonStatsContainerCSS eozqs6r1' not found.")
div1 = div1.find('section', class_='css-653rx1-StatsContainer eozqs6r5') if div1 else None
if div1:
    print("Found section with class 'css-653rx1-StatsContainer eozqs6r5'.")
else:
    print("section with class 'css-653rx1-StatsContainer eozqs6r5' not found.")
div1 = div1.find('div', class_='css-fvfi51-LeagueSeasonStatsTableCSS e15r3kn20') if div1 else None
if div1:
    print("Found div with class 'css-fvfi51-LeagueSeasonStatsTableCSS e15r3kn20'.")
else:
    print("div with class 'css-fvfi51-LeagueSeasonStatsTableCSS e15r3kn20' not found.")
RESULT:
Outer <main> found.
Found inner <main>.
Found div1 with class 'css-xxmbx0-LeagueSeasonStatsColumn eozqs6r0'.
Found div1 with class 'css-1wb2t24-CardCSS e1mlfzv61'.
Found div1 with class 'css-1yndnk3-LeagueSeasonStatsContainerCSS eozqs6r1'.
section with class 'css-653rx1-StatsContainer eozqs6r5' not found.
div with class 'css-fvfi51-LeagueSeasonStatsTableCSS e15r3kn20' not found.
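The same traversal can also be written as a loop that reports the first step where the chain breaks. This is only a sketch: first_missing_step is a hypothetical helper name, and SAMPLE_HTML is a hand-written stand-in (an assumption) in which the container div exists but has no section inside it, mirroring what my output suggests.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the HTML that requests receives (an assumption):
# the container div is present but contains no <section>.
SAMPLE_HTML = """
<main>
  <div class="css-1wb2t24-CardCSS e1mlfzv61">
    <div class="css-1yndnk3-LeagueSeasonStatsContainerCSS eozqs6r1"></div>
  </div>
</main>
"""

# (tag, class string) pairs, outermost first.
STEPS = [
    ("div", "css-1wb2t24-CardCSS e1mlfzv61"),
    ("div", "css-1yndnk3-LeagueSeasonStatsContainerCSS eozqs6r1"),
    ("section", "css-653rx1-StatsContainer eozqs6r5"),
]


def first_missing_step(root, steps):
    """Walk down the chain; return the first (tag, class) not found, or None."""
    node = root
    for tag, cls in steps:
        node = node.find(tag, class_=cls)
        if node is None:
            return (tag, cls)
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(first_missing_step(soup, STEPS))
```

On this sample the function reports the section step as the first one missing, which is the same place my real run stops.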
I tried removing the script and style tags because I couldn’t see them in the HTML inside the Developer Tools.
Basically, I am selecting every div based on its class, but for some reason when I try to select the section it doesn’t work.
I also tried traversing every element after the parent div that contains the section, but it just skips over the section and goes on to the next HTML element, as if the section isn’t there.
I’m not even sure what to do at this point. When I ran print(soup.prettify()), the section tag didn’t show up. It’s very confusing because I can clearly see the section tag in the Developer Tools. ANY HELP WITH HOW TO SELECT THE SECTION TAG WOULD BE APPRECIATED!
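The check I am describing can be sketched with only the standard library. The idea is to count the tags in the raw HTML text before BeautifulSoup touches it. TagCounter and count_tags are hypothetical helper names, and raw_html is a made-up stand-in for response.text. It assumes, without my having confirmed it, that the server sends a mostly empty shell that a script fills in later in the browser:

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags in an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html_text):
    """Return a Counter mapping tag name to number of occurrences."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


# Made-up stand-in for response.text: a shell page with no <section> in it.
raw_html = '<main><div id="root"></div><script src="app.js"></script></main>'

counts = count_tags(raw_html)
print(counts["section"])  # 0
print(counts["script"])   # 1
```

If the real response.text also gives a count of 0 for section, then the tag I see in the Developer Tools must be added after the page loads, and no find() call on the downloaded HTML could return it.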
Also, for what it’s worth, I tried Selenium, but that was making me go crazy. It seems like there isn’t a ChromeDriver for my version of Chrome (Version 127.0.6533.100), whereas the newest ChromeDriver is Version 127.0.6533.99. That’s my best explanation at least.