I’m trying to scrape Bitcoin market data from CoinGecko using Selenium in headless mode, but the script returns an empty DataFrame. The table rows are not detected even though I added a wait before querying the page. Below is a simplified version of the code, which sets up the WebDriver, navigates to the page, and extracts the table data using XPath. The relevant parts of the log indicate that the requests are made correctly, but no elements are found. What could be causing this, and how can I make sure the table data is scraped correctly in headless mode?
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import pandas as pd
import time
# Path to your ChromeDriver
chrome_driver_path = r'C:\Users\hamid\OneDrive\Desktop\chromedriver-win64\chromedriver.exe'
# Set up headless mode
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1080")
# Set up the WebDriver
driver = webdriver.Chrome(executable_path=chrome_driver_path, options=options)
# Navigate to the CoinGecko Bitcoin page
driver.get('https://www.coingecko.com/en/coins/bitcoin')
# Wait for the page to load
time.sleep(5)
# Extract data from the page
rows = driver.find_elements(By.XPATH, '//table[@class="table"]/tbody/tr')
market_data = []
for row in rows:
    exchange = row.find_element(By.XPATH, './/td[2]/a').text
    pair = row.find_element(By.XPATH, './/td[3]/a/b').text
    price = row.find_element(By.XPATH, './/td[4]/span').text
    volume_24h = row.find_element(By.XPATH, './/td[5]/span').text
    volume_percentage = row.find_element(By.XPATH, './/td[6]').text
    category = row.find_element(By.XPATH, './/td[7]').text
    updated = row.find_element(By.XPATH, './/td[8]').text
    market_data.append({
        'exchange': exchange,
        'pair': pair,
        'price': price,
        'volume_24h': volume_24h,
        'volume_percentage': volume_percentage,
        'category': category,
        'updated': updated
    })
# Close the WebDriver
driver.quit()
# Convert to DataFrame
df = pd.DataFrame(market_data)
print(df)
When I run the script, I get the following output:
Empty DataFrame
Columns: []
Index: []
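An empty DataFrame with no columns means that `rows` was an empty list, so the loop body never ran. One way to narrow this down is to inspect the HTML that headless Chrome actually received. The sketch below is only a diagnostic, not a fix, and it assumes nothing about what CoinGecko serves: it takes an HTML string such as `driver.page_source` and reports the `class` attribute and row count of every `<table>` it finds. The names `TableSummary` and `summarize_tables` are hypothetical helpers introduced here for illustration; only the Python standard library is used.

```python
from html.parser import HTMLParser


class TableSummary(HTMLParser):
    """Collect the class attribute and <tr> count of every <table> in a page."""

    def __init__(self):
        super().__init__()
        self.tables = []  # one dict per <table>, in document order
        self._open = []   # stack of indexes into self.tables (handles nesting)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # attrs is a list of (name, value) pairs; class is None if absent
            self.tables.append({"class": dict(attrs).get("class"), "rows": 0})
            self._open.append(len(self.tables) - 1)
        elif tag == "tr" and self._open:
            # Count the row against the innermost table that is still open
            self.tables[self._open[-1]]["rows"] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self._open.pop()


def summarize_tables(html):
    """Return a list like [{'class': 'table', 'rows': 10}, ...] for an HTML string."""
    parser = TableSummary()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Calling `print(summarize_tables(driver.page_source))` just before `driver.quit()` distinguishes two cases. If the list is empty, the page that headless Chrome loaded contains no table at all at that moment, which would point at the page content (for example, content that has not rendered yet or a different page being served) rather than at the XPath. If tables are listed but none has a `class` of exactly `table`, the XPath is the likely culprit, because `@class="table"` only matches when the whole attribute value is `table` and not when it is one of several classes.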