I'm trying to scrape data from a sandbox website to practice web scraping with Python.
I have managed to extract a lot of data using the basics; however, I have found an element that is loaded in dynamically after the initial page load.
My question is: how do I extract this data after the element has loaded?
Here is the code I'm currently using:
import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver

options = webdriver.FirefoxOptions()
driver = webdriver.Firefox(options=options)
driver.get('https://sandbox.oxylabs.io/products')

results = []
other_results = []
status = []

# page_source is read immediately, so it only contains what has rendered so far.
content = driver.page_source
soup = BeautifulSoup(content, 'html.parser')

for element in soup.find_all(attrs={'class': 'product-card'}):
    name = element.find('h4')
    if name not in results:
        results.append(name.text)

for b in soup.find_all(attrs={'class': 'product-card'}):
    # Note the use of 'attrs' to again select an element with the specified class.
    name2 = b.find(attrs={'class': 'price-wrapper'})
    # stock = b.find(attrs={'class': 'price-wrapper'}).find_next_sibling('p')
    # status.append(stock.text)
    other_results.append(name2.text)
    print(b)

# 'Stock': status is left out while the stock lines above are commented out:
# status stays empty, and pandas requires all columns to have the same length.
df = pd.DataFrame({'Names': results, 'Prices': other_results})
df.to_csv('products.csv', index=False, encoding='utf-8')
The element I'm trying to get is the stock status.
An example of what it looks like once loaded is:
Out of Stock
However, it does not appear in the print(b) output for any of the elements.
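Since the stock element is added after the initial load, one thing that might help is waiting for it to exist before reading driver.page_source. Below is a minimal sketch of that idea, not a confirmed fix. It is a hand-rolled polling loop using only the standard library, standing in for Selenium's own WebDriverWait with expected_conditions, which serves the same purpose. The helper name wait_for_elements and its timeout and poll parameters are hypothetical; the only thing it assumes about driver is Selenium's find_elements(by, value) method.

```python
import time


def wait_for_elements(driver, css_selector, timeout=10.0, poll=0.5):
    """Poll until at least one element matches css_selector, then return the matches.

    `driver` is any object exposing Selenium's find_elements(by, value) method.
    Raises TimeoutError if nothing matches within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        # 'css selector' is the string value behind Selenium's By.CSS_SELECTOR.
        found = driver.find_elements('css selector', css_selector)
        if found:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f'no element matched {css_selector!r} within {timeout}s'
            )
        time.sleep(poll)
```

In the script above, this would be called as wait_for_elements(driver, '.in-stock, .out-of-stock') right after driver.get(...) and before content = driver.page_source, so that the HTML handed to BeautifulSoup already includes the stock elements. It only waits for the first match, so it assumes all cards render their stock status at roughly the same time.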
I tried using a CSS selector and .find(), but I ran into two issues. First, the class is not fixed: out-of-stock items have the class out-of-stock and in-stock items have the class in-stock.
Second, I tried using an if statement to set a variable to "In Stock" if the element has the in-stock class and "Out of Stock" otherwise; however, because the element does not exist in the parsed HTML, it always returns "Out of Stock".
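For reference, the if statement could be sketched as below so that a missing element is not silently treated as out of stock. This is an illustration under assumptions rather than my exact code: the function name stock_label is hypothetical, and it takes the value that BeautifulSoup returns from tag.get('class'), which is a list of class names or None when the attribute is absent.

```python
def stock_label(classes):
    """Map an element's class list to a stock label.

    `classes` is a list of class names, or None if the element
    has no class attribute (or was not found at all).
    """
    classes = classes or []
    if 'in-stock' in classes:
        return 'In Stock'
    if 'out-of-stock' in classes:
        return 'Out of Stock'
    # Neither class is present: the element has probably not rendered yet,
    # so report that instead of defaulting to 'Out of Stock'.
    return None
```

Returning None for the "not found" case keeps it distinguishable from a real out-of-stock item, which makes it obvious when the page was parsed before the stock element loaded.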