Web scraping large amounts of data using Selenium – Improve robustness
I’m looking for advice on how to make this scraping script more robust. The script does run, but one of the main issues is that it often fails with an exception partway through (hence all the error handling, retries and saving).
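One common way to structure the retries and saving mentioned above is to retry each item with a growing delay and write a checkpoint after every success. This is a minimal sketch, not the asker's script: `retry`, `scrape_all`, `scrape_one` and the checkpoint path are hypothetical names, and items are assumed to be strings so they can be used as JSON keys.

```python
import json
import time
from pathlib import Path


def retry(func, attempts=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


def scrape_all(items, scrape_one, checkpoint_path, **retry_kwargs):
    """Scrape each item, saving after every success so a crash loses little work."""
    path = Path(checkpoint_path)
    # Resume from an earlier run if a checkpoint file already exists.
    results = json.loads(path.read_text()) if path.exists() else {}
    for item in items:
        if item in results:
            continue  # already scraped in an earlier run
        results[item] = retry(lambda: scrape_one(item), **retry_kwargs)
        path.write_text(json.dumps(results))
    return results
```

With Selenium, `exceptions` could be narrowed to the errors that are worth retrying (for example `TimeoutException`), so that genuine bugs still fail fast.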
How do I access an element that is in a shadow root inside another shadow root with Selenium (Python)?
I have the following code and HTML structure (I’m not an expert on this).
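For nested shadow roots, one approach is to walk the chain of shadow hosts one at a time. The sketch below assumes Selenium 4 or later, where a `WebElement` exposes a `shadow_root` property, and it only works for open shadow roots. The helper name and the selectors in the usage line are hypothetical, since the asker's HTML is not shown here.

```python
CSS = "css selector"  # same string as selenium.webdriver.common.by.By.CSS_SELECTOR


def find_in_nested_shadow(driver, host_selectors, target_selector):
    """Walk a chain of shadow hosts, then find the target in the innermost shadow root."""
    context = driver
    for selector in host_selectors:
        host = context.find_element(CSS, selector)
        # Selenium 4+: the returned ShadowRoot also supports find_element.
        context = host.shadow_root
    return context.find_element(CSS, target_selector)
```

Usage would look something like `find_in_nested_shadow(driver, ["outer-host", "inner-host"], "span.target")`. CSS selectors are used throughout because XPath lookups inside a shadow root are generally not supported.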
Trying to extract text rendered by JavaScript using Python gives empty output
I used to use some code to extract the affiliation text from this page: https://www.sciencedirect.com/science/article/abs/pii/S0011916424004600
You can find the affiliation text after you click “Show more” at the top of the page.
Selenium (Python) cannot locate element using CLASS_NAME
When the code below is run, it raises this exception: selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".match-on no-top-border "}
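The selector in that message, `.match-on no-top-border `, suggests the whole class attribute (two class names and a trailing space) was passed as a single class name. A likely fix is to join the class names into a compound CSS selector. The helper below is a hypothetical sketch of that conversion; it does not escape special characters in class names.

```python
def class_attr_to_css(class_attr):
    """Turn a class attribute value like 'a b ' into the CSS selector '.a.b'."""
    names = class_attr.split()  # split on whitespace, dropping stray spaces
    if not names:
        raise ValueError("empty class attribute")
    return "." + ".".join(names)


selector = class_attr_to_css("match-on no-top-border ")
print(selector)  # .match-on.no-top-border
```

The result can then be passed to `driver.find_element(By.CSS_SELECTOR, selector)` instead of using `By.CLASS_NAME`.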
Selenium doesn’t find class
I’m trying to obtain a value from Google Shopping and have attempted to use CSS, className, and XPath.
However, nothing seems to work and it always returns an empty value. As you can see from the print, I have the exact class and CSS name, and still, I can’t get it to work. Has anyone who uses Selenium encountered something like this?
Selenium unable to click on element
I am working on a web scraping project using Selenium in Python and am encountering an issue while interacting with the Lufthansa homepage. My goal is to click a specific element (the Departure field) after accepting cookies. While accepting the cookies works fine, attempting to click the Departure field results in a TimeoutException.
How to obtain data from an IFRAME with Python and Selenium
I am trying to obtain a value from this page: https://www.bbva.com.co/personas/productos/inversion/fondos/pais.html
How to get usernames for top posts for a certain keyword using Selenium in Python
I’m trying to get the top posts for a keyword on Instagram, capture their usernames, and save them into a CSV. Since the username is only displayed when the post is clicked, I am unable to find the correct XPath that would make Selenium click on that post. Any insights would be helpful!
How do I scrape a website whose robots.txt disallows it?
I want to scrape data from a website covering the last 10 years. The data is a PDF that changes every day, and I want to download it. When I open the website normally in a browser the PDF downloads fine, but when I try to do the same using Selenium in Python it gives me an error. The script itself runs without errors, but the PDFs don’t download. The robots.txt for this website disallows scraping for a certain area of the website (e.g. Market data). The URL I open using the driver doesn’t have market data in it, but that tab is already selected when I open the URL.
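To check whether a given URL actually falls under a robots.txt rule, the standard library's `urllib.robotparser` can evaluate it. The sketch below is only an illustration: the rules and the `example.com` URLs are made up, because the real site's robots.txt is not shown in the question.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, standing in for the real site's robots.txt.
rules = "User-agent: *\nDisallow: /market-data/\n"
print(is_allowed(rules, "https://example.com/market-data/report.pdf"))  # False
print(is_allowed(rules, "https://example.com/about"))  # True
```

Note that robots.txt rules match on the URL path, not on which tab happens to be selected in the page, and that a failed download in Selenium may have an unrelated cause such as the browser's download settings.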
Why won’t my scraper scrape the desired elements?
I am trying to scrape the sku and description on this site: https://www.dewalt.com/products/power-tools/