I am trying to scrape a webpage using Python, but I am encountering an issue where the content I retrieve does not match the content I see in the browser’s DevTools. Instead of getting the dynamically loaded data, I only see calls to functions and incomplete content.
What I Tried:
1. Waiting for Elements with Selenium:
table = WebDriverWait(self.driver, 20).until(EC.presence_of_element_located((By.ID, 'table')))
I used Selenium to wait for the elements to be loaded, but it still didn’t capture the complete dynamically loaded data as seen in DevTools.
2. Selenium with JavaScript Execution:
page_source = self.driver.execute_script("return document.documentElement.outerHTML;")
This also failed to capture the content that appears after the page’s JavaScript has run.
What I Need:
I need a way to programmatically capture the fully loaded HTML of a web page as it appears in the browser’s DevTools, after all JavaScript has executed and all data has been dynamically loaded.
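To make the goal concrete, here is a minimal sketch of the behaviour I am after: poll the page until the serialized DOM stops changing, then return it. The helper name `wait_for_stable_html` and its `timeout`, `poll`, and `settle` parameters are my own placeholders, not part of Selenium. The only driver method it relies on is `execute_script`. Treating "unchanged for a few polls" as "fully loaded" is a heuristic assumption and may not hold for pages that update continuously.

```python
import time


def wait_for_stable_html(driver, timeout=20.0, poll=0.5, settle=3):
    """Return document.documentElement.outerHTML once it stops changing.

    `driver` is any object exposing Selenium's `execute_script(script)`.
    The DOM is treated as stable (a heuristic, not a guarantee) when
    document.readyState is "complete" and the HTML matches the previous
    read for `settle` consecutive polls.
    """
    deadline = time.monotonic() + timeout
    last_html = None
    unchanged = 0
    while time.monotonic() < deadline:
        ready = driver.execute_script("return document.readyState;")
        html = driver.execute_script(
            "return document.documentElement.outerHTML;"
        )
        if ready == "complete" and html == last_html:
            unchanged += 1
            if unchanged >= settle:
                return html
        else:
            # The page is still loading or the DOM changed: start over.
            unchanged = 0
        last_html = html
        time.sleep(poll)
    raise TimeoutError(f"DOM did not settle within {timeout} seconds")
```

In my scraper this would be called as `page_source = wait_for_stable_html(self.driver)`. Even if it works, it would not include content inside iframes or shadow roots, because `outerHTML` does not serialize those.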