I’m trying to scrape Instagram photos with Selenium. The script correctly gets the first image of every type of post (single image, video, carousel), but when I try to get the src of any subsequent image in a carousel post, it always returns the first image’s src. There are no errors, just not the desired output. The problem is in the new_image_element lookup; I’ve included the surrounding code for context. I’ve only used Selenium so far. Is this where BeautifulSoup would be handy, or is there a different solution? Any help or insights would be greatly appreciated!
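from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
# driver and post_element (the post currently being processed) are set up earlier in the script.
image_element = post_element.find_element(By.XPATH, ".//div[@class='_aagv']/img")
image_src = image_element.get_attribute('src')
# Advance the carousel by clicking its "Next" button via JavaScript.
next_picture = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[aria-label='Next']")))
driver.execute_script("arguments[0].click();", next_picture)
# Look the image up again after advancing; this still returns the first image's src.
new_image_element = post_element.find_element(By.XPATH, ".//div[@class='_aagv']/img")
new_image_src = new_image_element.get_attribute('src')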
I’ve tried WebDriverWait with a presence-of-element condition, using both driver and post_element as the search context, hoping it would wait until the new image was present before reading the src, or at least give me the second image’s src by the time I reached the 4th slide or so. It still retrieved the first image’s src. I also tried driver.find_element(By.XPATH, ".//div[@class='_aagv']/img"), but that returns the src of the first post on the page, not the post I’m on.
On top of all the above, I tried a while loop that keeps looking up the image src for as long as it still matches the original one, and a for loop over range(10) that re-runs the new_image_element and new_image_src lookups and breaks once the src changes. My thinking was that the script just didn’t have enough time to find the new src, but neither loop ever picked up a different image src.
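For reference, the retry logic described above was along these lines. This is a minimal sketch rather than the exact code: `wait_for_src_change`, `get_src`, `attempts`, and `delay` are illustrative names, and the Selenium lookup is passed in as a callable so the loop can be read on its own.

```python
import time


def wait_for_src_change(get_src, original_src, attempts=10, delay=0.5):
    """Poll get_src() until it returns a src different from original_src.

    Returns the new src, or None if it never changed within `attempts` tries.
    """
    for _ in range(attempts):
        current_src = get_src()
        if current_src and current_src != original_src:
            return current_src
        time.sleep(delay)  # give the carousel time to update before retrying
    return None


# Hypothetical usage with the variables from the snippet above:
# new_image_src = wait_for_src_change(
#     lambda: post_element.find_element(
#         By.XPATH, ".//div[@class='_aagv']/img").get_attribute("src"),
#     image_src,
# )
```

With the lookup from my snippet, this comes back as None every time, because each retry still reads the first image’s src.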