I want to extract a link that is nested at the XPath /html/body/div[1]/div[2]/div[1]/div/div/div/div/div/a
(see also the image showing the detailed nesting).
In case it helps, these div elements also have classes.
I tried:
from selenium import webdriver
from bs4 import BeautifulSoup
browser=webdriver.Chrome()
browser.get('https://www.visionias.in/resources/daily_current_affairs_programs.php?type=1&m=05&y=2024')
soup=BeautifulSoup(browser.page_source)
element = soup.find_element_by_xpath("./html/body/div[1]/div[2]/div[1]/div/div/div/div/div/a")
href = element.get_attribute('href')
print(href)
This code gave the following error:
line 9, in <module>
element = soup.find_element_by_xpath("./html/body/div[1]/div[2]/div[1]/div/div/div/div/div/a")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not callable
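For reference, here is a minimal sketch of why this error appears and of one possible alternative. BeautifulSoup has no find_element_by_xpath method (that name comes from Selenium); an unknown attribute on a soup object is treated as a tag lookup, which returns None here. The HTML string, the "card" class, and the CSS selector below are hypothetical stand-ins for the real page, whose markup may differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

PAGE_URL = "https://www.visionias.in/resources/daily_current_affairs_programs.php?type=1&m=05&y=2024"

# Hypothetical stand-in for browser.page_source; the real markup may differ.
html = """
<html><body>
  <div>
    <div class="card">
      <p>18 May 2024</p>
      <div><a href="material/?id=3731&amp;type=daily_current_affairs">Open</a></div>
    </div>
  </div>
</body></html>
"""

# Naming the parser explicitly avoids the "no parser specified" warning.
soup = BeautifulSoup(html, "html.parser")

# Unknown attributes are looked up as tag names, i.e. soup.find("find_element_by_xpath").
# No such tag exists, so the result is None, and calling None raises the TypeError.
print(soup.find_element_by_xpath)

# A CSS selector can express the same kind of nested path (assumed structure).
link = soup.select_one("body > div > div.card > div > a")
href = link["href"]  # attributes are read with [] in BeautifulSoup, not get_attribute()

# Resolve the relative href against the page URL to get the full link.
absolute = urljoin(PAGE_URL, href)
print(href)
print(absolute)
```

If the XPath itself has to be kept, Selenium's own lookup on the browser object (rather than on the soup) is the place where XPath is supported.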
I also tried another method:
from selenium import webdriver
from bs4 import BeautifulSoup
browser=webdriver.Chrome()
browser.get('https://www.visionias.in/resources/daily_current_affairs_programs.php?type=1&m=05&y=2024')
soup=BeautifulSoup(browser.page_source)
href = soup('a')('div')[1]('div')[2]('div')[1]('div')[0]('div')[0]('div')[0]('div')[0]('div')[0][href]
#href = element.get_attribute('href')
print(href)
This gave the following error:
href = soup('a')('div')[1]('div')[2]('div')[1]('div')[0]('div')[0]('div')[0]('div')[0]('div')[0][href]
^^^^^^^^^^^^^^^^
TypeError: 'ResultSet' object is not callable
The expected outcome is: https://www.visionias.in/resources/material/?id=3731&type=daily_current_affairs or material/?id=3731&type=daily_current_affairs
Some other links on the page have the same kind of nesting as above. Is there any way to filter the links using the text inside /html/body/div[1]/div[2]/div[1]/div/div/p?
For example, the text here is "18 may 2024". This p tag also has an id, but the id is not consistent and does not follow a pattern, so it is not really usable for me.
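One way this kind of filtering could be sketched: find the p tag by its text, then collect the links under the same parent div. The sample HTML, the "item" class, the ids, the second link (id=3730), and the helper name links_for_date are all hypothetical; the sketch assumes the p tag and the link share a common parent container, as the two XPaths above suggest.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

PAGE_URL = "https://www.visionias.in/resources/daily_current_affairs_programs.php?type=1&m=05&y=2024"

# Hypothetical markup: each entry has a date in <p> and a link in nested divs.
html = """
<div class="list">
  <div class="item">
    <p id="x91">17 May 2024</p>
    <div><a href="material/?id=3730&amp;type=daily_current_affairs">View</a></div>
  </div>
  <div class="item">
    <p id="q07">18 May 2024</p>
    <div><a href="material/?id=3731&amp;type=daily_current_affairs">View</a></div>
  </div>
</div>
"""


def links_for_date(soup, date_text, base_url):
    """Return absolute URLs of links that share a parent with a matching <p>."""
    wanted = date_text.casefold()
    links = []
    for p in soup.find_all("p"):
        # Compare the visible text case-insensitively and ignore the unstable id.
        if p.get_text(strip=True).casefold() != wanted:
            continue
        container = p.parent  # assumed: the link lives under the same parent div
        for a in container.find_all("a", href=True):
            links.append(urljoin(base_url, a["href"]))
    return links


soup = BeautifulSoup(html, "html.parser")
result = links_for_date(soup, "18 may 2024", PAGE_URL)
print(result)
```

Matching on text rather than on position or id keeps the lookup independent of how deeply the link is nested, which may make it easier to adapt to other sites with a similar layout.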
I have seen other answers on Stack Overflow, but they are not working for me.
If possible, please also elaborate on the answer, as I have to apply the same code to some other sites as well.