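import requests
from bs4 import BeautifulSoup

# urls is a list of page URLs defined earlier
for url_index in range(len(urls)):
    req = requests.get(urls[url_index])
    soup = BeautifulSoup(req.text, 'html.parser')
    # My attempt to get the <p> tags inside a <div> that has a class (the part I am unsure about)
    articles = soup.find_all('div', attrs={'class', 'p'})
    for i in range(len(articles)):
        # Print the index and the text of each article
        print(f"Article {i + 1 + url_index * len(articles)}: {articles[i].text}")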
In the first for loop, the variable articles is meant to hold the text of the 'p' tags inside a 'div' tag that has a class attribute. I don't really understand what the find_all() method does, so I went through several websites and ChatGPT to learn how to pass multiple tags and get the elements inside them. There were many answers, and I didn't know how to apply them.
What I want to know is how to extract the article text, which sits in 'p' tags inside the div that contains the article.
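To make the goal concrete, here is a minimal sketch of two common ways to get the 'p' text out of a div with BeautifulSoup, assuming the article div can be identified by a known class. The class name article-body, the sample HTML, and the two helper function names are made up for illustration; the real class would have to be read from the page's markup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for one downloaded page
SAMPLE_HTML = """
<html><body>
  <div class="article-body">
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </div>
  <div>
    <p>Paragraph in a div without a class.</p>
  </div>
</body></html>
"""


def extract_paragraphs(html, div_class):
    """Return the text of every <p> inside a <div> with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    # CSS selector: any <p> that is a descendant of <div class="...">
    paragraphs = soup.select(f"div.{div_class} p")
    return [p.get_text(strip=True) for p in paragraphs]


def extract_paragraphs_find_all(html, div_class):
    """Same idea using find_all() in two steps."""
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    # class_ filters on the class attribute; find_all returns a list of tags
    for div in soup.find_all("div", class_=div_class):
        for p in div.find_all("p"):
            texts.append(p.get_text(strip=True))
    return texts


print(extract_paragraphs(SAMPLE_HTML, "article-body"))
# → ['First paragraph.', 'Second paragraph.']
```

In the loop over urls, the same call could take req.text in place of SAMPLE_HTML. The select() version is shorter; the find_all() version shows that find_all() only filters by tag name and attributes, so nested tags need a second call on each match.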
Hedra Lotfy is a new contributor to this site.