I am trying to extract the text content from a ::before pseudo-element in an HTML document. I understand that traditional HTML parsers like BeautifulSoup do not apply CSS and thus cannot access pseudo-elements directly. Therefore, I am using Selenium to achieve this.
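To illustrate why a plain parser is not enough, here is a minimal sketch using only the standard library's `html.parser`. The markup and the `content` value are hypothetical, loosely modelled on the page in question; the point is only that the span holds no text node, because the visible text is defined in the stylesheet.

```python
from html.parser import HTMLParser

# Hypothetical markup (an assumption, not copied from the real page).
HTML = """
<style>.hs_kw11_configMs::before { content: "example"; }</style>
<div>Engine: <span class="hs_kw11_configMs"></span></div>
"""


class SpanText(HTMLParser):
    """Collects the text nodes found inside <span> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <span> elements we are currently inside
        self.chunks = []    # text collected while inside a <span>

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


parser = SpanText()
parser.feed(HTML)
span_text = "".join(parser.chunks)
print(repr(span_text))  # the span has no text of its own
```

The parser sees the CSS rule only as raw text inside `<style>`; it never associates it with the span, which is why a browser (here driven by Selenium) is needed to resolve the computed style.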
I want to get the text content of the ::before pseudo-element associated with the span element that has the class hs_kw11_configMs.
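from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
# driver.get() needs a full URL, including the scheme
driver.get('https://example.com')
# The class is different every time. The general pattern is "hs_kw\d\d_config\w\w"
element = driver.find_element(By.CSS_SELECTOR, '.hs_kw11_configMs')
# Read the computed `content` property of the element's ::before pseudo-element
before_content = driver.execute_script(
    "return window.getComputedStyle(arguments[0], '::before').getPropertyValue('content');",
    element
)
# The computed value is wrapped in double quotes, so strip them before printing
print(before_content.strip('"'))
driver.quit()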
# Expected output: "content"
# Actual output:
[Screenshot: actual output]
While this script runs without errors, I am not getting the expected text content from the ::before pseudo-element. Can anyone help me identify what I might be doing wrong or suggest a better approach to achieve this?
reference url: https://car.autohome.com.cn/config/spec/67034.html#pvareaid=3454541
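Since the class name changes on every load, one direction that might be worth trying is to match the class by pattern instead of hard-coding it, and to normalise the computed `content` value before printing it. The sketch below is only that, a sketch: the helper names (`matching_classes`, `clean_content`, `read_before_texts`) and the `span[class^='hs_kw']` selector are assumptions made for illustration, and the regex is taken from the comment in the script above. Selenium is imported inside the function so the two pure helpers can be tried without a browser.

```python
import re
from typing import List, Optional

# Assumed from the pattern described above: "hs_kw", two digits, "_config", two word characters.
CLASS_PATTERN = re.compile(r"hs_kw\d\d_config\w\w")


def matching_classes(class_attr: str) -> List[str]:
    """Return the class tokens that fully match the assumed pattern."""
    return [c for c in class_attr.split() if CLASS_PATTERN.fullmatch(c)]


def clean_content(raw: Optional[str]) -> Optional[str]:
    """Turn a computed `content` value into plain text, or None if there is none."""
    if raw is None:
        return None
    raw = raw.strip()
    # "none" and "normal" mean the pseudo-element generates no text.
    if raw in ("", "none", "normal"):
        return None
    # String values come back wrapped in matching quotes; remove one pair.
    if len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in "\"'":
        return raw[1:-1]
    return raw


def read_before_texts(driver, selector: str = "span[class^='hs_kw']") -> List[Optional[str]]:
    """Return the cleaned ::before text for each matching element (hypothetical helper)."""
    # Imported lazily so the helpers above do not require Selenium.
    from selenium.webdriver.common.by import By

    script = (
        "return window.getComputedStyle(arguments[0], '::before')"
        ".getPropertyValue('content');"
    )
    texts = []
    for el in driver.find_elements(By.CSS_SELECTOR, selector):
        if matching_classes(el.get_attribute("class") or ""):
            texts.append(clean_content(driver.execute_script(script, el)))
    return texts
```

With an open driver this would be called as `read_before_texts(driver)` after `driver.get(...)`. A result of `None` for an element would suggest the computed `content` was `none` or `normal` at the moment of the call, for example because the styles had not been applied yet.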