I want to scrape the data from the only line chart on https://www.bltindex.com/.
The end goal is a pandas DataFrame containing the single time series shown in the chart.
After watching this video, I tried to apply the same method: I looked in the Network tab of the browser's developer tools for a csv or json file while the page was loading, but could not find one. The only thing I found was a css file with the word “chart” in its name, at https://docs.google.com/static/spreadsheets2/client/css/838001818-v3-ritz_chart_css_ltr.css, and I saw that it had a request link as well (it is the url in the code below).
I tried the following code:
import requests
from bs4 import BeautifulSoup
# Published Google Sheets chart that the page embeds
url = 'https://docs.google.com/spreadsheets/d/e/2PACX-1vQG9TYlv8_LpCvO7EI3Y3s8MoxQEfOHTd3-EqccN5PoeHcdxraxZC0y8UWFx_2NnogVIIuk1i-phvFe/pubchart?oid=813038046&format=interactive'
html = requests.get(url)
# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(html.content, 'html.parser')
print(soup.prettify())
The code returned the page's HTML as a string, and inside the <script nonce="yyTSUqBQUPTxI-ZkIM7OKw"> tag I did see the values that I want to get. However, I do not know how to extract them from this string without doing it manually. Is there a more convenient way to get the data?
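To make clearer what I mean by getting the values out programmatically, here is a minimal sketch of the kind of thing I have in mind. It assumes the script text contains the data as a plain JSON object literal, which I have not confirmed for this page; if the payload is stored as an escaped string inside the JavaScript, it would need unescaping first. The function name `extract_json_objects`, the `sample` string, and the "rows" key are all made up for illustration.

```python
import json


def extract_json_objects(script_text):
    """Return every JSON object that can be decoded from script_text.

    Hypothetical helper: tries to decode a JSON object at each '{' and
    skips braces that belong to ordinary JavaScript code.
    """
    decoder = json.JSONDecoder()
    found = []
    pos = script_text.find('{')
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(script_text, pos)
        except json.JSONDecodeError:
            # Not valid JSON at this brace; move on to the next one.
            pos = script_text.find('{', pos + 1)
            continue
        found.append(obj)
        pos = script_text.find('{', end)
    return found


# Invented stand-in for the contents of the <script> tag, not the real page.
sample = 'drawChart({"rows": [["2020-01-01", 1.5], ["2020-01-02", 1.7]]});'
objects = extract_json_objects(sample)
rows = objects[0]["rows"]
```

With the real page, the text passed in would come from the script tags found by BeautifulSoup (for example via `soup.find_all('script')`), and a list like `rows` could then be handed to `pandas.DataFrame`. I do not know whether the actual script content is laid out this way.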