I am trying to get data from a specific table (https://data.cms.gov/provider-data/dataset/yizn-abxn) on the the Centers for Medicare & Medicaid Services website using their API.
Because the API only provides 500 rows per request, I am using a while loop.
However, my concat isn’t working for some reason. I tried googling for a solution, but the results didn’t really help.
Below is my code.
import pandas as pd
import requests
url = "https://data.cms.gov/provider-data/api/1/datastore/query/yizn-abxn/0"
def loop_json():
i = 0
total_rows = 1500
want = []
json_norm = []
while i < total_rows:
size = 500
offset_url = f"{url}?size={size}&offset={i}"
print(offset_url)
offset = i
offset_response = requests.request("GET", offset_url)
print(f"Made request for {size} results at offset {i}")
# if you want to save the data you would do that here
json_resp = offset_response.json()
json_norm = pd.json_normalize(json_resp['results'])
# want = pd.concat(json_norm)
i += size
return json_norm
df = loop_json()
The function currently works if want = pd.concat(json_norm)
is commented out, but it only gives me the last 242 rows.
Obviously, I would like the function to give me all 1,242 rows.
Any help would be greatly appreciated, especially since I am new to Python, JSON, and APIs.