I have a dict like this:
{'id': '123', 'sales_attributes': [], 'seller_sku': '123'}
I want to store it in a DataFrame. When I use pd.DataFrame(), it generates an empty DataFrame.
When I use pd.json_normalize(), I get the desired result.
Why? Can anyone help me?
You have to wrap everything in lists:
import pandas as pd
dic = {'id': ['123'], 'sales_attributes': [[]], 'seller_sku': ['123']}
print(pd.DataFrame(dic))
    id sales_attributes seller_sku
0  123               []        123
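If you would rather not rewrite the dict by hand, the wrapping can be done programmatically. This is a minimal sketch, assuming the dict always describes exactly one row; the name `wrapped` is only illustrative.

```python
import pandas as pd

dic = {'id': '123', 'sales_attributes': [], 'seller_sku': '123'}

# Wrap every value in a one-element list so each column has length 1.
# Assumption: the dict represents a single row.
wrapped = {key: [value] for key, value in dic.items()}

df = pd.DataFrame(wrapped)
print(df)
```

The empty list ends up as the single cell of the sales_attributes column instead of being read as "zero rows".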
Why?
When you pass a dictionary to the DataFrame constructor, it expects lists as values, with all lists having the same length:
dic = {'id': ['123', '456'], 'seller_sku': ['123', '456']}
df = pd.DataFrame(dic)
print(df)
    id seller_sku
0  123        123
1  456        456
As a convenience, you can also pass scalar values, and they will be repeated for every row:
dic = {'id': ['123', '456'], 'seller_sku': ['123', '456'], 'other': 'X'}
df = pd.DataFrame(dic)
print(df)
    id seller_sku other
0  123        123     X
1  456        456     X
Other combinations fail, such as lists of unequal length or only scalar values:
dic = {'id': ['123', '456'], 'seller_sku': ['123']}
print(pd.DataFrame(dic))
# ValueError: All arrays must be of the same length
dic = {'id': '123', 'seller_sku': '123'}
print(pd.DataFrame(dic))
# ValueError: If using all scalar values, you must pass an index
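As the second error message suggests, the all-scalar case can be made to work by passing an index explicitly. A small sketch, assuming a single row labelled 0 is what you want:

```python
import pandas as pd

dic = {'id': '123', 'seller_sku': '123'}

# With only scalar values, pandas cannot infer the number of rows,
# so the index is given explicitly; one label means one row.
df = pd.DataFrame(dic, index=[0])
print(df)
```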
In your case, one value is a list, and its length sets the number of rows. That list is empty, so there are 0 rows, and the two scalar values are repeated 0 times. The output is therefore an empty DataFrame:
dic = {'id': '123', 'sales_attributes': [], 'seller_sku': '123'}
print(pd.DataFrame(dic))
Empty DataFrame
Columns: [id, sales_attributes, seller_sku]
Index: []
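This is also why pd.json_normalize() gives the result you wanted: it treats a single dict as one record (one row) rather than as a mapping of column name to column values. Passing a list containing the dict to the DataFrame constructor should behave the same way. A minimal sketch; the names `df_records` and `df_norm` are only illustrative:

```python
import pandas as pd

dic = {'id': '123', 'sales_attributes': [], 'seller_sku': '123'}

# A list of dicts is read as a list of rows, so the empty list
# becomes a cell value rather than the column itself.
df_records = pd.DataFrame([dic])

# json_normalize treats a lone dict as a single record.
df_norm = pd.json_normalize(dic)

print(df_records)
print(df_norm)
```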