Consider the following DataFrame with a multi-level column index:
import numpy as np
import pandas as pd
arrays = [
    ["A", "A", "B", "B"],
    ["one", "two", "one", "two"],
    ["1", "2", "3", "pd.NA"],
]
idx = pd.MultiIndex.from_arrays(arrays, names=["level_0", "level_1", "level_2"])
data = np.random.randn(3, 4)
df = pd.DataFrame(data, columns=idx)
print(df)
level_0         A                   B
level_1       one       two       one       two
level_2         1         2         3     pd.NA
0       -1.249285  0.314225  0.011139  0.675274
1       -0.654808 -0.492350  0.596338 -0.087334
2        0.113570  0.566687 -0.361334  0.085368
level_2 holds values of type object (str, really).
df.columns.get_level_values(2)
Index(['1', '2', '3', 'pd.NA'], dtype='object', name='level_2')
I need to parse it to the correct data type and change this particular column level.
new_level_2 = [
    pd.NA if x == "pd.NA" else int(x) for x in df.columns.get_level_values(2)
]
I am looking for a pythonic way to replace the old level_2 with new_level_2.
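
For illustration, one possible approach is to rebuild the column index from its per-column level values, swapping in the parsed values for the third level. This is only a sketch, not necessarily the most idiomatic answer. It assumes that replacing df.columns wholesale is acceptable, and the nullable "Int64" dtype is my own choice for holding integers alongside pd.NA, not something stated in the question.

```python
import numpy as np
import pandas as pd

arrays = [
    ["A", "A", "B", "B"],
    ["one", "two", "one", "two"],
    ["1", "2", "3", "pd.NA"],
]
idx = pd.MultiIndex.from_arrays(arrays, names=["level_0", "level_1", "level_2"])
df = pd.DataFrame(np.random.randn(3, 4), columns=idx)

# Parse the string labels; the literal string "pd.NA" becomes a real pd.NA.
# Assumption: a nullable integer array ("Int64") is the desired target type.
new_level_2 = pd.array(
    [pd.NA if x == "pd.NA" else int(x) for x in df.columns.get_level_values(2)],
    dtype="Int64",
)

# Collect one array of labels per level, then swap in the parsed third level.
level_arrays = [df.columns.get_level_values(i) for i in range(df.columns.nlevels)]
level_arrays[2] = new_level_2

# Rebuild the MultiIndex, keeping the original level names.
df.columns = pd.MultiIndex.from_arrays(level_arrays, names=df.columns.names)

print(df.columns.get_level_values(2))
```

MultiIndex.set_levels might also work, but it operates on the unique level values rather than on one value per column, so rebuilding the index seemed simpler to reason about here.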