I have a pandas DataFrame with data like this:
Name,Year,Value
----------------
Wendys,2021,6915
Wendys,2021,6916
Wendys,2022,6943
Wendys,2022,7016
Wendys,2022,7026
Wendys,2022,7028
Wendys,2023,7055
Wendys,2023,7128
Wendys,2023,7129
Wendys,2023,7138
Wendys,2023,7140
Wendys,2024,7166
White Castle,1950,88
White Castle,1950,90
White Castle,1951,91
White Castle,1951,92
White Castle,1951,93
I want to transform it to this:
Name,Year,Value
----------------
Wendys,2021,6915.5
Wendys,2022,7021
Wendys,2023,7129
Wendys,2024,7166
White Castle,1950,89
White Castle,1951,92
Note above:
6915.5 is median(6915,6916)
7129 is median(7055,7128,7129,7138,7140)
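As a quick sanity check of the two medians noted above, here is a minimal sketch using only Python's standard library. The variable names are illustrative and not part of the original data.

```python
from statistics import median

# Values copied from the Wendys 2021 and Wendys 2023 rows above.
wendys_2021 = [6915, 6916]
wendys_2023 = [7055, 7128, 7129, 7138, 7140]

# Even number of values: the median is the mean of the two middle values.
print(median(wendys_2021))

# Odd number of values: the median is the single middle value.
print(median(wendys_2023))
```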
You can use df.groupby(["Name", "Year"])["Value"].median(). Full example:
import pandas as pd
data = {
    "Name": ["Wendys", "Wendys", "Wendys", "Wendys", "Wendys", "Wendys", "Wendys", "Wendys", "Wendys", "Wendys", "Wendys",
             "Wendys", "White Castle", "White Castle", "White Castle", "White Castle", "White Castle"],
    "Year": [2021, 2021, 2022, 2022, 2022, 2022, 2023, 2023, 2023, 2023, 2023, 2024, 1950, 1950, 1951, 1951, 1951],
    "Value": [6915, 6916, 6943, 7016, 7026, 7028, 7055, 7128, 7129, 7138, 7140, 7166, 88, 90, 91, 92, 93]
}
df = pd.DataFrame(data)
df.groupby(["Name", "Year"])["Value"].median()
The result is a Series indexed by Name and Year:
Name          Year
Wendys        2021    6915.5
              2022    7021.0
              2023    7129.0
              2024    7166.0
White Castle  1950      89.0
              1951      92.0
Or, to keep Name and Year as regular columns, pass as_index=False:
df.groupby(["Name", "Year"], as_index=False)["Value"].median()
The result is then a DataFrame:
           Name  Year   Value
0        Wendys  2021  6915.5
1        Wendys  2022  7021.0
2        Wendys  2023  7129.0
3        Wendys  2024  7166.0
4  White Castle  1950    89.0
5  White Castle  1951    92.0
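If you already have the grouped Series from the first form, calling reset_index() on it should give the same flat layout as as_index=False. The sketch below uses a small subset of the question's data; the names flat and direct are illustrative only.

```python
import pandas as pd

# Small subset of the question's data, enough to compare both call styles.
df = pd.DataFrame({
    "Name": ["Wendys", "Wendys", "White Castle", "White Castle", "White Castle"],
    "Year": [2021, 2021, 1951, 1951, 1951],
    "Value": [6915, 6916, 91, 92, 93],
})

# Grouped Series with a (Name, Year) MultiIndex, flattened back into columns.
flat = df.groupby(["Name", "Year"])["Value"].median().reset_index()

# Same grouping, but asking groupby not to move the keys into the index.
direct = df.groupby(["Name", "Year"], as_index=False)["Value"].median()

print(flat)
print(direct)
```

Either form works; as_index=False just saves the extra reset_index() call when a plain DataFrame is what you want.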