I’m trying to sort this DataFrame chronologically by quarter, using the list sortTo as the reference order, so that I can put it into a table.
import pandas as pd
# Sample DataFrame
data = {'QuarterYear': ["Q1 2024", "Q2 2024", "Q3 2023", 'Q3 2024', "Q4 2023", "Q4 2024"], 'data1': [5, 6, 2, 1, 10, 3], 'data2': [12, 4, 2, 7, 2, 9]}
sortTo = ["Q3 2023", "Q4 2023", "Q1 2024", 'Q2 2024', "Q3 2024", "Q4 2024"]
df = pd.DataFrame(data)
df.reindex(sortTo)
I’ve tried reindex and sort_values to no avail. I cannot use np.sort because the quarters are not numerical.
Current output:
  QuarterYear  data1  data2
0     Q1 2024      5     12
1     Q2 2024      6      4
2     Q3 2023      2      2
3     Q3 2024      1      7
4     Q4 2023     10      2
5     Q4 2024      3      9
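For context, reindex matches its argument against the index labels (0 to 5 here), not against the values of a column, which is likely why the attempt above returns only NaN rows. A minimal sketch, assuming the same sample data, of how reindex behaves once QuarterYear is moved into the index (the names all_missing and reordered are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'QuarterYear': ["Q1 2024", "Q2 2024", "Q3 2023", "Q3 2024", "Q4 2023", "Q4 2024"],
    'data1': [5, 6, 2, 1, 10, 3],
    'data2': [12, 4, 2, 7, 2, 9],
})
sortTo = ["Q3 2023", "Q4 2023", "Q1 2024", "Q2 2024", "Q3 2024", "Q4 2024"]

# reindex looks the labels up in the index (0..5), so no quarter string matches
all_missing = df.reindex(sortTo)['data1'].isna().all()

# With QuarterYear as the index, reindex can find the labels and reorder the rows
reordered = df.set_index('QuarterYear').reindex(sortTo).reset_index()
print(reordered)
```

This resets the row labels to 0..5, so it fits best when the original index does not need to be kept.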
Not sure if there is any constraint, but why not convert the column to a category and sort? If the categories come out in the expected order, a simple .sort_values would work without the extra list “sortTo”.
df['QuarterYear'] = df['QuarterYear'].astype("category")
df.sort_values(['QuarterYear'], ignore_index=True)
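One thing worth checking before relying on this: when no explicit categories are given, astype("category") infers them in sorted (lexical) order, so for labels such as "Q1 2024" the quarter number is compared before the year. A small sketch to inspect the inferred order on the sample labels (the variable names are only illustrative):

```python
import pandas as pd

quarters = pd.Series(["Q1 2024", "Q2 2024", "Q3 2023", "Q3 2024", "Q4 2023", "Q4 2024"])

# Categories are inferred from the unique values, in lexical order
as_cat = quarters.astype("category")
inferred_order = list(as_cat.cat.categories)
print(inferred_order)

# Sorting therefore follows the lexical order, not the chronological one
sorted_labels = as_cat.sort_values(ignore_index=True).tolist()
print(sorted_labels)
```

If the inferred order is not the one you want, the categories need to be given explicitly, for example from a list like sortTo.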
I would recommend converting your strings to quarterly Periods. Then you’ll be able to sort naturally:
df['QuarterYear'] = pd.PeriodIndex(
    df['QuarterYear'].str.replace(r'(Q\d) (\d{4})', r'\2-\1', regex=True),
    freq='Q',
)
out = df.sort_values(by='QuarterYear')
Output:
  QuarterYear  data1  data2
2      2023Q3      2      2
4      2023Q4     10      2
0      2024Q1      5     12
1      2024Q2      6      4
3      2024Q3      1      7
5      2024Q4      3      9
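If the table should still show the original "Q3 2023" style labels after sorting, the periods can be formatted back to strings. A minimal sketch, assuming the same sample data and the default calendar-year quarters; the format string 'Q%q %Y' and the names periods and labels are choices made here, not part of the question:

```python
import pandas as pd

df = pd.DataFrame({
    'QuarterYear': ["Q1 2024", "Q2 2024", "Q3 2023", "Q3 2024", "Q4 2023", "Q4 2024"],
    'data1': [5, 6, 2, 1, 10, 3],
    'data2': [12, 4, 2, 7, 2, 9],
})

# "Q3 2023" -> "2023-Q3", which pandas can parse as a quarterly period
periods = pd.PeriodIndex(
    df['QuarterYear'].str.replace(r'(Q\d) (\d{4})', r'\2-\1', regex=True),
    freq='Q',
)

# Sort on the period values, then format them back for display
out = df.assign(QuarterYear=periods).sort_values('QuarterYear', ignore_index=True)
labels = out['QuarterYear'].dt.strftime('Q%q %Y').tolist()
out['QuarterYear'] = labels
print(out)
```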
If you insist on the custom order from a list, use a CategoricalDtype:
cat = pd.CategoricalDtype(sortTo, ordered=True)
out = df.astype({'QuarterYear': cat}).sort_values(by='QuarterYear')
Output:
  QuarterYear  data1  data2
2     Q3 2023      2      2
4     Q4 2023     10      2
0     Q1 2024      5     12
1     Q2 2024      6      4
3     Q3 2024      1      7
5     Q4 2024      3      9
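A caveat with the list-based approach: any value that does not appear in the list becomes NaN when cast to the categorical dtype, and sort_values places NaN last by default. A short sketch with a deliberately incomplete list (the shortened list and the label "Q2 2025" are hypothetical, used only to show the effect):

```python
import pandas as pd

# Hypothetical, shortened reference list
short_order = ["Q3 2023", "Q4 2023", "Q1 2024"]
cat = pd.CategoricalDtype(short_order, ordered=True)

# "Q2 2025" is not in short_order, so it cannot be mapped to a category
labels = pd.Series(["Q1 2024", "Q2 2025", "Q3 2023"])
converted = labels.astype(cat)

n_missing = int(converted.isna().sum())
sorted_labels = converted.sort_values().tolist()
print(n_missing, sorted_labels)
```

Checking converted.isna() after the cast is a cheap way to catch labels missing from the reference list.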
Code
Use the key parameter for custom sorting.
out = df.sort_values(
    'QuarterYear',
    key=lambda x: x.map({k: n for n, k in enumerate(sortTo)})
)
out:
  QuarterYear  data1  data2
2     Q3 2023      2      2
4     Q4 2023     10      2
0     Q1 2024      5     12
1     Q2 2024      6      4
3     Q3 2024      1      7
5     Q4 2024      3      9
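The same idea can be wrapped in a small helper if several frames need sorting against a reference list. This is only a sketch; the function name sort_by_list and its parameters are made up for illustration. The key callable receives the whole column as a Series and should return a Series of the same length, which is what Series.map does here:

```python
import pandas as pd

def sort_by_list(frame, column, order):
    """Sort `frame` so that `column` follows the positions of its values in `order`."""
    # Map each label to its position in the reference list
    rank = {label: pos for pos, label in enumerate(order)}
    # key gets the column as a Series; map turns labels into sortable integers
    return frame.sort_values(column, key=lambda s: s.map(rank))

df = pd.DataFrame({
    'QuarterYear': ["Q1 2024", "Q2 2024", "Q3 2023", "Q3 2024", "Q4 2023", "Q4 2024"],
    'data1': [5, 6, 2, 1, 10, 3],
    'data2': [12, 4, 2, 7, 2, 9],
})
sortTo = ["Q3 2023", "Q4 2023", "Q1 2024", "Q2 2024", "Q3 2024", "Q4 2024"]

result = sort_by_list(df, 'QuarterYear', sortTo)
print(result)
```

Unlike the categorical approach, this leaves the column's dtype unchanged.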
If your data consists of years and quarters, the following code should do the sorting you need without sortTo.
out = df.sort_values(
    'QuarterYear',
    key=lambda x: x.str.replace(r'(Q\d) (\d+)', r'\2 \1', regex=True)
)
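To see why this works, it can help to look at the intermediate sort keys: the replacement swaps the two captured groups, so "Q3 2023" becomes "2023 Q3", and plain string comparison then orders by year first and quarter second. A sketch on the sample data, assuming four-digit years and single-digit quarters (the names pattern and keys are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'QuarterYear': ["Q1 2024", "Q2 2024", "Q3 2023", "Q3 2024", "Q4 2023", "Q4 2024"],
    'data1': [5, 6, 2, 1, 10, 3],
    'data2': [12, 4, 2, 7, 2, 9],
})

pattern = r'(Q\d) (\d+)'

# Swap quarter and year so that lexical order equals chronological order
keys = df['QuarterYear'].str.replace(pattern, r'\2 \1', regex=True)
print(keys.tolist())

# The key is only used for comparison; the column keeps its original labels
out = df.sort_values(
    'QuarterYear',
    key=lambda s: s.str.replace(pattern, r'\2 \1', regex=True),
)
print(out)
```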
You can convert the ‘QuarterYear’ column to a categorical type with the order specified in ‘sortTo’:
sortTo = ["Q3 2023", "Q4 2023", "Q1 2024", 'Q2 2024', "Q3 2024", "Q4 2024"]
df['QuarterYear'] = pd.Categorical(df['QuarterYear'], categories=sortTo, ordered=True)
df_sorted = df.sort_values(by='QuarterYear')
Output
  QuarterYear  data1  data2
2     Q3 2023      2      2
4     Q4 2023     10      2
0     Q1 2024      5     12
1     Q2 2024      6      4
3     Q3 2024      1      7
5     Q4 2024      3      9