I am relatively new to pandas and have a dataset whose values I need to categorise within a subset of the data (the year).
The dataset has an individual row for each reported date, and I need to bin each row into one of seven categories (eight bin edges).
The current project is here: Scottish Crime Pandas Project, where you can find the current notebook and a parquet file of the dataset in its current state.
Here is a sample of the dataset:
code | area | area_type | crime_category | count | count_per_10k | date_start | date_end | count_category_historic |
---|---|---|---|---|---|---|---|---|
S12000005 | Clackmannanshire | Council Area | Offences: Group 8: Speeding | 277 | 54 | 2013 | 2014 | Minimal |
S12000006 | Dumfries and Galloway | Council Area | Offences: Group 8: Speeding | 5474 | 364 | 2013 | 2014 | Extremely High |
S12000042 | Dundee City | Council Area | Offences: Group 8: Speeding | 1487 | 100 | 2013 | 2014 | Very Low |
S12000005 | Clackmannanshire | Council Area | Offences: Group 8: Speeding | 209 | 41 | 2014 | 2015 | Minimal |
S12000006 | Dumfries and Galloway | Council Area | Offences: Group 8: Speeding | 5478 | 365 | 2014 | 2015 | Extremely High |
S12000042 | Dundee City | Council Area | Offences: Group 8: Speeding | 857 | 58 | 2014 | 2015 | Minimal |
The date_start column should define the subsets (one per year), and within each year the rows should be binned on the count_per_10k column.
I have successfully categorised each row using np.linspace and pd.cut for the whole historic dataset. This has been done by:
count_10k_bins = np.linspace(min(df_crime['count_per_10k']), max(df_crime['count_per_10k']), 8)
count_10k_bins_names = ['Minimal', 'Very Low', 'Low', 'Medium', 'High', 'Very High', 'Extremely High']
df_crime['count_category_historic'] = pd.cut(df_crime['count_per_10k'],
                                             count_10k_bins,
                                             labels=count_10k_bins_names,
                                             include_lowest=True)
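For anyone who wants to reproduce this without downloading the parquet, here is a minimal self-contained version built from the six sample rows above. The bin edges are computed from these six rows only, so they are not the edges used on the full dataset.

```python
import numpy as np
import pandas as pd

# Six sample rows copied from the table above (only the columns needed here).
df_crime = pd.DataFrame({
    "area": ["Clackmannanshire", "Dumfries and Galloway", "Dundee City"] * 2,
    "count_per_10k": [54, 364, 100, 41, 365, 58],
    "date_start": [2013, 2013, 2013, 2014, 2014, 2014],
})

count_10k_bins_names = ["Minimal", "Very Low", "Low", "Medium",
                        "High", "Very High", "Extremely High"]

# Eight equally spaced edges between the overall min and max give seven bins,
# one per label.
count_10k_bins = np.linspace(df_crime["count_per_10k"].min(),
                             df_crime["count_per_10k"].max(),
                             len(count_10k_bins_names) + 1)

# include_lowest=True keeps the row holding the overall minimum in the first bin.
df_crime["count_category_historic"] = pd.cut(df_crime["count_per_10k"],
                                             count_10k_bins,
                                             labels=count_10k_bins_names,
                                             include_lowest=True)

print(df_crime)
```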
Please see this image for a sample of the dataset – Dataset Example
I cannot work out how to categorise each row using bins calculated only from the rows of the year it falls in (its date_start value), rather than from the whole dataset.
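To make the goal concrete, here is a rough sketch of the kind of per-year binning I mean, using the six sample rows. I have not confirmed it against the full dataset. The helper `categorise_within_year` and the column name `count_category_year` are names I made up for illustration, and it assumes every year has at least two distinct count_per_10k values (otherwise the bin edges would not be unique and pd.cut would raise an error).

```python
import numpy as np
import pandas as pd

# Six sample rows copied from the table above (only the columns needed here).
df_crime = pd.DataFrame({
    "area": ["Clackmannanshire", "Dumfries and Galloway", "Dundee City"] * 2,
    "count_per_10k": [54, 364, 100, 41, 365, 58],
    "date_start": [2013, 2013, 2013, 2014, 2014, 2014],
})

count_10k_bins_names = ["Minimal", "Very Low", "Low", "Medium",
                        "High", "Very High", "Extremely High"]


def categorise_within_year(values: pd.Series) -> pd.Series:
    """Hypothetical helper: bin one year's values on that year's own min and max."""
    # Eight equally spaced edges for this year only, giving seven bins.
    edges = np.linspace(values.min(), values.max(), len(count_10k_bins_names) + 1)
    return pd.cut(values, edges, labels=count_10k_bins_names, include_lowest=True)


# transform calls the helper once per date_start group and returns the results
# aligned to the original row order.
df_crime["count_category_year"] = (
    df_crime.groupby("date_start")["count_per_10k"]
            .transform(categorise_within_year)
)

print(df_crime)
```

The only difference from the historic version is that np.linspace is evaluated inside each date_start group, so the same count_per_10k value can land in different categories in different years.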
Max Brown is a new contributor to this site.