I have a dataset that contains one type of transaction for all stores over the last two years. Every possible combination of store, unique transaction_Classification, and time_day_id is captured. In many cases no transaction exists on a given day for a given classification, so the transaction count for that row is null. I want to build custom missing-value handling that treats stores differently based on their store type.
- Example: New stores would be treated like similar store types in their particular region. Handle missing values for new stores: (Group By: Store_Type, Region, Unique_Transaction_Classification, Month, Week_Day)
- Example: Stores with Store_Type FM would have their nulls treated as 0. Handle missing values for FM stores: (Where Store_Type = 'FM'), fillna(0)
I want to put the result in a new column, 'Replace_Null_Transaction_Count_Column', instead of overwriting the original:
- If an actual value exists in the original transaction count column, bring it over into the new column.
- All null values are handled with the custom logic, and the result is put in this new column.
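A minimal sketch of the behavior I am after, assuming a pandas DataFrame. The sample data, the `Transaction_Count` column name, the `Store_Classified` values, and the choice of the group mean as the "similar stores" value are all assumptions for illustration, not part of my real data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "Store_Classified": ["New", "Established", "Established",
                         "Established", "Established", "Established"],
    "Store_Type": ["SM", "SM", "SM", "SM", "FM", "FM"],
    "Region": ["East"] * 6,
    "Unique_Transaction_Classification": ["A"] * 6,
    "Month": [1] * 6,
    "Week_Day": ["Mon"] * 6,
    "Transaction_Count": [np.nan, 8.0, 10.0, 12.0, np.nan, 5.0],
})

group_cols = ["Store_Type", "Region", "Unique_Transaction_Classification",
              "Month", "Week_Day"]
new_col = "Replace_Null_Transaction_Count_Column"

# Start from the original column so existing values carry over unchanged.
df[new_col] = df["Transaction_Count"]
is_null = df["Transaction_Count"].isna()

# Rule for FM stores: nulls become 0.
fm_mask = is_null & (df["Store_Type"] == "FM")
df.loc[fm_mask, new_col] = 0

# Rule for new stores: use the mean of the observed counts in the same
# Store_Type / Region / classification / Month / Week_Day group.
# (Using the mean is an assumption; median or another statistic also works.)
group_mean = df.groupby(group_cols)["Transaction_Count"].transform("mean")
new_mask = is_null & (df["Store_Classified"] == "New") & ~fm_mask
df.loc[new_mask, new_col] = group_mean[new_mask]

print(df[["Store_Classified", "Store_Type", "Transaction_Count", new_col]])
```

The original `Transaction_Count` column is left untouched, so the nulls remain visible there for comparison.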
Example of data for imputing missing values by Store_Type & Store_Classified
I replaced the nulls in the same column using a group by on Store_Classified, Region, and Week_Day. This fills the nulls for new stores and established stores in the same manner, which I do not want. I want to handle nulls differently for different combinations of Store_Classified and Store_Type, applying one rule after another until all nulls have been handled. What is the best way to do this step?
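One possible shape for this, sketched under assumptions: an ordered list of (row filter, fill function) rules, where each rule only touches rows that are still null. The helper names `fill_with_rules`, `constant`, and `group_mean`, the sample data, the `Transaction_Count` column name, and the final catch-all fill of 0 are hypothetical and only illustrate the idea:

```python
import numpy as np
import pandas as pd

def constant(value):
    """Fill function that returns the same value for every row."""
    return lambda d: pd.Series(value, index=d.index, dtype="float64")

def group_mean(value_col, keys):
    """Fill function that returns the per-group mean of the observed values."""
    return lambda d: d.groupby(keys)[value_col].transform("mean")

def fill_with_rules(df, value_col, target_col, rules):
    """Apply (mask_fn, fill_fn) rules in order; each rule fills only rows
    that are still null in target_col and match the rule's mask."""
    out = df.copy()
    out[target_col] = out[value_col]  # actual values carry over unchanged
    for mask_fn, fill_fn in rules:
        todo = out[target_col].isna() & mask_fn(out)
        if todo.any():
            out.loc[todo, target_col] = fill_fn(out)[todo]
    return out

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "Store_Classified": ["New", "Established", "Established",
                         "Established", "Established", "New"],
    "Store_Type": ["SM", "SM", "SM", "SM", "FM", "SM"],
    "Region": ["East", "East", "East", "East", "East", "West"],
    "Week_Day": ["Mon", "Mon", "Mon", "Mon", "Mon", "Tue"],
    "Transaction_Count": [np.nan, 8.0, 12.0, np.nan, np.nan, np.nan],
})

rules = [
    # FM stores: nulls become 0.
    (lambda d: d["Store_Type"] == "FM", constant(0)),
    # New stores: mean of similar store types in the same region and weekday.
    (lambda d: d["Store_Classified"] == "New",
     group_mean("Transaction_Count", ["Store_Type", "Region", "Week_Day"])),
    # Established stores: mean within their own classification.
    (lambda d: d["Store_Classified"] == "Established",
     group_mean("Transaction_Count", ["Store_Classified", "Region", "Week_Day"])),
    # Catch-all (assumption): anything still null, e.g. a group with no
    # observed values, becomes 0.
    (lambda d: pd.Series(True, index=d.index), constant(0)),
]

result = fill_with_rules(df, "Transaction_Count",
                         "Replace_Null_Transaction_Count_Column", rules)
print(result)
```

Because each rule skips rows that an earlier rule already filled, the order of the list decides which rule wins when a store matches more than one.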