
Tag Archive for pythonpandasdataframe

How to set value to a slice of a multi-index dataframe from another slice

I have a multi-index pandas dataframe, and I need to assign values to one slice of the dataframe (selected on one index level) from a calculation on another slice of the same dataframe (selected on the same index level).
I have tried to assign the values using loc, but the entire slice ends up NaN.
I have written this simple example code to make clear the problem I’m having.
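
A likely cause of the NaN result is index alignment: when a Series is assigned through loc, pandas matches it to the target rows by label, and the labels of one slice do not appear in the other. Below is a minimal sketch of one workaround, which strips the index with to_numpy() so the values are assigned by position. The frame, the level names "group" and "step", and the column "value" are all made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical two-level frame: "group" and "step" are assumed level names.
index = pd.MultiIndex.from_product(
    [["a", "b"], [0, 1, 2]], names=["group", "step"]
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]}, index=index)

idx = pd.IndexSlice

# Compute on the "a" slice, then drop its index so that pandas assigns
# by position instead of trying to align ("a", ...) labels with ("b", ...).
source = df.loc[idx["a", :], "value"] * 2
df.loc[idx["b", :], "value"] = source.to_numpy()

print(df)
```

This only works when both slices have the same number of rows in the same order, so it may be worth sorting the index first.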

Convert category structure to merge with other df

I have two dataframes. The first contains transactions, each with a category id; the categories are multi-layered, and the number of layers varies. The second contains the categories: for each category it has the category id and the parent id. I would like to prepare the second dataframe so that I can merge it with the first and then have all the category layers available in the first.
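
One possible approach, sketched under assumptions: walk from each category up to its root through the parent ids, spread the resulting path over one column per layer, and merge on the category id. The column names (category_id, parent_id, transaction_id, level_1, ...) and the sample rows are hypothetical, root categories are assumed to have a missing parent id, and the hierarchy is assumed to contain no cycles.

```python
import pandas as pd

# Hypothetical category table: a missing parent_id marks a root category.
categories = pd.DataFrame(
    {"category_id": [1, 2, 3, 4], "parent_id": [None, 1, 2, 1]}
)
# Hypothetical transactions table.
transactions = pd.DataFrame(
    {"transaction_id": [100, 101, 102], "category_id": [3, 4, 1]}
)

# Map each category to its parent (None for roots).
parent_of = {
    int(row.category_id): int(row.parent_id) if pd.notna(row.parent_id) else None
    for row in categories.itertuples(index=False)
}


def path_to_root(category_id):
    """Return the list of category ids from the root down to category_id."""
    path = []
    current = category_id
    while current is not None:  # assumes the hierarchy has no cycles
        path.append(current)
        current = parent_of.get(current)
    return path[::-1]


paths = {cid: path_to_root(cid) for cid in parent_of}
depth = max(len(p) for p in paths.values())

# One row per category, one column per layer, padded for shallow categories.
flat = pd.DataFrame(
    [[cid] + p + [None] * (depth - len(p)) for cid, p in paths.items()],
    columns=["category_id"] + [f"level_{i + 1}" for i in range(depth)],
)

merged = transactions.merge(flat, on="category_id", how="left")
print(merged)
```

Categories with fewer layers than the deepest one end up with missing values in the trailing level columns.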

sampling unbalanced data frame columns

I have a data frame df with five columns, ‘A’, ‘B’, ‘C’, ‘D’, and ‘E’, all containing Python strings. Currently, columns ‘B’, ‘C’, ‘D’, and ‘E’ have unbalanced unique values (i.e., some unique values have more rows than others). How can I sample df so that columns ‘B’, ‘C’, ‘D’, and ‘E’ each have a balanced number of unique values (i.e., each unique value in a given column has the same number of rows)? I want to sample with replacement so that the resulting data frame has the same length as the original, though some rows may be duplicated and some may be omitted. Thanks!
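
A rough sketch of two options, not a definitive answer. Exact balance is straightforward for a single column with a grouped sample; for several columns at once, exact balance may not be achievable, so the second helper only approximates it by weighting each row with the inverse frequency of its values. The helper names balance_one_column and balance_many_columns and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical data with two of the unbalanced columns.
df = pd.DataFrame(
    {
        "A": ["r1", "r2", "r3", "r4", "r5", "r6"],
        "B": ["x", "x", "x", "x", "y", "y"],
        "C": ["p", "q", "q", "q", "q", "p"],
    }
)


def balance_one_column(frame, column, random_state=None):
    """Draw the same number of rows, with replacement, for each unique value."""
    groups = frame.groupby(column)
    per_group = len(frame) // groups.ngroups
    return groups.sample(n=per_group, replace=True, random_state=random_state)


def balance_many_columns(frame, columns, random_state=None):
    """Approximate balance: weight rows by the inverse frequency of their values."""
    weights = 1.0
    for col in columns:
        weights = weights / frame[col].map(frame[col].value_counts())
    # DataFrame.sample normalizes the weights so that they sum to 1.
    return frame.sample(
        n=len(frame), replace=True, weights=weights, random_state=random_state
    )


exact = balance_one_column(df, "B", random_state=0)
approx = balance_many_columns(df, ["B", "C"], random_state=0)
print(exact["B"].value_counts())
print(approx[["B", "C"]])
```

The grouped version returns exactly len(df) rows only when the number of unique values divides the row count evenly. The weighted version always returns len(df) rows but is balanced only in expectation, and less so when the columns are correlated.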

How to divide values in columns in one dataframe by the same value in another df in Pandas?

I want to divide all values in particular columns of the dataframe rpk by the matching value from the dataframe scaling_factor, according to sample_name. I know how to do it for a single column (e.g. all values in the column ‘P1-6’ of rpk should be divided by 2, the factor for ‘P1-6’ in the scaling_factor dataframe), but how do I do it for all samples?
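
A minimal sketch of one way to do this, assuming that scaling_factor has one row per sample with columns sample_name and factor, and that the sample names match column names in rpk. The gene column, the second sample 'P1-12', and all the numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical inputs: "gene" and "P1-12" are assumptions, not from the question.
rpk = pd.DataFrame(
    {"gene": ["g1", "g2"], "P1-6": [4.0, 8.0], "P1-12": [9.0, 3.0]}
)
scaling_factor = pd.DataFrame(
    {"sample_name": ["P1-6", "P1-12"], "factor": [2.0, 3.0]}
)

# Turn the factor table into a Series indexed by sample name.
factors = scaling_factor.set_index("sample_name")["factor"]

# Only touch the columns of rpk that have a factor.
sample_cols = [col for col in rpk.columns if col in factors.index]

scaled = rpk.copy()
# div aligns the Series index with the column labels, so each column
# is divided by its own factor.
scaled[sample_cols] = rpk[sample_cols].div(factors[sample_cols], axis="columns")

print(scaled)
```

Columns without a matching sample_name, such as the gene column here, are left unchanged.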