I have a Pandas dataframe like this:
>>> df = pd.DataFrame({'2012':['A','A','B','A'],'2013':['A','B','C','C'],'2014':['A','C','Z','C']})
>>> df
2012 2013 2014
0 A A A
1 A B C
2 B C Z
3 A C C
From it, I need to create another dataframe like this:
2012 2013 2014
A 3 1 1
B 1 1 0
C 0 2 2
where I am basically counting some (A,B,C but not Z) of the occurrences of the labels in every column, turning them into indexes, and showing their count per year.
I did come out with a solution that involves iteration:
>>> indexes = ['A','B','C']
>>> for idx in indexes:
`df2.loc[idx] = (df == idx).sum()
>>> df2
2012 2013 2014
A 3 1 1
B 1 1 0
C 0 2 2
This is exactly what I need. But I wonder, is there a way to do it in one shot without iteration?
Thanks in advance to whoever may help!
I played around with values_counts(), pivot_table() and groupby() without success.
All Google searches I found point to this type of count but across one column only.
Eventually I found the solution but it involves iteration.
nicobar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.