I first calculated monthly sum by counting IDs within a group using group-by and mutate. However, as I try to use cumsum to calculate monthly cumulative sum by group using cumsum, the data just does not look right. It would be really helpful if someone can help me. Thanks~
Assuming original data frame looks like so:
person id | company | yyyy | month |
---|---|---|---|
1 | A | 2011 | January |
2 | A | 2011 | January |
3 | A | 2011 | Feburary |
4 | A | 2011 | Feburary |
5 | A | 2011 | Feburary |
6 | A | 2011 | March |
7 | B | 2011 | January |
8 | B | 2011 | January |
9 | B | 2011 | January |
10 | B | 2011 | Feburary |
11 | B | 2011 | Feburary |
12 | B | 2011 | Feburary |
13 | B | 2011 | Feburary |
14 | B | 2011 | March |
15 | B | 2011 | March |
16 | B | 2011 | April |
I would like to have a final output data frame like the one below.
person id | company | yyyy | month | monthly sum | cum monthly sum |
---|---|---|---|---|---|
1 | A | 2011 | January | 2 | 2 |
2 | A | 2011 | January | 2 | 2 |
3 | A | 2011 | Feburary | 2 | 4 |
4 | A | 2011 | Feburary | 2 | 4 |
5 | A | 2011 | Feburary | 2 | 4 |
6 | A | 2011 | March | 1 | 5 |
7 | B | 2011 | January | 3 | 3 |
8 | B | 2011 | January | 3 | 3 |
9 | B | 2011 | January | 3 | 3 |
10 | B | 2011 | Feburary | 4 | 7 |
11 | B | 2011 | Feburary | 4 | 7 |
12 | B | 2011 | Feburary | 4 | 7 |
13 | B | 2011 | Feburary | 4 | 7 |
14 | B | 2011 | March | 2 | 9 |
15 | B | 2011 | March | 2 | 9 |
16 | B | 2011 | April | 1 | 10 |