I have a pandas dataframe that looks like this:
type     location  pass  enrolled
student  US        yes   highschool
student  CA        yes   highschool
teacher  US        yes   college
student  US        no    highschool
student  CA        yes   college
student  CA        no    college
I want to compute the pass rate for each combination of type, location, and enrolled, so the result looks something like this:
type     location  enrolled    pass_rate
student  US        highschool  0.5
student  CA        highschool  1.0
student  CA        college     0.5
teacher  US        college     1.0
To create the above dataframe:
import pandas as pd
list_of_dict = [
    {"type": "student", "location": "US", "pass": "yes", "enrolled": "highschool"},
    {"type": "student", "location": "CA", "pass": "yes", "enrolled": "highschool"},
    {"type": "teacher", "location": "US", "pass": "yes", "enrolled": "college"},
    {"type": "student", "location": "US", "pass": "no", "enrolled": "highschool"},
    {"type": "student", "location": "CA", "pass": "yes", "enrolled": "college"},
    {"type": "student", "location": "CA", "pass": "no", "enrolled": "college"},
]
df = pd.DataFrame(list_of_dict)
I know I need to get the .count() while grouping by “type”, “location” and “enrolled”.
I’ve tried
df = df.groupby(["type", "location", "enrolled"]).count().mean()
but this just collapses everything into a single number instead of giving one pass rate per group.
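For reference, here is a minimal sketch of one way the per-group pass rate might be computed. It is not the only approach. It rebuilds the six rows from the table at the top of the question, and it assumes that "pass" only ever holds "yes" or "no". The helper column name `passed` and the variable `pass_rate` are made up for illustration. The idea is to turn "pass" into a boolean first, because the mean of a boolean column within each group is the fraction of "yes" rows.

```python
import pandas as pd

# The six rows from the table at the top of the question.
df = pd.DataFrame(
    [
        {"type": "student", "location": "US", "pass": "yes", "enrolled": "highschool"},
        {"type": "student", "location": "CA", "pass": "yes", "enrolled": "highschool"},
        {"type": "teacher", "location": "US", "pass": "yes", "enrolled": "college"},
        {"type": "student", "location": "US", "pass": "no", "enrolled": "highschool"},
        {"type": "student", "location": "CA", "pass": "yes", "enrolled": "college"},
        {"type": "student", "location": "CA", "pass": "no", "enrolled": "college"},
    ]
)

pass_rate = (
    # Hypothetical helper column: True where pass == "yes", False otherwise.
    df.assign(passed=df["pass"].eq("yes"))
    # One group per (type, location, enrolled) combination.
    .groupby(["type", "location", "enrolled"])["passed"]
    # The mean of booleans is the share of True values, i.e. the pass rate.
    .mean()
    # Turn the group keys back into columns and name the result column.
    .reset_index(name="pass_rate")
)

print(pass_rate)
```

By default groupby sorts the group keys, so the rows come out in a different order from the hand-written table above, but the rates per group should be the same.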