I have a dataset with two categorical variables and would like a bar plot whose y-axis shows percentages and whose bar labels show both the percentage and N (the number of observations). I have explored seaborn's countplot,
which gets close to what I want, except that the y-axis shows counts rather than percentages and the bar labels are not what I need.
Below is the code:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
data = {
'id': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 20, 20],
'survey': ['baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline', 'baseline', 'endline'],
'level': ['low', 'high', 'medium', 'low', 'high', 'medium', 'medium', 'high', 'low', 'low', 'medium', 'high', 'low', 'medium', 'low', 'high', 'low', 'low', 'medium', 'high', 'high', 'high', 'high', 'medium', 'low', 'low', 'medium', 'high', 'low', 'medium', 'high', 'medium', 'low', 'high', 'high', 'medium', 'medium', 'low', 'high', 'low']
}
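df = pd.DataFrame(data)
# Number of rows per survey round (20 baseline, 20 endline)
df.survey.value_counts()
# Grouped count plot: one bar per level within each survey round
plt.figure(figsize=(8, 5))
sns.countplot(x='survey', hue='level', data=df, palette='rainbow')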
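To make the target concrete, here is a rough sketch of the kind of plot I am after. It is not necessarily the best approach. It assumes the percentages are taken within each survey round (so the baseline bars sum to 100% and the endline bars sum to 100%), which may or may not be the right denominator. The names counts, n, pct, survey_order and level_order are my own, and Axes.bar_label needs matplotlib 3.4 or newer.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.ticker import PercentFormatter

# Same 40 rows as in the question, built more compactly
levels = ['low', 'high', 'medium', 'low', 'high', 'medium', 'medium', 'high',
          'low', 'low', 'medium', 'high', 'low', 'medium', 'low', 'high',
          'low', 'low', 'medium', 'high', 'high', 'high', 'high', 'medium',
          'low', 'low', 'medium', 'high', 'low', 'medium', 'high', 'medium',
          'low', 'high', 'high', 'medium', 'medium', 'low', 'high', 'low']
df = pd.DataFrame({
    'id': np.repeat(np.arange(1, 21), 2),
    'survey': ['baseline', 'endline'] * 20,
    'level': levels,
})

survey_order = ['baseline', 'endline']
level_order = ['low', 'medium', 'high']

# Count rows per (survey, level), then convert to a percentage
# of the rows in the same survey round (assumed denominator)
counts = df.groupby(['survey', 'level']).size().rename('n').reset_index()
counts['pct'] = 100 * counts['n'] / counts.groupby('survey')['n'].transform('sum')

# Plot the precomputed percentages instead of letting countplot count
fig, ax = plt.subplots(figsize=(8, 5))
sns.barplot(data=counts, x='survey', y='pct', hue='level',
            order=survey_order, hue_order=level_order,
            palette='rainbow', ax=ax)
ax.set_ylabel('Percent')
ax.yaxis.set_major_formatter(PercentFormatter())

# Each container holds the bars of one hue level, in x-axis order
for container, level in zip(ax.containers, level_order):
    sub = counts[counts['level'] == level].set_index('survey').reindex(survey_order)
    labels = [f"{p:.1f}% (N={n})" for p, n in zip(sub['pct'], sub['n'])]
    ax.bar_label(container, labels=labels)
```

With this data the baseline bars would read 40.0% (N=8), 30.0% (N=6), 30.0% (N=6) for low, medium and high.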
Thanks in advance!