I have a dataset of students with their grades and I want to evaluate their evolution over the years.
I’ve already cleaned up the data, and I’ve associated each student with his or her quintile within the year (from 1 to 5) and the year of study (A1, A2 or A3).
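To make the layout concrete, here is a minimal sketch of the shape described above: one row per student per year. The column names `Student_ID`, `Year` and `Quintile` are the ones used in the code further down; the rows themselves are made-up toy values, not the real data.

```r
# Hypothetical toy rows (NOT the real data): one row per student per year.
toy <- data.frame(
  Student_ID = rep(c("S1", "S2", "S3"), each = 3),
  Year       = rep(c("A1", "A2", "A3"), times = 3),
  Quintile   = c(1, 2, 2, 3, 3, 4, 5, 4, 5)
)

# Same factor conversions as in the question's code.
toy$Year     <- factor(toy$Year, levels = c("A1", "A2", "A3"))
toy$Quintile <- factor(toy$Quintile, levels = 1:5)

str(toy)
```

In this shape every student appears exactly once per year, so the table is already "long" rather than one column per year.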
I want to generate a Sankey diagram to visualize their evolution, like the one in the image below.
[image: example Sankey diagram]
I work in R and tried the ggalluvial package, but I keep running into the same error.
Here’s my code so far:
library(ggplot2)
library(ggalluvial)
library(dplyr)
# Data loading
data <- read.csv("/mydata.csv")
# Converting Year and Quintile as factors
data$Year <- factor(data$Year, levels = c("A1", "A2", "A3"))
data$Quintile <- as.factor(data$Quintile)
data$individual <- as.factor(data$Student_ID)
# Check the structure of the data
str(data)
# Prepare the data for the alluvial plot
data_long <- data %>%
  select(individual, Year, Quintile) %>%
  group_by(individual, Year, Quintile) %>%
  summarise(Freq = n(), .groups = 'drop') %>%
  ungroup()
# Convert data to lodes format
data_lodes <- to_lodes_form(data_long, key = "Year", value = "Quintile", id = "individual")
# Create the alluvial plot
ggplot(data_lodes, aes(x = Year, stratum = Quintile, alluvium = individual,
                       y = Freq, fill = Quintile, label = Quintile)) +
  geom_flow(stat = "alluvium", aes(fill = Quintile), lode.guidance = "rightleft", color = "darkgray") +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("A1", "A2", "A3")) +
  theme_minimal() +
  labs(title = "Evolution of Student Distribution Across Quintiles",
       x = "Year", y = "Number of Students", fill = "Quintile")
And the error:
Error in `geom_flow()`:
! Problem while computing stat.
ℹ Error occurred in the 1st layer.
Caused by error in `setup_data()`:
! Data is not in a recognized alluvial form (see `help('alluvial-data')` for details).
Run `rlang::last_trace()` to see where the error occurred.
I’ve tried reformatting my data (in particular with the to_lodes_form() function) and converting the columns to factors, but I can’t solve the problem.
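One check that may help narrow this down, sketched here under an assumption: as far as I understand the lodes format described in `help('alluvial-data')`, each student should appear at most once per year, so a repeated (student, year) pair could be one reason the data is rejected. The data frame below is a hypothetical stand-in with a deliberate duplicate, not the real file, and this is only one possible cause.

```r
# Hypothetical stand-in for the real data: student S2 is listed twice in A2.
toy <- data.frame(
  Student_ID = c("S1", "S1", "S2", "S2", "S2"),
  Year       = c("A1", "A2", "A1", "A2", "A2"),
  Quintile   = c(1, 2, 3, 3, 4)
)

# Build one key per (student, year) pair.
pair <- paste(toy$Student_ID, toy$Year)

# Keep every row whose pair occurs more than once.
dup_rows <- toy[pair %in% pair[duplicated(pair)], ]

print(dup_rows)
```

Running the same lines on the real `data` (with its own column names) would list any students recorded more than once in a given year; an empty result would rule this cause out.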
Any ideas?
Thanks