Background and goal:
I have a start and an end distribution of market data: bonds with different coupons and maturities, each of which sees inflows to and outflows from the market. The problem is that the start and end totals are so much larger than the individual changes between them that the resulting chart is not useful. That is why I am looking for ideas to improve it.
A reproducible example:
library(ggplot2)
library(scales) # provides comma(), used for the bar labels
result <- data.frame(
  PlotColumn = c("Start Distribution", "1 50", "1 50 IO10", "1 53", "1 53 IO10", "1 53 IO30", "1.5 52",
                 "1.5 52 IO10", "3.5 53", "4 53", "4 56", "5 53", "5 53 IO10",
                 "5 56", "5 56 IO10", "Residuals", "End Distribution"),
  net_inflow = c(1171006120490, -3424160223, -3249962038, -1038146156, -673687115, -170610985,
                 -1176478066, -693687721, -342259850, -486942250, 13478909889,
                 -1713949813, -750447300, 433740682, 471070681, -1018816544, 1170650693681)
)
waterfall <- function(data) {
  desc_col <- names(data)[1]
  amount_col <- names(data)[2]
  data[[desc_col]] <- factor(data[[desc_col]], levels = data[[desc_col]])
  data$id <- seq_along(data[[amount_col]])
  data$type <- ifelse(data[[amount_col]] > 0, "increase", "decrease")
  data[data$id %in% c(1, nrow(data)), "type"] <- "net" # Set first and last as net
  data$end <- cumsum(data[[amount_col]])
  data$end <- c(head(data$end, -1), 0)
  data$start <- c(0, head(data$end, -1))
  data$type <- factor(data$type, levels = c("decrease", "increase", "net"))
  p <- ggplot(data, aes(x = as.numeric(id), fill = type)) +
    geom_rect(aes(xmin = id - 0.45, xmax = id + 0.45, ymin = end, ymax = start)) +
    scale_x_continuous(breaks = data$id, labels = data[[desc_col]]) +
    scale_fill_manual(values = c("decrease" = "red", "increase" = "green", "net" = "blue")) +
    geom_text(aes(x = id, y = ifelse(type == "increase", end, start),
                  label = comma(data[[amount_col]]),
                  vjust = ifelse(data[[amount_col]] > 0, -0.3, 1.3)),
              size = 3, fontface = "bold") +
    labs(x = NULL, y = NULL, fill = NULL) +
    theme_minimal() +
    theme(legend.position = "none", axis.text.x = element_text(angle = 45, hjust = 1))
  return(p)
}
waterfall(result)
Which yields the following graph:
Question
This graph is clearly not very useful, because the changes between the start and end distribution are tiny compared with the totals. Do you have any ideas on how to improve it? Rescale the values? Split the graph in two? Something else? Or is the data itself simply not well suited to this type of chart?
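To put numbers on the scale problem, here is a small sketch that expresses each flow as a percentage of the start total. The values are copied from `result` above; the names `start_total`, `flows` and `pct_of_start` are only introduced for this illustration.

```r
# Values copied from the `result` data frame in the question.
start_total <- 1171006120490
flows <- c(-3424160223, -3249962038, -1038146156, -673687115, -170610985,
           -1176478066, -693687721, -342259850, -486942250, 13478909889,
           -1713949813, -750447300, 433740682, 471070681, -1018816544)

# Each individual flow as a percentage of the start total.
pct_of_start <- 100 * flows / start_total
print(round(range(pct_of_start), 2))

# Net change over the whole period, also as a percentage of the start total.
print(round(100 * sum(flows) / start_total, 3))
```

Every individual flow is on the order of 1% of the start total or less, which is why the intermediate bars all but disappear when the y-axis starts at zero.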