Alluvial plots are similar to Sankey diagrams and visualise categorical data over multiple dimensions as flows (Rosvall M, Bergstrom CT (2010) Mapping Change in Large Networks. PLoS ONE 5(1): e8694). Their graphical grammar, however, is a bit more complex than that of regular x/y plots. The 'ggalluvial' package does a great job of translating that grammar into 'ggplot2' syntax and gives you many options to tweak the appearance of an alluvial plot. However, a multi-layered complexity remains that makes it difficult to use 'ggalluvial' for exploratory data analysis. 'easyalluvial' provides a simple interface to this package that lets you produce a decent alluvial plot from any dataframe in either long or wide format with a single line of code, while also handling continuous data. It is meant to allow quick visualisation of entire dataframes, with a focus on different colouring options that can make alluvial plots a great tool for data exploration.
To learn about all the features and how they can be useful, check out the following tutorials:
suppressPackageStartupMessages( require(tidyverse) )
suppressPackageStartupMessages( require(easyalluvial) )

data = as_tibble(mtcars)
categoricals = c('cyl', 'vs', 'am', 'gear', 'carb')
numericals = c('mpg', 'cyl', 'disp', 'hp', 'drat', 'wt', 'qsec')

data = data %>%
  mutate_at( vars(categoricals), as.factor )
Continuous variables will be binned automatically as follows.
alluvial_wide( data = data, max_variables = 5, fill_by = 'first_variable' )
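To give a feel for what binning a continuous variable means here, the following is a minimal base R sketch. It is a simplified stand-in and not the package's own algorithm, which may transform or scale the data before cutting. The label vector `bin_labels` and the choice of five equal-width bins are assumptions made for illustration.

```r
# Simplified stand-in for automatic binning; NOT easyalluvial's own algorithm.
x <- mtcars$mpg

# Assumed labels for five ordered bins, from lowest to highest.
bin_labels <- c("LL", "ML", "M", "MH", "HH")

# cut() splits the range of x into 5 equal-width intervals
# and returns an ordered factor with one level per bin.
mpg_binned <- cut(x, breaks = 5, labels = bin_labels, ordered_result = TRUE)

# Count how many cars fall into each bin.
table(mpg_binned)
```

Once a numeric column is an ordered factor like this, it can be drawn as a stratum in an alluvial plot in the same way as any categorical variable.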
knitr::kable( head(quarterly_flights) )
| tailnum | carrier | origin | dest | qu | mean_arr_delay |
|---|---|---|---|---|---|
| N0EGMQ LGA BNA MQ | MQ | LGA | BNA | Q1 | on_time |
| N0EGMQ LGA BNA MQ | MQ | LGA | BNA | Q2 | on_time |
| N0EGMQ LGA BNA MQ | MQ | LGA | BNA | Q3 | on_time |
| N0EGMQ LGA BNA MQ | MQ | LGA | BNA | Q4 | on_time |
| N11150 EWR MCI EV | EV | EWR | MCI | Q1 | late |
| N11150 EWR MCI EV | EV | EWR | MCI | Q2 | late |
alluvial_long( quarterly_flights, key = qu, value = mean_arr_delay, id = tailnum, fill = carrier )
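The difference between the long and the wide layout can be sketched with a small toy example. The data frame `flights_long` below is hypothetical and only mimics the `tailnum`, `qu` and `mean_arr_delay` columns used above; `pivot_wider()` comes from 'tidyr', which is part of the tidyverse.

```r
library(tidyr)

# Hypothetical toy data in long format:
# one row per aircraft (id) and quarter (key).
flights_long <- data.frame(
  tailnum = rep(c("A1", "B2"), each = 4),
  qu = rep(c("Q1", "Q2", "Q3", "Q4"), times = 2),
  mean_arr_delay = c("on_time", "on_time", "late", "on_time",
                     "late", "late", "late", "on_time")
)

# Wide format: one row per aircraft, one column per quarter.
flights_wide <- pivot_wider(
  flights_long,
  names_from = qu,
  values_from = mean_arr_delay
)

flights_wide
```

In the long layout the quarter is a value in the `qu` column, which is why `alluvial_long()` needs `key`, `value` and `id` arguments. In the wide layout each quarter is its own column, which is the shape `alluvial_wide()` works with.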
vdiffr is now used to test plots and has been added as a suggested dependency
manip_bin_numerics() accepts c('median', 'mean', 'cuts', 'min_max') as the bin_labels argument, which will be converted to the bin labels
alluvial_long() no longer crashes when dataframes are grouped