Generate Data Summary in a Tidy Format

Functions that simplify the process of generating print-ready data summaries using 'dplyr' syntax.


Hadley's dplyr provides a grammar for talking about data manipulation, and another of his packages, tidyr, provides a mindset for thinking about data. Together, these two tools make data manipulation a lot easier. ezsummary packages up some commonly used dplyr and tidyr steps for data summarization to save you some typing. It also comes with some table decoration tools that let you pipe the results directly into a table-generating function such as knitr::kable() for rendering.

For example, if you use only dplyr and tidyr to generate a statistical summary table by group, you need to go through the following steps.

library(dplyr)
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
library(knitr)  # provides kable()
 
mtcars %>%
  select(cyl, mpg, wt, hp) %>%
  group_by(cyl) %>%
  summarize_each(funs(mean, sd)) %>%
  gather(variable, value, -cyl) %>%
  mutate(value = round(value, 3)) %>%
  separate(variable, into = c("variable", "analysis")) %>%
  spread(analysis, value) %>%
  mutate(variable = factor(variable, levels = c("mpg", "wt", "hp"))) %>%
  arrange(variable, cyl) %>%
  kable()
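
| cyl | variable | mean    | sd     |
|-----|----------|---------|--------|
| 4   | mpg      | 26.664  | 4.510  |
| 6   | mpg      | 19.743  | 1.454  |
| 8   | mpg      | 15.100  | 2.560  |
| 4   | wt       | 2.286   | 0.570  |
| 6   | wt       | 3.117   | 0.356  |
| 8   | wt       | 3.999   | 0.759  |
| 4   | hp       | 82.636  | 20.935 |
| 6   | hp       | 122.286 | 24.260 |
| 8   | hp       | 209.214 | 50.977 |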
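
Note that summarize_each() and funs() have been deprecated in later dplyr releases, and gather()/spread() have been superseded in tidyr. As a rough sketch only, assuming dplyr 1.0.0 or later and tidyr 1.0.0 or later, the same table could probably be built with across(), pivot_longer() and pivot_wider(). The object name summary_tbl is just an illustrative choice and is not part of the original example.

```r
library(dplyr)
library(tidyr)

# Sketch assuming dplyr >= 1.0.0 and tidyr >= 1.0.0
summary_tbl <- mtcars %>%
  select(cyl, mpg, wt, hp) %>%
  group_by(cyl) %>%
  # Produces columns named mpg_mean, mpg_sd, wt_mean, ...
  summarise(across(everything(), list(mean = mean, sd = sd)),
            .groups = "drop") %>%
  # Split "mpg_mean" into variable = "mpg" and analysis = "mean"
  pivot_longer(-cyl, names_to = c("variable", "analysis"),
               names_sep = "_") %>%
  mutate(value = round(value, 3)) %>%
  # One column per analysis (mean, sd)
  pivot_wider(names_from = analysis, values_from = value) %>%
  mutate(variable = factor(variable, levels = c("mpg", "wt", "hp"))) %>%
  arrange(variable, cyl)

print(summary_tbl)
```

Piping summary_tbl into kable() should then give the same layout as the table above.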

For people who are familiar with the "tidyverse", I'm sure the code above is very straightforward. However, it's a bit annoying to type it again and again. With ezsummary, you don't need to think too much about it. You can just type:

library(ezsummary)
 
mtcars %>%
  select(cyl, mpg, wt, hp) %>%
  group_by(cyl) %>%
  ezsummary() %>%
  kable()
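
| cyl | variable | mean    | sd     |
|-----|----------|---------|--------|
| 4   | mpg      | 26.664  | 4.510  |
| 6   | mpg      | 19.743  | 1.454  |
| 8   | mpg      | 15.100  | 2.560  |
| 4   | wt       | 2.286   | 0.570  |
| 6   | wt       | 3.117   | 0.356  |
| 8   | wt       | 3.999   | 0.759  |
| 4   | hp       | 82.636  | 20.935 |
| 6   | hp       | 122.286 | 24.260 |
| 8   | hp       | 209.214 | 50.977 |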

To install the released version from CRAN:

    install.packages("ezsummary")

Or install the development version from GitHub:

    install.packages("devtools")
    devtools::install_github("haozhu233/ezsummary")

Here is another quick demo of how to use this package. For detailed documentation, please check the package vignette.

library(dplyr)
library(ezsummary)
library(knitr)  # provides kable()

mtcars %>%
  # q: quantitative/continuous variables; c: categorical variables
  var_types("qcqqqqqcccc") %>%
  group_by(am) %>%
  ezsummary(flavor = "wide", unit_markup = "[. (.)]",
            digits = 1, p_type = "percent") %>%
  # In mtcars, am = 0 is automatic and am = 1 is manual
  kable(col.names = c("variable", "Automatic", "Manual"))

| variable | Automatic     | Manual       |
|----------|---------------|--------------|
| mpg      | 17.1 (3.8)    | 24.4 (6.2)   |
| cyl_4    | 3 (15.8%)     | 8 (61.5%)    |
| cyl_6    | 4 (21.1%)     | 3 (23.1%)    |
| cyl_8    | 12 (63.2%)    | 2 (15.4%)    |
| disp     | 290.4 (110.2) | 143.5 (87.2) |
| hp       | 160.3 (53.9)  | 126.8 (84.1) |
| drat     | 3.3 (0.4)     | 4 (0.4)      |
| wt       | 3.8 (0.8)     | 2.4 (0.6)    |
| qsec     | 18.2 (1.8)    | 17.4 (1.8)   |
| vs_0     | 12 (63.2%)    | 6 (46.2%)    |
| vs_1     | 7 (36.8%)     | 7 (53.8%)    |
| gear_3   | 15 (78.9%)    | 0 (0)        |
| gear_4   | 4 (21.1%)     | 8 (61.5%)    |
| gear_5   | 0 (0)         | 5 (38.5%)    |
| carb_1   | 3 (15.8%)     | 4 (30.8%)    |
| carb_2   | 6 (31.6%)     | 4 (30.8%)    |
| carb_3   | 3 (15.8%)     | 0 (0)        |
| carb_4   | 7 (36.8%)     | 3 (23.1%)    |
| carb_6   | 0 (0)         | 1 (7.7%)     |
| carb_8   | 0 (0)         | 1 (7.7%)     |
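
To see where the "n (p%)" and "mean (sd)" cells come from, a few of them can be cross-checked with base R alone. This is only a sanity-check sketch, not a description of how ezsummary computes things internally, and the object names are illustrative.

```r
# Counts of cyl within each am group (columns: am = 0, am = 1)
counts <- table(cyl = mtcars$cyl, am = mtcars$am)

# Column-wise proportions: decimal form, then percent form
decimal <- round(prop.table(counts, margin = 2), 3)
percent <- round(100 * prop.table(counts, margin = 2), 1)

# Mean and sd of mpg within each am group, rounded to 1 digit
mpg_mean <- round(tapply(mtcars$mpg, mtcars$am, mean), 1)
mpg_sd   <- round(tapply(mtcars$mpg, mtcars$am, sd), 1)

print(counts)
print(percent)
print(rbind(mean = mpg_mean, sd = mpg_sd))
```

For instance, counts["4", "0"] and percent["4", "0"] correspond to the cyl_4 cell in the first data column (the am = 0 group), and mpg_mean and mpg_sd correspond to the mpg row.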

If you find any issues, please feel free to report them in the issue tracker on GitHub: https://github.com/haozhu233/simple.summary/issues

Thanks for using this package!

News

It has been almost 8 months since ezsummary 0.1.9 was released on CRAN. I hope this is a good time, if not too late, for a major update. In this new version, I introduced a few attractive features by completely recoding some key functions in the package. In most cases, the output of ezsummary() will look the same as in the old version. However, there are a few cases where the new version behaves slightly differently from the old one (mostly in column naming). I'm sorry for any inconvenience caused by this update.

Here is a list of new features introduced by this update:

  • Two shorthand functions, ezsummary_q() and ezsummary_c(), were introduced for ezsummary_quantitative() and ezsummary_categorical().
  • ezsummary() now takes ... as options, and these options will be passed to ezsummary_q() and ezsummary_c().
  • You can now define customized analyses in ezsummary() and ezsummary_q() to produce any analyses you want. Please see ******* for details.
  • When you want ezsummary() to analyze both quantitative and categorical variables, you now have two output modes: ez (default) and details. Mode ez generates an integrated table like ezsummary 0.1.9 did, while mode details generates a list of two tables, one for quantitative and one for categorical variables. Mode details allows you to run a different number of analyses for the two types of data.
  • Option round.N will be deprecated. Use digits instead.
  • You now have the option to select rounding methods from round, signif, ceiling and floor.
  • In categorical analyses, option P (for percentages) will be deprecated. Instead, you can now select the type of proportion output by setting p_type to either "percent" or "decimal". You can no longer display both percent and decimal outputs, because doing so is not very meaningful.
  • There is a new switch called total to display counts of records including NA.
  • There is a new switch called missing to display counts of missing records.
  • A new option called fill is available when you set the flavor to wide. The value of fill is passed to tidyr::spread() and determines what fills the NA slots generated by the "spread" step.
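
The four rounding methods named in the list above are ordinary base R functions. The sketch below only shows how they differ on a single illustrative value; how ezsummary combines them with digits internally is not covered here.

```r
x <- 26.6636  # illustrative value only

rounded <- c(
  round   = round(x, digits = 1),   # nearest value with 1 decimal place
  signif  = signif(x, digits = 1),  # 1 significant digit
  ceiling = ceiling(x),             # smallest integer not less than x
  floor   = floor(x)                # largest integer not greater than x
)

print(rounded)
```

Note that in base R, ceiling() and floor() always return whole numbers, while round() and signif() take a digits argument.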

0.2.1 by Hao Zhu, 9 months ago


https://github.com/haozhu233/ezsummary


Report a bug at https://github.com/haozhu233/ezsummary/issues


Browse source code at https://github.com/cran/ezsummary


Authors: Hao Zhu [aut, cre], Thomas Travison [ctb], Timothy Tsai [ctb], Akhmed Umyarov [ctb]




MIT + file LICENSE license


Imports dplyr, tidyr

Suggests testthat, knitr, rmarkdown

