Tools to Quickly and Neatly Summarize Data

Data frame summaries, cross-tabulations, weight-enabled frequency tables and common univariate statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.


Version 0.8.1

Added vignettes. Improved handling of negative column indexing by the parsing function. Fixed issue with all-NA factors in dfSummary. Changed column for displaying "All NA's" for numerical vectors in dfSummary. Fixed issue with graphs in dfSummary. Removed accentuated characters from docs.

Version 0.8.0

Introducing graphs in dfSummary results. Improved alignment of numbers in both html and ascii tables. Cleanup in css and upgrade to Bootstrap 4 beta. Added rudimentary support for lapply() to be used with freq(), and improved support for by() and with() with descr(), freq(), and ctable(). Some backward compatibility notes: in dfSummary(), parameter name 'display.labels' now corresponds to 'labels.col', for consistency reasons. Also, see Notes for Version 0.6.9 about the 'file' parameter.

Version 0.7.0

(Release on GitHub only). Changes introduced in 0.6.9 have been further tested and enhanced. Improved alignment in cells having counts + proportions. Updated the vignette to reflect latest changes and added examples using the included datasets ('exams' and 'tobacco'). dfSummary()'s last column now includes counts and percentages of both valid and missing data. Internal changes: coding style has been reviewed to be more standard; Roxygen2 is now used to generate help files.

Version 0.6.9

(Release on GitHub only). Introduced ctable() for cross-tabulations. Added extended support for printing objects created using by() and/or with() - variable names, labels and by-groups are now displayed correctly. view() is now more than just a wrapper function build around print.summarytools(); it is the function to use when printing an object created using by(). Appending to summarytools-generated html files is now supported. Most pander options stored in summarytools objects can now be overridden when using print() or view(). freq() has an new parameter, 'order', allowing to order rows by count rather than values sorted alphabetically. Alignment of numbers in descr() observations table has been improved. An important change causing a minor break in backward compatibility: the file= parameter must now be used with print() or view(); its use with other functions is now deprecated. Environment variables keep track of all temporary html files created during session, and also make it possible to print lists of summarytools objects created using by().

Version 0.6.5

Fixed a problem with dfSummary() which arose when number of factors exceeded max.distinct.values. Improved the way dfSummary() reports frequencies on character variables. Fixed problems with outputs when using weights. Added hashtags to table headings to improve markdown integration. Added an option to generic print() function to suppress the footer in HTML outputs.

Version 0.6

Added Introduction vignette. Fixed markdown output that would not render strings such as . Improved multiline tables linefeeds. Introduced cleartmp() function to delete temporary HTML files used with print method "viewer" or "browser". Modifications to sample datasets. Bootstrap stylesheets now only include table-specific elements. Changed the way method="browser" sends file path to browser for better cross-platform compatibility. Improved results when using by(). Since version 0.5 was only made available through devtools::install_github(), please see changes for that version if updating from CRAN.

Version 0.5

Function descr() now also supports weights. Output from has been simplified. Other changes are transparent to the user (but make the internals more consistent across functions).

Version 0.4

Added cat() functions to fully support knitr's document generation. Also added sample datasets so that users can experiment using summarytools functions with them. freq() now supports weights.

Version 0.3

Another round of major changes.

  • Bringing in HTML table built with htmltools and viewable in RStudio's Viewer
  • function desc is renamed descr (mainly to avoid conflict with plyr's desc)
  • argument "echo" is deprecated; either display with pander or use as.table()
  • Returned objects are now of class "summarytools" and have several attributes that are used by print.summarytools
    • st.type : one of "freq", "descr" and "dfSummary"
    • date : date at which the function was called
    • & var.label : for 'freq', and also 'desc' when a single vector is used.
    • pander.args : 'style', 'justify', 'plain.ascii', 'split.table'
  • print.summarytools has argument "method" that can be one of "pander", "viewer", or "browser", the last two being used to display an HTML version of the output, using bootstrap's css (
  • rows indexing is "detected" and reported (function .parse.arg.x takes care of this)
  • rounding now only occurs at the printing stage

Version 0.2

Several major changes since version 0.1

  • 'unistats' is now called 'desc'.
  • 'frequencies' is now called 'freq'.
  • 'properties' is now called 'prop'.
  • shortcuts have been added to keep backward-compatibility.
  • 'desc' now accepts dataframes as first argument; factors and character columns will be ignored.
  • 'desc' can be transposed to suit one's preferences.
  • 'freq' just returns a matrix-table, not a list anymore.
  • in 'desc' and 'freq', no more argument 'display.label'. Those are displayed automatically when present.
  • rapportools is used instead of Hmisc for variable labels
  • function 'properties' was removed for now. Maybe reintegrated in a future update.

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.