Data frame summaries, cross-tabulations, weight-enabled frequency tables and common descriptive (univariate) statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.
descr() outputs into "tidy" tibbles
useTranslations() and triggers an Open File... dialog when no argument is supplied
define_keywords() allows defining translatable terms in a GUI and optionally saving the results to a CSV file (through a Save File... dialog)
byst() had to be dropped because of issues related to object names, so only stby() is accepted from now on
useTranslations() has been replaced by
cumul allows turning on or off cumulative proportions
order parameter: "names", "freq", and "levels" values now have their counterparts "-names" (or "names-"), "-freq" and "-levels"
rows has been added; it allows subsetting the output table either with a numeric vector, a character vector, or a single search string (regex)
No changes (re-submission of 0.9.1 to CRAN)
For users updating solely from CRAN, this is a major update. Many changes were introduced since version 0.8.8 (versions 0.8.9 and 0.9.0 were released solely on GitHub). Please refer to the README file, the two vignettes and the information below for all the details.
In this version:
stby(), a summarytools-specific version of by(), is introduced. It is highly recommended that you use it instead of by(); its syntax is identical and it greatly simplifies the printing of the generated objects
example() function to access them
st_options(lang='fr') gives access to French translations. Spanish ('es') translations are also available.
useTranslations() allows using custom translations; see the introductory vignette for details
In descr(), the weighting variable (when used) is automatically removed from the list of variables to analyze
In dfSummary(), images are processed using functions from the magick package, improving the general layout of the output tables
In descr(), the 'stats' parameter accepts values "common" and "fivenum"
In st_options(), setting multiple options at once is now possible; all options have their own parameter (the legacy way of setting options is still supported)
view() - refer to the print() method's documentation to learn more
Two somewhat backward-compatibility breaking changes:
Special thanks to Paul Feitsma for his numerous suggestions.
dfSummary() where reported percentages could exceed 100% under specific circumstances
dfSummary() when missing values were present along with whole numbers
descr() where group was not shown for the first group when omitting headings
In descr(), fixed a calculation error for the coefficient of variation (cv)
In ctable() HTML outputs, '<' and '>' are properly escaped when appearing in row or column names
In ctable(), argument 'useNA' now correctly accepts value "no"
dfSummary() histograms changed (from
dfSummary() with time objects
dfSummary() when "[1 other value]" was displayed; the actual value is now displayed instead
freq(): 'totals' and 'display.nas'
In descr(), Q1 and Q3 were added; also, the order in the 'stats' argument is now reflected in the output table
In view(), argument 'html.table.class' is now called 'table.classes' and its usage is simplified (please refer to the corresponding help files for details)
lapply() when used with
lapply() to be used with
Backward-compatibility notes: in dfSummary(), parameter name 'display.labels' has been changed to 'labels.col', for consistency reasons. Also, see Notes for Version 0.6.9 about the 'file' parameter.
Another GitHub-only release
dfSummary()'s last column now includes counts and percentages for both valid and missing data
In this GitHub-only release:
with(): variable names, labels and by-groups are now displayed correctly
view() is now more than just a wrapper function for the print() method; it is the function to use when printing an object created with
freq() has a new parameter, 'order', which allows ordering rows by count rather than by value
descr() observations table has been improved
An important change causing a minor break in backward compatibility:
the 'file' parameter must now be used with
use with other functions is now deprecated.
dfSummary() which arose when number of factor levels exceeded max.distinct.values
dfSummary() reports frequencies for character variables
print() method to suppress the footnote in HTML outputs
method = "browser" sends file path to browser for better cross-platform compatibility
For this GitHub-only release:
descr() now supports weights
what.is() has been simplified
cat() functions to fully support knitr's document generation. Also added
sample datasets so that users can experiment using summarytools functions with
freq() now supports weights.
Another round of major changes
descr() (mainly to avoid conflict with
freq(), and also
descr() when a single vector is used
print.summarytools() has argument 'method' that can be one of "pander", "viewer", or "browser", the last two being used to display an HTML version of the output, using Bootstrap's CSS (https://getbootstrap.com)
.parse.arg.x() takes care of this)
Several major changes since version 0.1
unistats() is now called
frequencies() is now called
properties() is now called
desc() now accepts data frames as first argument; factors and character columns will be ignored
desc() can be transposed to suit one's preferences
freq() just returns a matrix-table, not a list anymore
In freq(), no more argument 'display.label'; labels are displayed automatically when present
properties() was removed for now. May be reintegrated in a future version