Tools for Data Diagnosis, Exploration, Transformation

A collection of tools that support data diagnosis, exploration, and transformation. Data diagnostics provides information on, and visualization of, missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information on, and visualization of, the descriptive statistics of univariate variables, normality tests and outliers, the correlation between two variables, and the relationship between the target variable and a predictor. Data transformation supports binning to categorize continuous variables, imputation of missing values and outliers, and resolution of skewness. The package also creates automated reports that support these three tasks.
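As a rough illustration of the three tasks, the sketch below runs one or two functions from each on the built-in airquality data set. The choice of data set and columns is an assumption made for the example, and the calls follow the package's documented interface as best understood here, so argument details may differ between releases.

```r
library(dlookr)

# Diagnosis: one row per variable with type, missing-value and unique-value counts
diag <- diagnose(airquality)

# Exploration: descriptive statistics and a normality test for selected columns
desc <- describe(airquality, Ozone, Wind)
norm <- normality(airquality, Ozone)

# Transformation: fill missing Ozone values with the median
# (Temp is passed as the yvar argument; it is not used by the median method)
ozone_imputed <- imputate_na(airquality, Ozone, Temp, method = "median")

# Transformation: cut the continuous Wind variable into four quantile bins
wind_bins <- binning(airquality$Wind, nbins = 4, type = "quantile")
```

Printing diag, desc, or norm shows the resulting tables; calling summary() or plot() on ozone_imputed and wind_bins is the usual next step for inspecting a transformation.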


News

dlookr 0.3.2

  • plot.relate() supports hexbin plotting when both the target variable and the predictor are numeric.

  • Add a new function, get_column_info(), that shows the column information of a table in the DBMS.

  • diagnose() supports diagnosing columns of a table in the DBMS.

  • diagnose_category() supports diagnosing character columns of a table in the DBMS.

  • diagnose_numeric() supports diagnosing numeric columns of a table in the DBMS.

  • diagnose_outlier() supports diagnosing outliers in numeric columns of a table in the DBMS.

  • plot_outlier() supports plotting outliers in numeric columns of a table in the DBMS.

  • normality() supports normality tests for numeric columns of a table in the DBMS.

  • plot_normality() supports normality plots for numeric columns of a table in the DBMS.

  • correlate() supports computing the correlation coefficients of numeric columns of a table in the DBMS.

  • plot_correlate() supports plotting the correlation coefficients of numeric columns of a table in the DBMS.

  • describe() supports computing descriptive statistics of numeric columns of a table in the DBMS.

  • target_by() supports columns of a table in the DBMS.

  • Fix section 4.1.1 of the EDA report generated without a target variable.
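The DBMS support listed above can be sketched as follows. This is a minimal example under stated assumptions: it presumes a dlookr release that ships these DBMS methods (the 0.3.2 line announced here), an in-memory SQLite database reached through the DBI, RSQLite, and dplyr packages, and a hypothetical table name TB_MTCARS created from the built-in mtcars data. The in_database = FALSE argument asks dlookr to pull the rows into R before computing.

```r
library(dplyr)
library(dlookr)

# Assumption: an in-memory SQLite database stands in for a real DBMS
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table name; mtcars is copied in only for illustration
copy_to(con, mtcars, name = "TB_MTCARS", overwrite = TRUE)
tb <- tbl(con, "TB_MTCARS")

# Column information of the remote table
col_info <- get_column_info(tb)

# Diagnose and describe selected columns, computing in R memory
diag_db <- diagnose(tb, mpg, hp, wt, in_database = FALSE)
desc_db <- describe(tb, mpg, hp, in_database = FALSE)

DBI::dbDisconnect(con)
```

The same pattern should apply to the other functions in the list, each taking the remote table in place of a data frame.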

dlookr 0.3.1

  • Fix typographical errors in EDA Report headings (@hangtime79, #2).

  • plot_outlier() supports a col argument that sets the color used to fill the bars (@hangtime79, #3).

  • Remove the names from the numeric vector returned when index = TRUE in find_na(), find_outliers(), and find_skewness().
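To make the last item concrete, the base R sketch below shows the difference between a named and an unnamed index vector. It is not dlookr's implementation; the data frame and variable names are made up for the example, and the column search only mimics what a function such as find_na() reports with index = TRUE.

```r
# Hypothetical data frame with missing values in columns a and c
df <- data.frame(a = c(1, NA, 3), b = c("x", "y", "z"), c = c(NA, 2, 3))

# which() on a named logical vector keeps the column names on the result
named_idx <- which(vapply(df, anyNA, logical(1)))

# unname() drops them, leaving only the column positions
plain_idx <- unname(named_idx)

names(named_idx)  # "a" "c"
plain_idx         # 1 3
```

An unnamed vector of positions can be passed directly to df[, plain_idx] or compared with identical() without the names getting in the way.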
