Explainable Outlier Detection Through Decision Tree Conditioning
Outlier detection method that flags suspicious values within observations,
contrasting them against the normal values in a user-readable format, potentially
describing conditions within the data that make a given outlier more rare.
Full procedure is described in Cortes (2020)
Interactive Grammar of Graphics
An implementation of an interactive grammar of graphics, taking the best parts of 'ggplot2', combining them with the reactive framework of 'shiny' and drawing web graphics using 'vega'.
Nearest Neighbor Observation Imputation and Evaluation Tools
Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.
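As a rough illustration of the nearest neighbor idea described above, the following minimal Python sketch copies the response of the closest reference observation onto each target observation (k = 1). It is not the package's code or API: the function name `nearest_neighbor_impute`, the toy data, and the use of plain Euclidean distance on the raw predictors are all assumptions made for this example, whereas the methods listed above measure distance in a transformed space.

```python
import math


def nearest_neighbor_impute(reference_x, reference_y, target_x):
    """Give each target row the response of its closest reference row (k = 1)."""
    imputed = []
    for target in target_x:
        # Assumption: plain Euclidean distance on the raw predictors.
        distances = [math.dist(target, ref) for ref in reference_x]
        nearest = distances.index(min(distances))
        imputed.append(reference_y[nearest])
    return imputed


# Hypothetical toy data: predictors are known everywhere,
# the response is known only for the reference observations.
reference_x = [(0.0, 0.0), (10.0, 10.0), (5.0, 0.0)]
reference_y = [1.5, 9.0, 4.2]
target_x = [(1.0, 1.0), (9.0, 8.0)]

imputed = nearest_neighbor_impute(reference_x, reference_y, target_x)
print(imputed)  # [1.5, 9.0]
```

Keeping the minimum distance for each target would also give a simple way to spot targets that lie notably far from every reference observation.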
Agent-Based Bird Point Count Simulator
A highly scientific and utterly addictive
bird point count simulator
to test statistical assumptions, aid survey design,
and have fun while doing it (Solymos 2024).
Google's Compact Language Detector 2
Bindings to Google's C++ library Compact Language Detector 2 (see <https://github.com/cld2owners/cld2#readme> for more information). Probabilistically detects over 80 languages in plain text or HTML. For mixed-language input it returns the top three detected languages and their approximate proportion of the total classified text bytes (e.g. 80% English and 20% French out of 1000 bytes). There is also a 'cld3' package on CRAN which uses a neural network model instead.
Vector Generalized Linear and Additive Models
An implementation of about 6 major classes of
statistical regression models. The central algorithm is
Fisher scoring and iteratively reweighted least squares.
At the heart of this package are the vector generalized linear
and additive model (VGLM/VGAM) classes. VGLMs can be loosely
thought of as multivariate GLMs. VGAMs are data-driven
VGLMs that use smoothing. The book "Vector Generalized
Linear and Additive Models: With an Implementation in R"
(Yee, 2015)
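To make the Fisher scoring / iteratively reweighted least squares idea concrete, here is a minimal Python sketch for an ordinary one-predictor logistic regression. It is a plain GLM rather than a vector GLM, and it is not the package's implementation: the function name `irls_logistic` and the toy data are assumptions made for illustration only.

```python
import math


def irls_logistic(x, y, iterations=25):
    """Fit logit(P(y = 1)) = b0 + b1 * x by iteratively reweighted least squares."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the weighted normal equations (X'WX) beta = X'Wz.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi                   # linear predictor
            mu = 1.0 / (1.0 + math.exp(-eta))    # fitted probability
            w = mu * (1.0 - mu)                  # Fisher information weight
            z = eta + (yi - mu) / w              # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted least squares system in closed form.
        det = s_w * s_wxx - s_wx * s_wx
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
    return b0, b1


# Hypothetical toy data; the classes overlap, so a finite estimate exists.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 1, 0, 1, 1]
b0, b1 = irls_logistic(x, y)
print(b0, b1)
```

Each pass is a weighted least squares fit of the working response on the predictor. With the canonical logit link used here, that update coincides with a Fisher scoring step.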
Tools for Cleaning Rectangular Data
A dependency-free collection of simple functions for cleaning rectangular data. This package lets you detect, count, and replace values, or discard rows and columns, using a predicate function. In addition, it provides tools to check conditions and return informative error messages.
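The predicate-based approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the package's interface: the helper names `count_if`, `replace_if`, `discard_rows_if`, and `is_missing`, and the toy rows, are hypothetical.

```python
def count_if(rows, predicate):
    """Count the values, across all rows and columns, that satisfy the predicate."""
    return sum(1 for row in rows for value in row.values() if predicate(value))


def replace_if(rows, predicate, replacement):
    """Return new rows in which every matching value is swapped for the replacement."""
    return [
        {key: (replacement if predicate(value) else value) for key, value in row.items()}
        for row in rows
    ]


def discard_rows_if(rows, predicate):
    """Drop every row in which at least one value satisfies the predicate."""
    return [row for row in rows if not any(predicate(v) for v in row.values())]


def is_missing(value):
    """Example predicate: treat None and empty strings as missing."""
    return value is None or value == ""


# Hypothetical rectangular data as a list of records.
rows = [
    {"id": 1, "name": "Ada", "score": 9.5},
    {"id": 2, "name": "", "score": None},
    {"id": 3, "name": "Grace", "score": 8.0},
]

n_missing = count_if(rows, is_missing)
replaced = replace_if(rows, is_missing, "NA")
complete = discard_rows_if(rows, is_missing)
print(n_missing, replaced[1], [row["id"] for row in complete])
```

Passing the predicate as an argument keeps the counting, replacing, and discarding logic independent of what counts as a value to clean.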
Signal Detection Analysis
Exploring time series for signal detection. It is specifically designed
to detect possible outbreaks using infectious disease surveillance data
at the European Union / European Economic Area or country level.
Automatic detection tools used are presented in the paper
"Monitoring count time series in R: aberration detection in public health surveillance",
by Salmon (2016)
Event Detection Framework
Detect events in time-series data. Combines multiple well-known R packages like 'forecast' and 'neuralnet' to deliver an easily configurable tool for multivariate event detection.
Dimension Reduction for Outlier Detection
A dimension reduction technique for outlier detection. DOBIN: a Distance
based Outlier BasIs using Neighbours, constructs a set of basis vectors for outlier
detection. This is not an outlier detection method; rather it is a pre-processing
method for outlier detection. It brings outliers to the forefront using fewer basis
vectors (Kandanaarachchi, Hyndman 2020)