
Found 1120 packages in 0.03 seconds

text — by Oscar Kjell, 2 months ago

Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning

Link R with Transformers from Hugging Face to transform text variables into word embeddings. The word embeddings can then be used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions. For more information see <https://www.r-text.org>.

cpm — by Gordon J. Ross, 6 years ago

Sequential and Batch Change Detection Using Parametric and Nonparametric Methods

Sequential and batch change detection for univariate data streams, using the change point model framework. Functions are provided to allow nonparametric distribution-free change detection in the mean, variance, or general distribution of a given sequence of observations. Parametric change detection methods are also provided for Gaussian, Bernoulli and Exponential sequences. Both the batch (Phase I) and sequential (Phase II) settings are supported, and the sequences may contain either a single or multiple change points. A full description of this package is available in Ross, G.J. (2015), "Parametric and nonparametric sequential change detection in R", available at <https://www.jstatsoft.org/article/view/v066i03>.
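The batch (Phase I) idea of scanning every possible split of a sequence with a two-sample statistic can be sketched as follows. This is a minimal Python illustration, not the package's R interface: the function name `best_split`, the `min_seg` parameter, and the choice of a pooled-variance t-like statistic are assumptions made for this example, and no significance threshold is applied.

```python
from math import sqrt
from statistics import mean, variance


def best_split(xs, min_seg=2):
    """Return (k, stat): the split index k that maximizes a two-sample
    t-like statistic between xs[:k] and xs[k:], and the statistic itself.

    Hypothetical helper for illustration; a real detector would compare
    the maximum against a threshold before declaring a change.
    """
    n = len(xs)
    best_k, best_stat = None, 0.0
    for k in range(min_seg, n - min_seg + 1):
        left, right = xs[:k], xs[k:]
        # Pooled variance of the two segments.
        pooled = ((k - 1) * variance(left) + (n - k - 1) * variance(right)) / (n - 2)
        if pooled == 0:
            continue  # statistic undefined for constant segments
        stat = abs(mean(left) - mean(right)) / sqrt(pooled * (1 / k + 1 / (n - k)))
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# A sequence whose mean shifts after the fifth observation.
series = [0.1, -0.2, 0.0, 0.2, -0.1, 5.1, 4.9, 5.0, 5.2, 4.8]
k, stat = best_split(series)
print(k)
```

Here `k` is the number of observations before the estimated change.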

yaImpute — by Jeffrey S. Evans, 2 years ago

Nearest Neighbor Observation Imputation and Evaluation Tools

Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.
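The basic idea of nearest neighbor imputation, together with reporting how far each target sits from its chosen reference, can be sketched in a few lines. This is a minimal Python illustration using plain Euclidean distance; the function name `impute_nearest` and the data layout are assumptions for this example and do not reflect the package's R functions or its canonical-correlation or random-forest distance measures.

```python
from math import dist


def impute_nearest(references, targets):
    """For each target feature vector, copy the response of the closest
    reference observation and report the distance to it.

    references: list of (features, response) pairs
    targets:    list of feature tuples with no observed response
    Hypothetical helper for illustration only.
    """
    imputed = []
    for x in targets:
        ref_x, ref_y = min(references, key=lambda ref: dist(ref[0], x))
        imputed.append((ref_y, dist(ref_x, x)))
    return imputed


references = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
targets = [(1.0, 1.0), (9.0, 8.0)]
for response, distance in impute_nearest(references, targets):
    print(response, round(distance, 3))
```

A large reported distance is a simple signal that a target lies far from every reference observation, so its imputed value deserves less trust.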

cld2 — by Jeroen Ooms, a year ago

Google's Compact Language Detector 2

Bindings to Google's C++ library Compact Language Detector 2 (see <https://github.com/cld2owners/cld2#readme> for more information). Probabilistically detects over 80 languages in plain text or HTML. For mixed-language input it returns the top three detected languages and their approximate proportion of the total classified text bytes (e.g. 80% English and 20% French out of 1000 bytes). There is also a 'cld3' package on CRAN which uses a neural network model instead.

ggvis — by Hadley Wickham, 2 months ago

Interactive Grammar of Graphics

An implementation of an interactive grammar of graphics, taking the best parts of 'ggplot2', combining them with the reactive framework of 'shiny' and drawing web graphics using 'vega'.

arkhe — by Nicolas Frerebeau, a year ago

Tools for Cleaning Rectangular Data

A dependency-free collection of simple functions for cleaning rectangular data. This package allows the user to detect, count and replace values or discard rows/columns using a predicate function. In addition, it provides tools to check conditions and return informative error messages.
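Cleaning rectangular data with a predicate function can be sketched as follows. This is a minimal Python illustration on lists of rows; the function names `count_matches`, `discard_rows`, and `discard_columns` are hypothetical and are not the package's R API.

```python
def count_matches(rows, predicate):
    """Count, per row, how many values satisfy the predicate."""
    return [sum(1 for value in row if predicate(value)) for row in rows]


def discard_rows(rows, predicate, how=any):
    """Drop rows where any (or, with how=all, all) values satisfy the predicate."""
    return [row for row in rows if not how(predicate(value) for value in row)]


def discard_columns(rows, predicate, how=any):
    """Drop columns where any (or all) values satisfy the predicate."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns)
            if not how(predicate(value) for value in col)]
    return [[row[i] for i in keep] for row in rows]


def is_missing(value):
    return value is None


table = [[1, 2, None],
         [3, 4, 5],
         [6, None, None]]

print(count_matches(table, is_missing))
print(discard_rows(table, is_missing))
print(discard_columns(table, is_missing))
```

Passing the predicate as an argument keeps the same three helpers usable for missing values, zeros, infinities, or any other condition.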

univOutl — by Marcello D'Orazio, 2 months ago

Detection of Univariate Outliers

Provides well-known techniques for detecting univariate outliers. Methods for handling skewed distributions are included. The Hidiroglou-Berthelot (1986) method for detecting outliers in ratios of historical data is also implemented. When available, survey weights can be incorporated in the detection process.
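One well-known technique of this kind is the boxplot rule, which flags values lying more than k interquartile ranges outside the quartiles. The sketch below is a minimal Python illustration, not the package's R code; the function name `boxplot_outliers`, the default `k=1.5`, and the "inclusive" quantile method are assumptions for this example. It does not cover skewness adjustments, the Hidiroglou-Berthelot method, or survey weights.

```python
from statistics import quantiles


def boxplot_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; quartiles use the 'inclusive' method of
    statistics.quantiles, and other quantile definitions may give
    slightly different fences.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


sample = [10, 12, 11, 13, 12, 11, 95]
print(boxplot_outliers(sample))  # [95]
```

Because the fences are symmetric around the quartiles, this plain rule tends to over-flag the long tail of skewed data, which is the case that skewness-aware methods are meant to handle.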

difNLR — by Adela Hladka, 5 months ago

DIF and DDF Detection by Non-Linear Regression Models

Detection of differential item functioning (DIF) among dichotomously scored items and differential distractor functioning (DDF) among unscored items with non-linear regression procedures based on generalized logistic regression models (Hladka & Martinkova, 2020).

bSims — by Peter Solymos, 9 months ago

Agent-Based Bird Point Count Simulator

A highly scientific and utterly addictive bird point count simulator to test statistical assumptions, aid survey design, and have fun while doing it (Solymos 2024). The simulations follow time-removal and distance sampling models based on Matsuoka et al. (2012), Solymos et al. (2013), and Solymos et al. (2018), and sound attenuation experiments by Yip et al. (2017).

EpiSignalDetection — by Lore Merdrignac, a month ago

Signal Detection Analysis

Exploring time series for signal detection. It is specifically designed to detect possible outbreaks using infectious disease surveillance data at the European Union / European Economic Area or country level. The automatic detection tools used are presented in the paper "Monitoring count time series in R: aberration detection in public health surveillance", by Salmon (2016). The package includes: (1) a Signal Detection tool, an interactive 'shiny' application in which the user can import external data and perform basic signal detection analyses; and (2) an automated report in HTML format, presenting the results of the time series analysis in tables and graphs. This report can also be stratified by population characteristics (see 'Population' variable). This project was funded by the European Centre for Disease Prevention and Control.