Found 12 packages in 0.01 seconds
Extension of `data.frame`
Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.
Advanced and Fast Data Transformation
A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.
Bay Area Bike Share Trips in 2014
Anonymised Bay Area bike share trip data for the year 2014. Also contains additional metadata on stations and weather.
Splitting-Coalescence-Estimation Method
We introduce improved methods for statistically assessing birth seasonality and intra-annual variation. The first method we propose is a new idea that uses a nonparametric clustering procedure to group individuals with similar time series data and estimate birth seasonality based on the clusters. One can use the function SCEM() to implement this method. The second method estimates input parameters for use with a previously-developed parametric approach (Tornero et al., 2013). The relevant code for this approach is makeFits_OLS(), while makeFits_initial() is the code to implement the same method but with given initial conditions for two parameters. The latter can be used to show the disadvantage of the existing approach. One can use the function makeFits() to generate parametric birth seasonality estimates using either initialization. Detailed description can be found here: Chazin Hannah, Soudeep Deb, Joshua Falk, and Arun Srinivasan. (2019) "New Statistical Approaches to Intra-Individual Isotopic Analysis and Modeling Birth Seasonality in Studies of Herd Animals."
PHATE - Potential of Heat-Diffusion for Affinity-Based Transition Embedding
PHATE is a tool for visualizing high-dimensional single-cell data with natural progressions or trajectories. PHATE uses a novel conceptual framework for learning and visualizing the manifold inherent to biological systems in which smooth transitions mark the progressions of cells from one state to another. To see how PHATE can be applied to single-cell RNA-seq datasets from hematopoietic stem cells, human embryonic stem cells, and bone marrow samples, check out our publication in Nature Biotechnology.
nose Package for R
The nose package consists of a collection of three functions for classifying sparseness in typical 2 x 2 data sets in which at least one cell has a zero count. These functions are based on the three widely applied summary measures for 2 x 2 categorical data, viz. Risk Difference (RD), Relative Risk (RR), and Odds Ratio (OR). This package helps to identify a suitable continuity correction for zero cells when a multi-centre analysis or a meta-analysis is carried out. Further, it can be used as a tool for sensitivity analysis when adding a continuity correction, and to identify the presence of Simpson's paradox.
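The three summary measures named above can be illustrated with a short sketch. This is not the nose package's interface: the function name, the argument layout, and the correction constant of 0.5 (a common convention, added to every cell only when a zero cell is present) are all assumptions made for illustration.

```python
def two_by_two_measures(a, b, c, d, correction=0.5):
    """Return (RD, RR, OR) for a 2 x 2 table.

    a, b: events and non-events in the treatment group
    c, d: events and non-events in the control group
    correction: hypothetical continuity correction, added to all four
                cells only when at least one cell is zero
    """
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + correction for x in (a, b, c, d))
    risk_treat = a / (a + b)
    risk_ctrl = c / (c + d)
    rd = risk_treat - risk_ctrl          # Risk Difference
    rr = risk_treat / risk_ctrl          # Relative Risk
    odds_ratio = (a * d) / (b * c)       # Odds Ratio
    return rd, rr, odds_ratio


# Hypothetical table with a zero cell: 0 events out of 20 vs 5 out of 20.
rd, rr, odds_ratio = two_by_two_measures(0, 20, 5, 15)
print(rd, rr, odds_ratio)
```

Rerunning the same table with different values of `correction` is one simple way to see how sensitive RR and OR are to the choice of continuity correction.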
Facilities for Simulating from ODE-Based Models
Facilities for running simulations from ordinary differential equation ('ODE') models, such as pharmacometrics and other compartmental models. A compilation manager translates the ODE model into C, compiles it, and dynamically loads the object code into R for improved computational efficiency. An event table object facilitates the specification of complex dosing regimens (optional) and sampling schedules. NB: The use of this package requires both C and Fortran compilers; for details on their use with R, please see Section 6.3, Appendix A, and Appendix D in the "R Installation and Administration" manual. The code is mostly released under the GPL; the 'VODE' and 'LSODA' solvers are in the public domain. Details are available in inst/COPYRIGHTS.
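The idea of driving an ODE model from a table of dosing and sampling events can be sketched in a few lines. The example below is an illustration only and does not reflect the package's interface or its solvers: it assumes a hypothetical one-compartment model with first-order elimination, uses a fixed-step fourth-order Runge-Kutta integrator rather than 'LSODA', and all names and parameter values are made up.

```python
import math


def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def simulate(doses, sample_times, k_el, volume, max_step=0.01):
    """Simulate a one-compartment model with bolus doses.

    doses: list of (time, amount) pairs
    sample_times: ascending times at which to record concentration
    Returns concentrations (amount / volume) in sample-time order.
    """
    def rhs(t, amount):
        return -k_el * amount  # first-order elimination

    # Merge doses and samples into one time-ordered event list; at equal
    # times "dose" sorts before "sample", so the dose is applied first.
    events = [(t, "dose", amt) for t, amt in doses]
    events += [(t, "sample", None) for t in sample_times]
    events.sort(key=lambda e: (e[0], e[1]))

    t, amount, out = 0.0, 0.0, []
    for t_event, kind, value in events:
        span = t_event - t
        n_steps = max(1, math.ceil(span / max_step))
        h = span / n_steps
        for _ in range(n_steps):
            amount = rk4_step(rhs, t, amount, h)
            t += h
        t = t_event
        if kind == "dose":
            amount += value
        else:
            out.append(amount / volume)
    return out


# Hypothetical regimen: 100 units at 0 h and 12 h, sampled at 6 h and 18 h.
conc = simulate([(0.0, 100.0), (12.0, 100.0)], [6.0, 18.0], k_el=0.1, volume=10.0)
print(conc)
```

For this linear model the result can be checked against the closed-form solution, a sum of decaying exponentials, which makes it a convenient test case before moving to models without an analytic answer.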
Compose Interoperable Analysis Pipelines & Put Them in Production
Enables data scientists to compose pipelines of analysis which consist of data manipulation, exploratory analysis & reporting, as well as modeling steps. Data scientists can use tools of their choice through an R interface, and compose interoperable pipelines between R, Spark, and Python. Credits to Mu Sigma for supporting the development of the package. Note: to enable pipelines involving Spark tasks, the package uses the 'SparkR' package. The SparkR package needs to be installed to use Spark as an engine within a pipeline. SparkR is distributed natively with Apache Spark and is not distributed on CRAN. The SparkR version needs to directly map to the Spark version (hence the native distribution), and care needs to be taken to ensure that this is configured properly. To install SparkR from GitHub, run the following command if you know the Spark version: 'devtools::install_github('apache/[email protected]', subdir='R/pkg')'. The other option is to install SparkR by running the following terminal commands if Spark has already been installed: '$ export SPARK_HOME=/path/to/spark/directory && cd $SPARK_HOME/R/lib/SparkR/ && R -e "devtools::install('.')"'.
Seed Germination Indices and Curve Fitting
Provides functions to compute various germination indices such as germinability, median germination time, mean germination time, mean germination rate, speed of germination, Timson's index, germination value, coefficient of uniformity of germination, uncertainty of germination process, synchrony of germination, etc., from germination count data. Includes functions for fitting cumulative seed germination curves using the four-parameter Hill function and computation of associated parameters. See the vignette for more, including the full list of citations for the methods implemented.
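A few of the indices listed above reduce to simple arithmetic on interval counts, as the sketch below shows. It is not the package's code: the function names and the data are hypothetical, and the Hill function is written in one common four-parameter form, y0 + a * x^b / (c^b + x^b), which may differ from the parameterization the package uses.

```python
def germinability(counts, total_seeds):
    """Percentage of seeds that germinated over the whole test."""
    return 100.0 * sum(counts) / total_seeds


def mean_germination_time(counts, times):
    """Count-weighted mean of the observation times."""
    return sum(n * t for n, t in zip(counts, times)) / sum(counts)


def mean_germination_rate(counts, times):
    """Reciprocal of the mean germination time."""
    return 1.0 / mean_germination_time(counts, times)


def hill4(x, a, b, c, y0):
    """Four-parameter Hill function: y0 + a * x**b / (c**b + x**b).

    a: asymptote above y0, b: shape, c: time to reach half of a, y0: offset.
    """
    return y0 + a * x**b / (c**b + x**b)


# Hypothetical data: seeds newly germinated on days 1 to 5, 50 seeds sown.
times = [1, 2, 3, 4, 5]
counts = [0, 5, 20, 15, 5]

print(germinability(counts, 50))
print(mean_germination_time(counts, times))
print(hill4(3.0, a=90.0, b=4.0, c=3.0, y0=0.0))
```

In this form the parameter `c` is the time at which the curve reaches half of its asymptote `a` above the baseline, which is why it is often read as a median-time estimate once the curve has been fitted.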
Time Series Factor Models for Asset Returns
Supports teaching methods of estimating and testing time series factor models for use in robust portfolio construction and analysis. Unique in providing not only classical least squares, but also modern robust model fitting methods which are not much influenced by outliers. Includes returns and risk decompositions, with user choice of standard deviation, value-at-risk, and expected shortfall risk measures. "Robust Statistics: Theory and Methods (with R)", R. A. Maronna, R. D. Martin, V. J. Yohai, M. Salibian-Barrera (2019)
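The classical least-squares case can be illustrated with a single-factor model and a variance-based risk decomposition. This sketch is not the package's interface: the function names and the return series are hypothetical, only ordinary least squares is shown (none of the robust fitting methods), and variance is used in place of the standard deviation, value-at-risk, and expected shortfall measures mentioned above.

```python
from statistics import mean


def fit_single_factor(asset, factor):
    """Ordinary least squares fit of asset = alpha + beta * factor + residual."""
    f_bar, r_bar = mean(factor), mean(asset)
    sxx = sum((f - f_bar) ** 2 for f in factor)
    sxy = sum((f - f_bar) * (r - r_bar) for f, r in zip(factor, asset))
    beta = sxy / sxx
    alpha = r_bar - beta * f_bar
    resid = [r - alpha - beta * f for f, r in zip(factor, asset)]
    return alpha, beta, resid


def variance(xs):
    """Sample variance with an n - 1 denominator."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


# Hypothetical period returns for one factor and one asset.
factor = [0.01, -0.02, 0.03, 0.00, 0.02]
asset = [0.015, -0.025, 0.040, 0.002, 0.020]

alpha, beta, resid = fit_single_factor(asset, factor)

# OLS residuals are uncorrelated with the factor in-sample, so the asset
# variance splits into a factor part and a residual part (all three
# variances use the same n - 1 denominator).
factor_var = beta ** 2 * variance(factor)
resid_var = variance(resid)
total_var = variance(asset)
print(beta, factor_var / total_var, resid_var / total_var)
```

A single extreme return can move `beta` substantially in this least-squares fit, which is the motivation for the robust alternatives the description refers to.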