ff — by Jens Oehlschlägel, 2 months ago

Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) into main memory, which is the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and the non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned) and single (4 byte float with NAs). For example, 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for the close-to-atomic types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types. ff has native C support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays). There is also an ffdf class not unlike data.frames, as well as import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options makes it possible to work with 'permanent' files as well as to create and remove 'temporary' ff files completely transparently to the user. On certain OS/filesystem combinations, creating the ff files works without notable delay thanks to sparse file allocation.
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for both ff and ram objects, and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This makes it possible to work interactively with selections of large datasets and to quickly modify selection criteria. Further high-performance enhancements can be made available upon request.
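
To make the 2-bit 'quad' idea from the ff description concrete, here is a minimal Python sketch of packing an 'A','T','G','C' sequence at four symbols per byte. The function names `pack_quad` and `unpack_quad` and the bit order are hypothetical choices for illustration; this is not ff's actual file layout or API.

```python
# Illustration only: 2-bit packing of a four-letter alphabet.
# The code table and bit order are assumptions, not ff's on-disk format.
CODES = {"A": 0, "T": 1, "G": 2, "C": 3}
LETTERS = "ATGC"


def pack_quad(seq):
    """Pack a string of A/T/G/C into bytes, four 2-bit codes per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        # Symbol i lives in byte i // 4, at bit offset 2 * (i % 4).
        out[i // 4] |= CODES[base] << (2 * (i % 4))
    return bytes(out)


def unpack_quad(data, n):
    """Recover the first n symbols from bytes produced by pack_quad."""
    return "".join(
        LETTERS[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(n)
    )


seq = "GATTACA"
packed = pack_quad(seq)
restored = unpack_quad(packed, len(seq))
```

Seven symbols fit into two bytes here, compared with seven bytes for one character per symbol. The length has to be kept separately because the last byte may be only partly used.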

KOR.addrlink — by Daniel Schürmann, 9 months ago

Matching Address Data to Reference Index

Matches a data set with semi-structured address data, e.g., street and house number as a concatenated string, wrongly spelled street names or non-existing house numbers to a reference index. The methods are specifically designed for German municipalities ('KOR'-community) and German address schemes.

SeedVigorIndex — by Tanuj Misra, 6 months ago

Seed Vigor Index

Seed vigor is defined as the sum total of those properties of the seed which determine the level of activity and performance of the seed or seed lot during germination and seedling emergence. Testing for vigor becomes more important for carryover seeds, especially if seeds were stored under unknown or unfavorable storage conditions. Seed vigor testing is also used as an indicator of the storage potential of a seed lot and in ranking seed lots of different qualities. The vigor index is calculated using the equation given by Ling et al. (2014).

mutualinf — by Rafael Fuentealba-Chaura, 10 months ago

Computation and Decomposition of the Mutual Information Index

The Mutual Information Index (M), introduced to the social science literature by Theil and Finizza (1971), is a multigroup segregation measure that is highly decomposable and that, according to Frankel and Volij (2011) and Mora and Ruiz-Castillo (2011), satisfies the Strong Unit Decomposability and Strong Group Decomposability properties. This package allows computing the total index value and decomposing it into its "between" and "within" terms. The "within" terms can be decomposed further into their contributions, either by group or by unit characteristics. The factors that produce each "within" term can also be displayed at the user's request. The results can be computed for a variable or sets of variables that define separate clusters.
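
The "between" and "within" split described above can be sketched in a few lines of Python. This assumes the standard mutual information formula over a units-by-groups count table, M = Σ p_ug · ln(p_ug / (p_u · p_g)), and a hypothetical `clusters` argument that assigns each unit to a cluster. The function names are illustrative and are not the mutualinf package's interface.

```python
import math


def mutual_information(counts):
    """M index for counts[u][g] = people in unit u who belong to group g."""
    total = sum(sum(row) for row in counts)
    unit_tot = [sum(row) for row in counts]
    group_tot = [sum(col) for col in zip(*counts)]
    m = 0.0
    for u, row in enumerate(counts):
        for g, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                # p_ug / (p_u * p_g) simplifies to n * total / (U_u * G_g)
                m += (n / total) * math.log(n * total / (unit_tot[u] * group_tot[g]))
    return m


def decompose(counts, clusters):
    """Split M into a between-cluster term and a weighted within-cluster term.

    clusters[u] is the (hypothetical) cluster label of unit u.
    """
    total = sum(sum(row) for row in counts)
    by_cluster = {}
    for row, label in zip(counts, clusters):
        by_cluster.setdefault(label, []).append(row)
    # Between: treat each cluster as one aggregated unit.
    aggregated = [[sum(col) for col in zip(*rows)] for rows in by_cluster.values()]
    between = mutual_information(aggregated)
    # Within: M inside each cluster, weighted by the cluster's population share.
    within = sum(
        sum(sum(r) for r in rows) / total * mutual_information(rows)
        for rows in by_cluster.values()
    )
    return between, within


counts = [[10, 0], [0, 10], [5, 5], [8, 2]]
between, within = decompose(counts, ["a", "a", "b", "b"])
```

In this example the two terms add up to `mutual_information(counts)`, which is the decomposability property the description refers to.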

hicp — by Sebastian Weinand, 4 months ago

Harmonised Index of Consumer Prices

The Harmonised Index of Consumer Prices (HICP) is the key economic figure to measure inflation in the euro area. The methodology underlying the HICP is documented in the HICP Methodological Manual (<https://ec.europa.eu/eurostat/web/products-manuals-and-guidelines/w/ks-gq-24-003>). Based on the manual, this package provides functions to access and work with HICP data from Eurostat's public database (<https://ec.europa.eu/eurostat/data/database>).

isni — by Hui Xie, 3 years ago

Index of Local Sensitivity to Nonignorability

The current version provides functions to compute, print and summarize the Index of Sensitivity to Nonignorability (ISNI) in the generalized linear model for independent data, and in the marginal multivariate Gaussian model and the mixed-effects models for continuous and binary longitudinal/clustered data. It allows for arbitrary patterns of missingness in the regression outcomes caused by dropout and/or intermittent missingness. One can compute the sensitivity index without estimating any nonignorable models or positing a specific magnitude of nonignorability. Thus ISNI provides a simple quantitative assessment of how robust the standard estimates, which assume data are missing at random, are with respect to the assumption of ignorability. A tutorial can be downloaded at <https://huixie.people.uic.edu/Research/ISNI_R_tutorial.pdf>. For more details, see Troxel, Ma and Heitjan (2004), Xie and Heitjan (2004), Ma, Troxel and Heitjan (2005), Xie (2008), Xie (2012) and Xie and Qian (2012).

REDI — by Alexia Grenouillat, a year ago

Robust Exponential Decreasing Index

Implementation of the Robust Exponential Decreasing Index (REDI), proposed in the article by Issa Moussa, Arthur Leroy et al. (2019) <https://bmjopensem.bmj.com/content/bmjosem/5/1/e000573.full.pdf>. The REDI represents a measure of cumulated workload, robust to missing data, providing control of the decreasing influence of workload over time. Various functions are provided to format data, compute REDI, and visualise results in a simple and convenient way.
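
As a rough illustration of "decreasing influence of workload over time" combined with robustness to missing data, the Python sketch below computes an exponentially weighted average that skips missing days and renormalises over the observed ones. The normalised form, the function name `redi` and the parameter `lam` are assumptions made for this example. The exact definition is in the cited article and may differ.

```python
import math


def redi(workloads, lam):
    """Exponentially weighted workload average (illustrative form only).

    workloads: daily values, oldest first; None marks a missing day.
    lam: decay rate (assumed name); larger values forget the past faster.
    """
    last = len(workloads) - 1
    num = 0.0
    den = 0.0
    for day, wl in enumerate(workloads):
        if wl is None:
            continue  # missing days get zero weight instead of being imputed
        weight = math.exp(-lam * (last - day))  # most recent day has weight 1
        num += weight * wl
        den += weight
    return num / den if den > 0 else float("nan")


value = redi([300, None, 450, 500, None, 400], lam=0.1)
```

With `lam = 0` every observed day counts equally, so the result is the plain mean of the observed values. As `lam` grows, the result approaches the most recent observed workload.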

compindexR — by Olgun Aydin, a year ago

Calculates Composite Index

The package uses the first-order sensitivity index to measure whether the weights assigned by the creator of the composite indicator match the actual importance of the variables. Moreover, the variance inflation factor is used to reduce the set of correlated variables. If the importance of a variable and its assigned weight differ, the script determines optimised weights that bring each variable's influence in line with its intended impact. If the optimised weights are unable to reflect the desired importance, the set of highly correlated variables is reduced, taking the variance inflation factor into account. The final outcome of the script is the calculated value of the composite indicator based on optimal weights and a reduced set of variables, and the linear ordering of the analysed objects.

fExtremes — by Paul J. Northrop, a year ago

Rmetrics - Modelling Extreme Events in Finance

Provides functions for analysing and modelling extreme events in financial time series. The topics include: (i) data pre-processing, (ii) exploratory data analysis, (iii) peak over threshold modelling, (iv) block maxima modelling, (v) estimation of VaR and CVaR, and (vi) the computation of the extreme index.
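
Topics (iii) and (iv) both start from a data-selection step, which the following Python sketch illustrates on a toy loss series. The helper names are hypothetical and this is not the fExtremes API. Fitting a distribution to the selected values, which is the actual modelling step, is deliberately left out.

```python
def block_maxima(series, block_size):
    """Maximum of each complete, non-overlapping block of block_size values."""
    return [
        max(series[i:i + block_size])
        for i in range(0, len(series) - block_size + 1, block_size)
    ]


def exceedances(series, threshold):
    """Amounts by which observations exceed the threshold (peaks over threshold)."""
    return [x - threshold for x in series if x > threshold]


losses = [1, 3, 2, 7, 4, 5, 9, 0, 6, 8]
maxima = block_maxima(losses, 3)   # trailing incomplete block is dropped
excess = exceedances(losses, 6)
```

Block maxima keep one value per period, while the threshold approach keeps every sufficiently large observation, so it usually retains more of the tail data.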

BI — by Marc Schwartz, 2 years ago

Blinding Assessment Indexes for Randomized, Controlled, Clinical Trials

Generate the James Blinding Index, as described in James et al. (1996) <https://pubmed.ncbi.nlm.nih.gov/8841652/>, and the Bang Blinding Index, as described in Bang et al. (2004) <https://pubmed.ncbi.nlm.nih.gov/15020033/>. These are measures to assess whether or not satisfactory blinding has been maintained in a randomized, controlled, clinical trial. These can be generated for trial subjects, research coordinators and principal investigators, based upon standardized questionnaires that have been administered, to assess whether or not they can correctly guess to which treatment arm (e.g. placebo or treatment) subjects were assigned at randomization.
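
For orientation, the per-arm Bang index can be sketched in Python under the assumption of a three-category questionnaire (guess treatment, guess placebo, don't know). In that setting the estimator reduces to (correct guesses − incorrect guesses) / all respondents in the arm. The function name is hypothetical, this is not the BI package's interface, and the formula should be checked against Bang et al. (2004) before use. The James index is not covered here.

```python
def bang_index(n_correct, n_incorrect, n_dont_know):
    """Bang blinding index for one arm (assumed three-category format).

    Ranges from -1 (all guesses wrong) to 1 (all guesses right);
    values near 0 are consistent with random guessing.
    """
    total = n_correct + n_incorrect + n_dont_know
    if total == 0:
        raise ValueError("arm has no respondents")
    return (n_correct - n_incorrect) / total


# Hypothetical treatment arm: 30 guessed "treatment", 10 "placebo", 10 "don't know".
bi_treatment = bang_index(30, 10, 10)
```

The index is computed separately for each arm, since blinding can fail in one arm while holding in the other.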