EBGM Disproportionality Scores for Adverse Event Data Mining

An implementation of DuMouchel's (1999) Bayesian data mining method for the market basket problem. Calculates Empirical Bayes Geometric Mean (EBGM) and quantile scores from the posterior distribution using the Gamma-Poisson Shrinker (GPS) model to find unusually large cell counts in large, sparse contingency tables. Can be used to find unusually high reporting rates of adverse events associated with products. In general, can be used to mine any database where the co-occurrence of two variables or items is of interest. Also calculates relative and proportional reporting ratios. Builds on the work of the 'PhViD' package, from which much of the code is derived. Some of the added features include stratification to adjust for confounding variables and data squashing to improve computational efficiency. Now includes an implementation of the EM algorithm for hyperparameter estimation loosely derived from the 'mederrRank' package.
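
As a rough illustration of the simpler disproportionality measures mentioned above, the following is a minimal Python sketch of the relative reporting ratio (RR) and proportional reporting ratio (PRR) for a single product-event cell of an unstratified contingency table. It is not the package's R code: the function and argument names are hypothetical, and the expected count assumes independence of rows and columns. The infinite PRR on a zero denominator mirrors the convention noted under v0.3.0 below.

```python
import math


def expected_count(row_total, col_total, grand_total):
    """Expected cell count under independence (unstratified case)."""
    return row_total * col_total / grand_total


def relative_reporting_ratio(n, row_total, col_total, grand_total):
    """RR = observed count / expected count."""
    return n / expected_count(row_total, col_total, grand_total)


def proportional_reporting_ratio(n, row_total, col_total, grand_total):
    """PRR = event rate for this product / event rate for all other products.

    Assumes n > 0. Returns infinity when no other product reports the
    event, i.e. when the denominator rate is zero.
    """
    rate_this_product = n / row_total
    other_event_reports = col_total - n          # event reports for other products
    other_product_reports = grand_total - row_total
    if other_event_reports == 0:
        return math.inf
    rate_other_products = other_event_reports / other_product_reports
    return rate_this_product / rate_other_products


# Hypothetical counts: 20 reports of the pair, 100 reports for the product,
# 200 reports of the event, 10,000 reports in total.
print(relative_reporting_ratio(20, 100, 200, 10_000))
print(proportional_reporting_ratio(20, 100, 200, 10_000))
```

Both ratios become unstable when counts are small, which is the motivation for shrinking them toward the null with the EBGM score.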


openEBGM v0.8.2

  • Adjusted calculation for expected counts using suggestion from Piotr Świnarski. Previously, calculation failed when marginal counts became too large for integer multiplication.
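
The entry above does not say how the calculation was adjusted. As a hedged illustration of the failure mode only, the Python sketch below simulates a 32-bit integer limit (the size of R's integer type) and shows one common remedy, which is to do the arithmetic in floating point. The function names are hypothetical and this is not the package's code.

```python
INT32_MAX = 2**31 - 1  # largest value a 32-bit signed integer can hold


def expected_count_naive(row_total, col_total, grand_total):
    """Multiply the marginal counts as integers first.

    Python integers never overflow, so the 32-bit limit is simulated
    here by raising an error once the product exceeds INT32_MAX.
    """
    product = row_total * col_total
    if product > INT32_MAX:
        raise OverflowError("marginal product exceeds the 32-bit integer range")
    return product / grand_total


def expected_count_float(row_total, col_total, grand_total):
    """Convert to floating point before multiplying (one possible fix)."""
    return float(row_total) * float(col_total) / float(grand_total)


# Hypothetical marginal counts whose product (5e9) exceeds INT32_MAX.
print(expected_count_float(100_000, 50_000, 1_000_000))
```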

openEBGM v0.8.1

  • Corrected unit test failures for processRaw() resulting from base R changes to the sample() function.
  • Added DEoptim::DEoptim() example to hyperparameter estimation vignette.

openEBGM v0.8.0

  • processRaw() now lists all strata when stratification is used.
  • Added argument 'list_ids' to processRaw().

openEBGM v0.7.0

  • Added the autoSquash() function to automate data squashing.
  • Changed exit condition for while loop in hyperEM(). hyperEM() now throws an error if the number of "stuck" or repeated estimates of theta exceeds 20 when using 'method = "nlminb"'.
  • Changed upper limit from 1 to 0.999 in hidden functions .updateThetaLL() and .updateThetaLLD(), which are called by hyperEM().

openEBGM v0.6.0

  • Changed 'keep_bins' formal argument in squashData() to 'keep_pts' for added flexibility.
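
To make the idea of data squashing concrete, here is a simplified Python sketch, not the package's implementation. It assumes that all input points share the same observed count, that points are sorted by expected count and averaged within fixed-size bins, and that the points with the largest expected counts are left unsquashed. Only the name `keep_pts` comes from the changelog; `bin_size`, the function name, and the handling of a partial final bin are assumptions made for illustration.

```python
def squash_expected(expected, bin_size=50, keep_pts=100):
    """Squash expected counts for points that share one observed count.

    Returns a list of (expected_count, weight) pairs. Squashed bins carry
    a weight equal to the number of points they replace; unsquashed
    points carry a weight of 1.
    """
    ordered = sorted(expected)
    split = max(len(ordered) - keep_pts, 0)
    to_squash, kept = ordered[:split], ordered[split:]

    squashed = []
    for start in range(0, len(to_squash), bin_size):
        chunk = to_squash[start:start + bin_size]
        # Assumption: a short final chunk becomes its own smaller bin.
        squashed.append((sum(chunk) / len(chunk), len(chunk)))

    # The keep_pts largest expected counts are kept as individual points.
    squashed.extend((e, 1) for e in kept)
    return squashed


# Ten hypothetical expected counts, squashed in bins of 4, keeping the top 2.
points = [float(i) for i in range(1, 11)]
print(squash_expected(points, bin_size=4, keep_pts=2))
```

The weights let a likelihood computed on the squashed data approximate the likelihood on the full data while evaluating far fewer points.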

openEBGM v0.5.0

  • Efficiency and code hygiene improvements to processRaw() and squashData().

openEBGM v0.4.0

  • Added the hyperEM() function to estimate hyperparameters using an implementation of the EM algorithm.

openEBGM v0.3.0

  • Added confidence intervals to autoHyper() and standard errors to autoHyper() and exploreHypers().
  • processRaw() now returns Inf instead of 99999 when PRR results in division by zero.
  • Fixed minor bug in exploreHypers().

openEBGM v0.2.0

  • Minor aesthetic changes to plot(), summary(), and print() methods.
  • Relaxed convergence requirements for exploreHypers() and autoHyper().


0.8.3 by John Ihrie, a year ago


Browse source code at https://github.com/cran/openEBGM

Authors: John Ihrie [cre, aut], Travis Canida [aut], Ismaïl Ahmed [ctb] (author of 'PhViD' package (derived code)), Antoine Poncet [ctb] (author of 'PhViD'), Sergio Venturini [ctb] (author of 'mederrRank' package (derived code)), Jessica Myers [ctb] (author of 'mederrRank')

Task views: Bayesian Inference

GPL-2 | GPL-3 license

Imports data.table, ggplot2, stats

Suggests DEoptim, dplyr, knitr, rmarkdown, testthat, tidyr
