Statistical Matching

Integration of two data sources referring to the same target population that share a number of common variables (also known as data fusion). Some functions can also be used to impute missing values in data sets through hot deck imputation methods. Methods for statistical matching with data from complex sample surveys are also available.
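As a rough illustration of the data fusion idea described above, the following Python sketch performs a nearest neighbour distance hot deck on toy data. It is not the package's R code: the function name `nnd_hotdeck`, the variable names and the records are hypothetical, and the Manhattan distance is only one possible choice.

```python
def nnd_hotdeck(recipients, donors, match_vars, donated_var):
    """Give each recipient the donated_var value of its closest donor.

    Closeness is the Manhattan distance on the shared matching variables.
    Illustrative sketch only, not the package's implementation.
    """
    fused = []
    for rec in recipients:
        # Pick the donor with the smallest distance on the common variables.
        best = min(
            donors,
            key=lambda d: sum(abs(rec[v] - d[v]) for v in match_vars),
        )
        out = dict(rec)                       # keep the recipient's own variables
        out[donated_var] = best[donated_var]  # add the donated variable
        fused.append(out)
    return fused


# Recipient file observes (X, Y); donor file observes (X, Z). Toy data.
recipients = [
    {"age": 25, "hours": 40, "y": 1.0},
    {"age": 60, "hours": 20, "y": 2.5},
]
donors = [
    {"age": 27, "hours": 38, "z": "a"},
    {"age": 58, "hours": 25, "z": "b"},
    {"age": 40, "hours": 40, "z": "c"},
]

fused = nnd_hotdeck(recipients, donors, ["age", "hours"], "z")
print([row["z"] for row in fused])
```

The fused records carry X, Y and the donated Z, which is the synthetic data set that statistical matching at the micro level aims for.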


1.2.4 added the new function pBayes for applying the pseudo-Bayes estimator to sparse contingency tables

    modified comb.samples to handle a continuous target variable (Y or Z)
    Faster versions of and
 now provides a richer output. 
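For readers unfamiliar with the pseudo-Bayes estimator named in the 1.2.4 entry, a minimal Python sketch follows. It assumes the textbook form, in which observed counts are shrunk towards prior cell probabilities with a weight K. The uniform prior, the value K = 1 and the function name `pseudo_bayes` are illustrative assumptions, not a description of how pBayes chooses them.

```python
def pseudo_bayes(counts, prior=None, k=1.0):
    """Shrink observed cell proportions towards prior cell probabilities.

    Estimate for cell h: (n_h + k * prior_h) / (n + k).
    The uniform default prior and k = 1 are assumptions for illustration.
    """
    n = sum(counts)
    cells = len(counts)
    if prior is None:
        prior = [1.0 / cells] * cells  # uniform prior over the cells
    return [(c + k * g) / (n + k) for c, g in zip(counts, prior)]


# A sparse table flattened to a vector of cell counts (toy data).
counts = [3, 0, 1, 0]
estimates = pseudo_bayes(counts)
print(estimates)
```

Empty cells receive a small positive probability, while the estimates still sum to one.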

1.2.3 corrected a bug in RANDwNND.hotdeck. Thanks to Kirill Muller

1.2.2 added 3 data sets used in the functions' help pages and in the vignette

    modified the RANDwNND.hotdeck function to identify the subset of the donors by
    simply comparing the values of a single matching variable

    Minor modification of the hotdeck functions to handle and monitor the processing
    when dealing with donation classes

1.2.1 now can be called just to compute the uncertainty bounds when no X variables are available.

    RANDwNND.hotdeck can search for the k nearest neighbours by using the
    function nn2() in the package RANN (a wrapper around the Approximate Nearest
    Neighbor library, ANN).  It is very fast and efficient when dealing
    with large data sources.
    Fix of a minor bug in mixed.mtc()
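The uncertainty bounds mentioned in the 1.2.1 entry can be illustrated for the case without X variables. The Python sketch below assumes the bounds in question are the classical Fréchet bounds on the joint cell probabilities of Y and Z given only their marginal distributions. The function name `frechet_bounds` and the marginals are hypothetical.

```python
def frechet_bounds(p_y, p_z):
    """Frechet bounds for the joint probabilities P(Y=j, Z=k).

    Only the marginal distributions p_y and p_z are known:
        max(0, p_y[j] + p_z[k] - 1) <= P(Y=j, Z=k) <= min(p_y[j], p_z[k])
    """
    lower = [[max(0.0, py + pz - 1.0) for pz in p_z] for py in p_y]
    upper = [[min(py, pz) for pz in p_z] for py in p_y]
    return lower, upper


# Toy marginal distributions, assumed to come from two separate samples.
p_y = [0.7, 0.3]
p_z = [0.6, 0.4]

lower, upper = frechet_bounds(p_y, p_z)
for j in range(len(p_y)):
    for k in range(len(p_z)):
        print(j, k, round(lower[j][k], 3), round(upper[j][k], 3))
```

The width of each interval shows how much the joint distribution is left undetermined when Y and Z are never observed together.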

1.2.0 new function comp.prop() for computing similarities/dissimilarities between marginal/joint distributions of one or more categorical variables

    new function pw.assoc() to compute pairwise association measures between a
    categorical response variable and a series of categorical predictors
    rankNND.hotdeck() can perform constrained matching too
    rankNND.hotdeck(), NND.hotdeck() and mixed.mtc() solve constrained problems
    more efficiently by using solve_LSAP() in package "clue"
    or (slower) by means of functions in the package "lpSolve".
    It is no longer possible to solve constrained problems by means
    of functions in package "optmatch"
    NND.hotdeck(), RANDwNND.hotdeck() and rankNND.hotdeck() are more
    efficient in handling donation classes (thanks to Alexis Eidelman
    for the suggestion).
    fixed a bug in mahalanobis.dist (thanks to Bruno C. Vidigal)
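To show what "constrained" means in the 1.2.0 entry, the Python sketch below contrasts unconstrained nearest neighbour matching with a constrained version in which each donor is used at most once, posed as a linear sum assignment problem. It is a toy: the assignment is found by brute force over permutations rather than by the Hungarian method that solve_LSAP() implements, and the function names and data are hypothetical.

```python
from itertools import permutations


def cost(rec, don):
    """Absolute difference on a single matching variable."""
    return abs(rec - don)


def unconstrained_match(recs, dons):
    """Each recipient independently takes its closest donor."""
    return [min(range(len(dons)), key=lambda j: cost(r, dons[j])) for r in recs]


def constrained_match(recs, dons):
    """Use each donor at most once and minimise the total matching distance.

    Brute force over all assignments; needs len(dons) >= len(recs) and is
    only feasible for very small problems.
    """
    best = min(
        permutations(range(len(dons)), len(recs)),
        key=lambda assign: sum(cost(r, dons[j]) for r, j in zip(recs, assign)),
    )
    return list(best)


recs = [10, 13]      # matching variable in the recipient file (toy data)
dons = [12, 30, 15]  # matching variable in the donor file (toy data)

free = unconstrained_match(recs, dons)
fixed = constrained_match(recs, dons)
print(free, fixed)
```

In the unconstrained result both recipients share the same donor, whereas the constrained result spreads them over distinct donors at the price of a larger total distance.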

1.1.0 The function comb.samples() now allows deriving micro-level predictions for the target variables Y and Z

1.0.5 fixed some minor bugs

1.0.4 fixed some minor bugs

1.0.3 mixed.mtc() can now also handle categorical common variables

    fixed a bug in comb.samples() when handling factor levels

    new error messages in RANDwNND.hotdeck() when computing distances
    between units with missing values

1.0.2 new function mahalanobis.dist() to compute the Mahalanobis distance

    fixed a bug in mixed.mtc() when computing the range of admissible values
    for rho_yz
    fixed a bug in NND.hotdeck()  and RANDwNND.hotdeck() when
    managing the row.names

1.0.1 new functions harmonize.x() and comb.samples() to perform statistical matching when dealing with complex sample survey data via weight calibration.

    new function to explore uncertainty when dealing with
    categorical variables. The function makes it possible to
    identify the subset of the common variables that performs better in reducing
    uncertainty
    New function rankNND.hotdeck() to perform rank distance hot deck
    Update of RANDwNND.hotdeck() to use donor weight in selecting a donor
    new function maximum.dist() that computes distances according to the
    L^Inf norm. A rank transformation of the variables can be used.
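The L^Inf (maximum) distance in the 1.0.1 entry is the largest absolute difference across variables. The Python sketch below computes it between two sets of rows, optionally after replacing values by their ranks. How maximum.dist() actually ranks the data is not stated here, so the pooled plain ranks, the function names and the toy data are assumptions, and ties are not handled.

```python
def ranks(values):
    """Plain ranks 1..n (ties are not handled in this sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def maximum_dist(rows_a, rows_b, use_ranks=False):
    """Matrix of L-infinity distances between rows of rows_a and rows_b."""
    if use_ranks:
        # Assumption: rank each variable over the pooled rows of both sets.
        pooled = rows_a + rows_b
        n_vars = len(pooled[0])
        cols = [ranks([row[j] for row in pooled]) for j in range(n_vars)]
        pooled = [[cols[j][i] for j in range(n_vars)] for i in range(len(pooled))]
        rows_a, rows_b = pooled[: len(rows_a)], pooled[len(rows_a):]
    return [
        [max(abs(a - b) for a, b in zip(row_a, row_b)) for row_b in rows_b]
        for row_a in rows_a
    ]


a = [[1, 100], [2, 300]]  # two variables on very different scales (toy data)
b = [[3, 200]]

raw = maximum_dist(a, b)
ranked = maximum_dist(a, b, use_ranks=True)
print(raw, ranked)
```

On the raw values the second variable dominates both distances; after the rank transformation the two variables contribute on a comparable scale.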

0.8 fixed some bugs in NND.hotdeck() and RANDwNND.hotdeck()



1.2.5 by Marcello D'Orazio, 2 months ago


Authors: Marcello D'Orazio


Task views: Official Statistics & Survey Methodology

GPL (>= 2) license

Depends on proxy, clue, survey, RANN, lpSolve

Suggests MASS, Hmisc

Imported by convoSPAT, dprep, sparkTable.

Depended on by geosptdb.

Suggested by corehunter, funrar.
