Statistical Matching

Integration of two data sources that refer to the same target population and share a number of common variables (also known as data fusion). Some functions can also be used to impute missing values in data sets through hot deck imputation methods. Methods for statistical matching of data from complex sample surveys are available too.


1.2.5 gower.dist is faster and more efficient due to improvements by Jan van der Laan (also thanks to Ton de Waal)

	NND.hotdeck can perform a constrained search for donors, in which each donor is
	selected at most k times (k >= 1); the argument k is set by the user

	fixed a minor bug in RANDwNND.hotdeck (not affecting results)
	richer output in and Fb.widths.byx
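The constrained donor search noted for NND.hotdeck in 1.2.5 can be pictured with a small sketch. This is not the package's code: it is a simple greedy pass in Python, with a Manhattan distance and a hypothetical function name `constrained_nn_donors`, which only shows what "each donor used at most k times" means. A greedy pass does not necessarily minimise the total matching distance.

```python
def constrained_nn_donors(recipients, donors, k=1):
    """Greedy sketch: give each recipient its closest donor that still has capacity.

    recipients, donors: lists of equal-length numeric tuples (matching variables).
    k: maximum number of times a single donor may be selected (k >= 1).
    Returns a list with the index of the donor chosen for each recipient.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    if len(recipients) > k * len(donors):
        raise ValueError("not enough donors: need len(recipients) <= k * len(donors)")

    uses = [0] * len(donors)          # how many times each donor has been selected
    assignment = []
    for rec in recipients:
        best = None                   # (distance, donor index) of the best free donor
        for j, don in enumerate(donors):
            if uses[j] >= k:          # donor already used k times: skip it
                continue
            dist = sum(abs(a - b) for a, b in zip(rec, don))  # Manhattan distance
            if best is None or dist < best[0]:
                best = (dist, j)
        uses[best[1]] += 1
        assignment.append(best[1])
    return assignment


recipients = [(1.0,), (1.1,), (1.2,)]
donors = [(1.0,), (5.0,)]
print(constrained_nn_donors(recipients, donors, k=2))
print(constrained_nn_donors(recipients, donors, k=3))
```

With k=2 the closest donor is exhausted after two recipients and the third has to take the more distant donor; with k=3 the constraint never binds.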

1.2.4 added the new function pBayes for applying pseudo-Bayes estimator to sparse contingency tables

	modified comb.samples to handle a continuous target variable (Y or Z)
	Faster versions of and now provides a richer output. 
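As background for the pBayes entry in 1.2.4, the following Python sketch shows the textbook form of a pseudo-Bayes estimator for a sparse table: observed cell proportions are shrunk towards prior cell probabilities with weight K/(n+K). The function name `pseudo_bayes`, the uniform prior, and the fixed value of K are assumptions made for illustration; how pBayes chooses the prior and K is not stated here.

```python
def pseudo_bayes(counts, prior, K):
    """Shrink observed cell proportions towards prior cell probabilities.

    counts: observed cell counts x_c (may contain zeros).
    prior:  prior cell probabilities gamma_c, summing to 1.
    K:      non-negative weight given to the prior.
    Returns p_c = n/(n+K) * x_c/n + K/(n+K) * gamma_c = (x_c + K*gamma_c) / (n+K).
    """
    if len(counts) != len(prior):
        raise ValueError("counts and prior must have the same length")
    if abs(sum(prior) - 1.0) > 1e-9:
        raise ValueError("prior probabilities must sum to 1")
    if K < 0:
        raise ValueError("K must be non-negative")
    n = sum(counts)
    if n + K == 0:
        raise ValueError("n + K must be positive")
    return [(x + K * g) / (n + K) for x, g in zip(counts, prior)]


counts = [0, 3, 7]                    # one empty cell
prior = [1 / 3, 1 / 3, 1 / 3]         # uniform prior (assumption)
estimates = pseudo_bayes(counts, prior, K=3)
print(estimates)
```

The empty cell receives a small positive probability, while the estimates still sum to one.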

1.2.3 corrected a bug in RANDwNND.hotdeck. Thanks to Kirill Muller

1.2.2 added 3 data sets used in the functions' help pages and in the vignette

	modified the RANDwNND.hotdeck function to identify the subset of the donors by
	simply comparing the values of a single matching variable

	Minor modification of the hotdeck functions to handle and monitor the processing
	when dealing with donation classes

1.2.1 now can be called just to compute the uncertainty bounds when no X variables are available.

	RANDwNND.hotdeck can search for the k nearest neighbours by using the
	function nn2() in the package RANN (a wrapper for the Approximate Nearest
	Neighbor search implemented in the ANN library). It is very fast and
	efficient when dealing with large data sources.
	Fix of a minor bug in mixed.mtc()

1.2.0 new function comp.prop() for computing similarities/dissimilarities between marginal/joint distributions of one or more categorical variables

	new function pw.assoc() to compute pairwise association measures between a
	categorical response variable and a series of categorical predictors
	rankNND.hotdeck() can perform constrained matching too
	rankNND.hotdeck(), NND.hotdeck() and mixed.mtc() solve constrained problems
	faster and more efficiently by using solve_LSAP() in package "clue"
	or (slower) by means of functions in the package "lpSolve".
	It is no longer possible to solve constrained problems by means
	of functions in package "optmatch"
	NND.hotdeck(), RANDwNND.hotdeck() and rankNND.hotdeck() are more
	efficient in handling donation classes (thanks to Alexis Eidelman
	for the suggestion).
	fixed a bug in mahalanobis.dist (thanks to Bruno C. Vidigal)
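To make the idea behind the comp.prop() entry in 1.2.0 concrete, here is a Python sketch of one widely used dissimilarity between two categorical distributions, the total variation distance, together with its complement, the overlap. The function names are hypothetical and the choice of measure is an assumption for illustration; the entry above does not say which measures comp.prop() returns.

```python
def total_variation_distance(p, q):
    """Half the sum of absolute differences between two probability vectors.

    p, q: probabilities over the same categories, in the same order.
    Returns a value in [0, 1]; 0 means identical distributions.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same categories")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def overlap(p, q):
    """Similarity counterpart: 1 minus the total variation distance."""
    return 1.0 - total_variation_distance(p, q)


p = [0.5, 0.5]
q = [0.25, 0.75]
print(total_variation_distance(p, q))
print(overlap(p, q))
```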

1.1.0 The function comb.samples() can now derive predictions at the micro level for the target variables Y and Z

1.0.5 fixed some minor bugs

1.0.4 fixed some minor bugs

1.0.3 mixed.mtc() can now also handle categorical common variables

	fixed a bug in comb.samples() when handling factor levels

	new error messages in RANDwNND.hotdeck() when computing distances
	between units with missing values

1.0.2 new function mahalanobis.dist() to compute the Mahalanobis distance

	fixed a bug in mixed.mtc() when computing the range of admissible values
	for rho_yz
	fixed a bug in NND.hotdeck()  and RANDwNND.hotdeck() when
	managing the row.names

1.0.1 new functions harmonize.x() and comb.samples() to perform statistical matching when dealing with complex sample survey data via weight calibration.

	new function to explore uncertainty when dealing with 
	categorical variables. The function	permits to
	identify the subset of the common variables that performs better in reducing 
	New function rankNND.hotdeck() to perform rank hot deck distance
	Update of RANDwNND.hotdeck() to use donor weight in selecting a donor
	new function maximum.dist() that computes distances according to the
	L^Inf norm. A rank transformation of the variables can be used.
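The L^Inf (maximum) distance mentioned for maximum.dist() in 1.0.1, and the effect of a rank transformation, can be sketched in a few lines of Python. The helper names below are hypothetical, and the ranking scheme (average ranks computed column by column on a single data set) is an assumption; the package may rank and scale the variables differently.

```python
def maximum_distance(x, y):
    """L^Inf distance: the largest absolute difference over all variables."""
    return max(abs(a - b) for a, b in zip(x, y))


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = average_rank
        i = j + 1
    return result


def rank_transform(rows):
    """Replace each variable (column) by its ranks across the rows."""
    ranked_columns = [ranks(list(column)) for column in zip(*rows)]
    return [tuple(col[i] for col in ranked_columns) for i in range(len(rows))]


rows = [(1, 100), (2, 300), (1000, 200)]
ranked = rank_transform(rows)
print(maximum_distance(rows[0], rows[2]))      # raw scale: dominated by the first variable
print(maximum_distance(ranked[0], ranked[2]))  # rank scale: variables are comparable
```

On the raw scale a single variable with a large range dominates the maximum; after the rank transformation every variable contributes on the same scale.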

0.8 fixed some bugs in NND.hotdeck() and RANDwNND.hotdeck()


1.2.5 by Marcello D'Orazio, a year ago

Authors: Marcello D'Orazio

Task views: Official Statistics & Survey Methodology

GPL (>= 2) license

Depends on proxy, clue, survey, RANN, lpSolve

Suggests MASS, Hmisc

Imported by convoSPAT, dprep, semiArtificial, sparkTable.

Depended on by geosptdb.

Suggested by corehunter.
