Detection of Outliers in Time Series

Detection of outliers in time series following the Chen and Liu (1993) procedure. Innovational outliers, additive outliers, level shifts, temporary changes and seasonal level shifts are considered.


Changes in version 0.6-8 2019-02-24

o Fixed the Date field in the DESCRIPTION file (2018 -> 2019).

Changes in version 0.6-7 2019-02-23

o The utilities for structural time series models have been disabled. The package 'stsm' and the function 'KFKSDS::KF' are no longer imported. Those utilities were an experimental version that extended the procedure for detection of outliers in ARIMA models to other models in state space form. The vignette still documents this approach. If interested in the code, check previous versions or contact the maintainer.

o Fix in 'outliers.effects': computations were not correct for the case of innovational outliers. 'c(1, mas)' is now used instead of 'c(1, -mas)' as the MA coefficients. Reported by D. Kehl.

o Tentative approach enabled by argument 'check.rank' in 'discard.outliers' in order to deal with perfect collinearity among the regressor variables.

o Further fixes in function 'calendar.effects' to make it independent of the locale.

Changes in version 0.6-6 2017-05-27

o Fix in function 'calendar.effects'. Reported by C. Espeleta. By using weekday as a decimal number instead of a character, the function now works regardless of the the locale in use.

o Fix in 'locate.outliers.oloop'. If the series contains NAs, the NAs in the residuals that are passed to 'outliers.tstatistics' are now imputed with the mean of the remaining residuals; before, those NAs were propagated throughout the t-statistics preventing the detection of outliers at some points.

o Fix in 'discard.outliers'. If all the outliers were discarded by method "en-masse", the matrix of outliers resulted in a zero-column matrix and 'forecast::auto.arima()' returned error when trying to assign column names to it. The zero column matrix
is now replaced by 'NULL'.

o The maximum number of iterations in the outer loop can now be set by the user (before it was set to 4, which is currently the default). It can be set through the argument 'maxit.oloop' in functions 'tso', 'tso0' and 'locate.outliers.oloop'. Suggested by J. Byers and M. Dothan.

o The function 'remove.outliers' has been renamed as 'discard.outliers'. The arguments 'remove.method' and 'remove.cval' in function 'tso' have been renamed as 'discard.method' and 'discard.cval'. The old names still work (except in 'tso0'), but a warning is given as they will be most likely be ignored in a future version.

The name 'remove' may be misleading to users. A user expected this function to 
remove the outliers from the original data. The series cleaned from outliers 
is returned by 'tso' in the list element named 'yadj'. This function discards
(and removes from the set of potential outliers) those outliers that turned out 
to be non-significant in the updated model.

o Column names (if not provided) are now assigned to argument 'xreg'. If it is a vector, it is converted to a one-column matrix. This simplifies the usage of this argument. Example: before, 'tso(x, xreg=cbind(series1=series1))' was the correct usage, now 'tso(x, xreg=series1)' or 'tso(x, xreg=seq_along(x))' work.

Changes in version 0.6-5 2016-12-08

o Created "find.consecutive.outliers" to find outliers that were detected at consecutive time points by "locate.outliers.oloop" and "locate.outliers.iloop". If is called within the inner and outer loops and now also by "remove.outliers", so that the whole set of outliers is also checked (not only those at the current inner or outer iterations). The outlier with the highest t-statistic in absolute value is kept and the others are discarded. This is done for all types of outliers, before it was done only for LS.

o "coefs2poly" has been rebuilt. The code has been simplified using the output x$model$phi and x$model$theta from stats::arima().

o Note: the following change in R 3.2.1, 'arima(*, xreg = .) (for d >= 1) computes estimated variances based on a the number of effective observations as in R version 3.0.1 and earlier. (PR#16278)', may lead to changes in the results returned by 'tso'. I have observed that in some examples of the manual the same coefficient estimates are obtained, but the standard errors change slightly compared to the version from November 2014. This is most likely due to the above-mentioned change in 'arima' and may affect the t-statistics that are used for the detection of outliers: e.g., the example 'resAirp' in the documentation reported 4 outliers in the version from November 2014, now and additional outlier LS39 is obtained.

o remove.outliers(remove.method="bottom-up"): the coefficients and t-statistics are also updated in "moall".

o remove.outliers(remove.method="en-masse"): if outliers are discarded then the model is refit without the discarded outliers and the coefficients and t-statistics in "moall" are updated.

o When 'maxit' is greater than 1 in tso(), duplicates and outliers at consecutive type points (if any) are discarded keeping the outlier from the older iteration.

Changes in version 0.6-4 2016-07-28

o Development version distributed to some users (not submitted to CRAN).

o Fix in "locate.outliers": in some instances consecutive LS were not removed correctly. Thanks to Paymon Ashourian for reporting this issue.

Changes in version 0.6-3 2016-07-07

o Imported some functions in the NAMESPACE suggested by R CMD check.

o Fixed the link to some URLs given in the vignette.

Changes in version 0.6-2 2015-05-18

o Development version not submitted to CRAN.

o Updated the function "coefs2poly" in order to deal with ARIMA models that apply the regular difference operator two times, I(2). Email Michael Dothan, May 18, 2015. Email Diego Selle, November 24, 2015.

Changes in version 0.6-1 2015-04-15

o Development version not submitted to CRAN.

o Imported stats::filter in DESCRIPTION and NAMESPACE files in order to avoid conflict with dply::filter. Thanks to Nathan Kugland, Derek Holmes and Eric Goldsmith for reporting this.

Changes in version 0.6 2015-01-27

o The functionalities of package "stsm" have been restored now that the package "stsm" has been updated on CRAN.

Changes in version 0.5 2015-01-25

o This is submitted to CRAN as a transitory version where some functions related to package "stsm" are disabled. Package "stsm.class" has been merged into package "stsm". If the new version of "stsm" is submitted these warnings are obtained: "Warning: replacing previous import by 'stsm.class::char2numeric' when loading 'tsoutliers'". If "tsoutliers" is submitted first, it requires disabling "stsm" functionalities since the current version of "stsm" on CRAN is not updated with the class and methods defined in package "stsm.class".

o Adapted function "tso" to the new output format from functions "maxlik.Xd.XXX", where now information related to "xreg" is not stored as a separate list element.

o Removed option in remove.method = "linear-regression" since now exogenous regressors are better handled by package "stms" (despite possible difficulties computing standard errors).

o Added warning informing that currently "maxit" > 1 is not allowed in function "tso" for tsmethod="stsm". Before an uninformative error was returned.

o Fixed function 'locate.outliers' for the case where the length of argument 'types' is one. Thanks to Boriss Siliverstovs for reporting this issue.

o The examples, DESCRIPTION and NAMESPACE files have been updated now that the class and methods defined in package "stsm.class" have been merged in package "stsm".

o Non-standardised residuals are used in 'locate.outliers.oloop'. Despite the standardised residuals can be used to obtain t-statistics and decide on the significance of outliers, coefficient estimates of these regressors do not reflect the scale of the data and hence the series was not adjusted correctly for the presence of outliers in the iterations run in 'locate.outliers.oloop'.

o The default arguments to fit a structural time series model has been changed from 'stsm::maxlik.fd.scoring' to '' in order to take advantage of some recent enhancements in package "stsm" when regressors are included in the model.

o Some issues have been modified in order to deal with the presence of missing observations. - 'locate.outliers': NA values (if any) are now ignored when obtaining 'id0resid'. - 'outliers.tstatistics.ArimaPars' and 'outliers.tstatistics.stsmSS': na.rm=TRUE is used in when 'quantile' is called; NAs generated by 'filter' when obtaining 'ao.xy' are now explicitly removed instead of removing all possible NAs present in the original series.

o The package has benefited from some updates in package "stsm" that deal with external regressors. Now more reliable results are obtained when a "stsm" model is used. See for example application to Nile time series in section 6.1 of the document tsoutliers.pdf attached to the package.

o Results from some simulation exercises are reported in Section 5 of the document tsoutliers.pdf attached to the package.

o Fixed file 'index.html' in the 'doc' directory and validated at as HTML 4.01 Transitional.

Changes in version 0.4 2014-06-26

o The function 'tsoutliers' has been renamed as 'tso' to avoid conflict with the function of the same name in package 'forecast'. To keep resemblance in the names, the function 'tsoutliers0' has been renamed as 'tso0'.

o The function 'jarque.bera.test' has been renamed as 'JarqueBera.test' to avoid conflict with the function of the same name in package 'tseries'.

o function 'plot.tsoutliers': fixed "plot(cbind(x$y, x$yadj), ..." a typo ("x$adj" instead of "x$yadj") made the range of the y-axis not cover the range of the original and adjusted series in some cases.

Changes in version 0.3 2014-05-17

o First version available on CRAN.

o Some level of abstraction is given to some functions using S3 methods. The interfaces "outliers.tstatistics" and "outliers.regressors" can now be used regardless of whether the baseline model is an ARIMA or a structural time series model. This has simplified some parts of the code and the prototype of some functions where the argument "tsmethod" is no longer needed.

o The functions 'IOtstat.arima', 'AOtstat.arima', 'LStstat.arima', 'TCtstat.arima' and 'SLStstat.arima' are not exported in the NAMESPACE. A new function 'outliers.tstatistics' has been designed as a common interface to those functions.

o Debugging versions for the computation of the t-statistics has been removed. The versions of 'IOtstat.arima', 'AOtstat.arima', 'LStstat.arima', 'TCtstat.arima' and 'SLStstat.arima' that seem to be faster have been chosen. Recomputation of 'ARMAtoMA()' for each type of outlier is avoided.

o New function 'outliers' has been added. This function is useful to easily create the input 'data.frame' (containing characters and integers) of functions 'outliers.effects' and 'outliers.regressors.arima'. Thus, these functions can be used as a common interface for functions 'XXeffect' and 'XXxreg', which are not necessary to be exported.

o The functions 'IOeffect', 'AOeffect', 'LSeffect', 'TCeffect and 'SLSeffect' are not exported in the NAMESPACE. The function 'outliers.effects' is used as a common interface to those functions.

o 'outliers.regressors.arima' has been simplified and speeded-up by choosing the versions of 'AOxreg', 'LSxreg', 'TCxreg' and 'SLSxreg' that seem to be faster. Recomputation of 'ARMAtoMA()' for each type of outlier is avoided. Argument 'arimafit' is no longer needed.

o The functions 'IOxreg', 'AOxreg', 'LSxreg', 'TCxreg' and 'SLSxreg' are not exported in the NAMESPACE. The function 'outliers.regressors.arima' is used as a common interface to those functions.

Version 0.2

o The function 'outliers.regressors.arima' is a debugging version that uses different versions to do the computations.

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.


0.6-8 by Javier López-de-Lacalle, 3 years ago

Browse source code at

Authors: Javier López-de-Lacalle <[email protected]>

Documentation:   PDF Manual  

Task views: Time Series Analysis

GPL-2 license

Imports forecast, stats

Imported by SLBDD, composits, dsa.

See at CRAN