Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance

Time series clustering along with optimized techniques related to the Dynamic Time Warping distance and its corresponding lower bounds. Implementations of partitional, hierarchical, fuzzy, k-Shape and TADPole clustering are available. Functionality can be easily extended with custom distance measures and centroid definitions. Implementations of DTW barycenter averaging, a distance based on global alignment kernels, and the soft-DTW distance and centroid routines are also provided. All included distance functions have custom loops optimized for the calculation of cross-distance matrices, including parallelization support. Several cluster validity indices are included.

Time Series Clustering With Dynamic Time Warping Distance (DTW)

This package attempts to consolidate some of the recent techniques related to time series clustering under DTW and implement them in R. Most of these algorithms make use of traditional clustering techniques (partitional and hierarchical clustering) but change the distance definition. In this case, the distance between time series is measured with DTW.
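The idea of keeping a traditional clustering routine and only swapping the distance can be sketched in base R. This is a minimal illustration, not the package's implementation: `dtw_distance` is a hypothetical helper written here with an absolute-difference local cost, and the toy series are made up for the example.

```r
# Plain dynamic-programming DTW with an absolute-difference local cost.
# dtw_distance is a hypothetical helper, not a dtwclust function.
dtw_distance <- function(x, y) {
  n <- length(x)
  m <- length(y)
  # cost[i + 1, j + 1] is the best cumulative cost of aligning x[1:i] with y[1:j]
  cost <- matrix(Inf, nrow = n + 1, ncol = m + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      step <- min(cost[i, j], cost[i, j + 1], cost[i + 1, j])
      cost[i + 1, j + 1] <- abs(x[i] - y[j]) + step
    }
  }
  cost[n + 1, m + 1]
}

# Toy data (assumed for illustration): two sine waves and two ramps,
# deliberately of different lengths
series <- list(
  sin_a  = sin(seq(0, 2 * pi, length.out = 30)),
  sin_b  = sin(seq(0, 2 * pi, length.out = 40)),
  ramp_a = seq(-1, 1, length.out = 30),
  ramp_b = seq(-1, 1, length.out = 35)
)

# Pairwise DTW distances, filled symmetrically
k <- length(series)
dmat <- matrix(0, k, k, dimnames = list(names(series), names(series)))
for (i in seq_len(k - 1)) {
  for (j in (i + 1):k) {
    dmat[i, j] <- dmat[j, i] <- dtw_distance(series[[i]], series[[j]])
  }
}

# Ordinary hierarchical clustering; only the distance definition changed
hc <- hclust(as.dist(dmat), method = "average")
groups <- cutree(hc, k = 2)
print(groups)
```

Because DTW tolerates different lengths, no resampling is needed here; the quadratic cost of each pairwise computation is what motivates the optimizations discussed next.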

DTW is, however, computationally expensive, so several optimization techniques exist. They mostly deal with bounding the DTW distance from below, so that some full DTW computations can be avoided. These bounds are only defined for time series of equal length. Nevertheless, if the lengths of the time series of interest vary only slightly, reinterpolating them to a common length is probably appropriate.
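As a rough illustration of how such a bound works, the sketch below computes a Keogh-style lower bound and compares it with a windowed DTW distance. It assumes equal-length series, an absolute-difference (L1) local cost, and a Sakoe-Chiba window of half-width `w`; the helper names `dtw_window` and `lb_keogh_l1` are hypothetical and are not the package's functions.

```r
# DTW restricted to a Sakoe-Chiba band: x[i] may only align with y[j] if |i - j| <= w
dtw_window <- function(x, y, w) {
  n <- length(x)
  stopifnot(length(y) == n)
  cost <- matrix(Inf, n + 1, n + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in max(1, i - w):min(n, i + w)) {
      step <- min(cost[i, j], cost[i, j + 1], cost[i + 1, j])
      cost[i + 1, j + 1] <- abs(x[i] - y[j]) + step
    }
  }
  cost[n + 1, n + 1]
}

# Keogh-style bound: build an envelope around y over the same window and
# sum how far x falls outside of it. This costs O(n * w) instead of filling the DTW band.
lb_keogh_l1 <- function(x, y, w) {
  n <- length(x)
  stopifnot(length(y) == n)
  idx <- seq_len(n)
  upper <- vapply(idx, function(i) max(y[max(1, i - w):min(n, i + w)]), numeric(1))
  lower <- vapply(idx, function(i) min(y[max(1, i - w):min(n, i + w)]), numeric(1))
  sum(pmax(x - upper, 0) + pmax(lower - x, 0))
}

# Two random walks of equal length (made-up data)
set.seed(1)
x <- cumsum(rnorm(50))
y <- cumsum(rnorm(50))

lb <- lb_keogh_l1(x, y, w = 5)
d  <- dtw_window(x, y, w = 5)
cat("LB_Keogh:", lb, " windowed DTW:", d, "\n")
```

The bound holds because every point of `x` must be matched with at least one point of `y` inside the window, and that match cannot cost less than the distance from the point to the envelope. In a nearest-neighbour search, a candidate whose bound already exceeds the best distance found so far can be discarded without running DTW on it.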

Additionally, a recently proposed algorithm called k-Shape could serve as an alternative. k-Shape clustering relies on custom distance and centroid definitions, which are unrelated to DTW. The shape extraction algorithm proposed therein is particularly interesting if time series can be normalized.
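For orientation, the distance used by k-Shape is built on normalized cross-correlation rather than on warping. The sketch below is a simplified version for equal-length series that loops over shifts directly instead of using an FFT; `znorm` and `sbd` are hypothetical helper names chosen for this example, not the package's functions.

```r
# z-normalize a series (zero mean, unit standard deviation)
znorm <- function(x) (x - mean(x)) / sd(x)

# Shape-based distance: one minus the largest normalized cross-correlation
# over all relative shifts of the two series. Values lie in [0, 2].
sbd <- function(x, y) {
  n <- length(x)
  stopifnot(length(y) == n)
  shifts <- -(n - 1):(n - 1)
  cc <- vapply(shifts, function(s) {
    if (s >= 0) {
      sum(x[(1 + s):n] * y[1:(n - s)])
    } else {
      sum(x[1:(n + s)] * y[(1 - s):n])
    }
  }, numeric(1))
  1 - max(cc) / sqrt(sum(x^2) * sum(y^2))
}

# Same shape, shifted in time (made-up data)
time_idx <- seq(0, 4 * pi, length.out = 100)
x <- znorm(sin(time_idx))
y <- znorm(sin(time_idx + 1))

d_shifted   <- sbd(x, y)
# Distance if no shifting were allowed (shift fixed at zero)
d_unshifted <- 1 - sum(x * y) / sqrt(sum(x^2) * sum(y^2))
cat("SBD:", d_shifted, " zero-shift value:", d_unshifted, "\n")
```

Since the zero shift is one of the candidates, the shape-based distance can never exceed the zero-shift value, which is why it is forgiving of phase differences between otherwise similar series.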

Many of the algorithms and optimizations require that all series have the same length. Those that do not are usually slower but can still be used.
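When lengths differ only slightly, bringing all series to a common length can be done with linear interpolation. The following base R sketch shows the idea; `reinterpolate_linear` is a hypothetical helper written for this example, and the target length of 205 simply mirrors the value used in the package example.

```r
# Resample a series to new_length points by linear interpolation
# over its original index range (hypothetical helper, base R only)
reinterpolate_linear <- function(x, new_length) {
  approx(x = seq_along(x), y = x, n = new_length)$y
}

# Made-up series with slightly different lengths
set.seed(42)
series <- list(
  a = cumsum(rnorm(180)),
  b = cumsum(rnorm(205)),
  c = cumsum(rnorm(230))
)

resampled <- lapply(series, reinterpolate_linear, new_length = 205)
print(lengths(resampled))
```

The first and last values of each series are preserved, since interpolation spans the full original index range. Stretching or compressing by a large factor would distort the series, which is why this is only suggested for small length differences.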

Please see the included references for more information.

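# Load the package; CharTraj is example data shipped with it
library(dtwclust)

# Reinterpolate data to equal lengths
data <- lapply(CharTraj, reinterpolate, newLength = 205)

# Partitional clustering with PAM centroids and the DTW lower-bound distance.
# Note: dtwclust() is the entry point of older releases; newer releases use tsclust().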
kc <- dtwclust(data = data, k = 20, distance = "dtw_lb",
               window.size = 20, centroid = "pam",
               save.data = TRUE,
               seed = 3247, trace = TRUE)
#>      1 Changes / Distsum : 100 / 1157.388 
#>      2 Changes / Distsum : 17 / 826.3151 
#>      3 Changes / Distsum : 4 / 752.9349 
#>      4 Changes / Distsum : 1 / 752.1322 
#>      5 Changes / Distsum : 0 / 752.1322 
#>  Elapsed time is 5.53 seconds.

Dependencies

  • Partitional procedures are implemented by leveraging the flexclust package.
  • Hierarchical procedures use the native hclust function.
  • Cross-distances make use of the proxy package.
  • The core DTW calculations are done by the dtw package.
  • Plotting is done with the ggplot2 package.

Implementations

  • Keogh's and Lemire's lower bounds
  • DTW Barycenter Averaging
  • k-Shape clustering
  • TADPole clustering



5.5.6 by Alexis Sarda, 2 years ago


Report a bug at https://github.com/asardaes/dtwclust/issues

Browse source code at https://github.com/cran/dtwclust

Authors: Alexis Sarda-Espinosa

Task views: Time Series Analysis

GPL-3 license

Imports parallel, stats, utils, bigmemory, clue, cluster, dplyr, flexclust, foreach, ggplot2, ggrepel, Matrix, nloptr, RSpectra, Rcpp, RcppParallel, reshape2, shiny, shinyjs

Depends on methods, proxy, dtw

Suggests doParallel, knitr, rmarkdown, testthat

Linking to Rcpp, RcppArmadillo, RcppParallel, RcppThread

System requirements: C++11, GNU make

Suggested by IncDTW, latrend.
