Supervised and Unsupervised Self-Organising Maps

Functions to train self-organising maps (SOMs). Also interrogation of the maps and prediction using trained maps are supported. The name of the package refers to Teuvo Kohonen, the inventor of the SOM.


The kohonen package version 3.0.6 is described in R. Wehrens and J. Kruisselbrink, J. Stat. Softw. 2018 (to appear)

Changes in version 3.0.7:

  • reinstated Makevars, that is necessary for OpenMP flags (without it the package still compiles but no parallel computation is possible)
  • fixed a bug in FindBestMatchingUnit that in case of ties did not allow unit 1 to be the winning unit. Longstanding bug, also present in the class library (function VR_onlineSOM), but not very prominent since it only surfaces in cases of ties in distances.
  • added a working example to the degelder data, using the custom WCCd distance function also used in the J. Stat. Softw. paper.
  • added the published version of the 2018 J. Stat. Softw. paper to the doc directory

Changes in version 3.0.6:

  • fixed a bug in object.distances
  • require Rcpp version 0.12.12, and removed Makevars and kohonen_init.c because of this (suggestions by Gilberto Camara)

Changes in version 3.0.5:

  • columns with too many NA values are no longer removed; it is virtually impossible to guarantee that the right columns are matched when new data are submitted to the map (based on an error reported by Ignacio Chacón).
  • is now always called for all data layers of new data objects, not just for whatmap data layers. Affects functions map, predict, supersom. Fixes a problem where presenting a data.frame as test data led to a segfault (noticed by @miranska)
  • removed superfluous argument maxNA.fraction from
  • added 2018 JSS paper (pre-pub) to /inst/doc

Changes in version 3.0.4:

  • if whatmap is used during training, some of the codebook layers will be NULL. In version 3.0.2 this led to errors in predict.kohonen, which now explicitly only checks codebook layers in whatmap.
  • Added text in the supersom man page about the incorrect naming of the Tanimoto distance (noted by Pau Carrió Gaspar).
  • Added a check in supersom to make sure the Tanimoto distance is not used for data outside the range [0-1].
  • Added extra check in the function for empty data layers after the maxNA.fraction check (pointed out by Xu Kai)
  • Added pepper image (incl man file) used in JSS software description: peppaPic
  • added BC.cpp and wcc.cpp to inst/Distances directory
  • added JSSdemo1.R, JSSdemo2.R, JSSdemo3.R to demo directory
  • added RGBdemo.R to demo directory
  • added kohonen-package.Rd to man pages
  • reordered NEWS contents to list most recent entry first

Changes in version 3.0.3:

  • improved manual pages
  • fixed a bug in checking whether meanDistances contain zeros

Changes in version 3.0.2:

  • added checks for the presence of openMP headers

Changes in version 3.0.1:

  • added correct OpenMP headers
  • removed wine.classes from the wines data set; this is simply as.integer(vintages)

Changes in version 3.0.0:

  • som and xyf training methods are now considered to be special cases of supersom
  • function bdk is deprecated
  • C engine replaced by C++
  • several new distance functions have been added, including the possibility to provide user-defined distances (function pointers, made possible by the Rcpp package)
  • distances may now be defined per layer
  • many performance improvements, in particular reducing memory consumption
  • much more consistent syntax for the predict function

The kohonen package version 2.0.0 is described in R. Wehrens and L.M.C. Buydens, J. Stat. Softw. 21(5), 2007

Changes in version 2.0.20:

  • added support for hexagonal and square units in plotting maps (contribution by Romain Bossart)

Changes in version 2.0.19:

  • replaced MASS:::eqscplot simply by eqscplot
  • added imports from default packages to NAMESPACE file

Changes in verson 2.0.18:

  • added a package-specific NAMESPACE file

Changes in version 2.0.16:

  • added check to xyf and bdk functions to avoid matrices with differing numbers of rows. This was happily accepted but led to incorrect results. (Thanks to Johanna Laibe for noticing this behaviour)

Changes in version 2.0.15:

  • renamed this file to "NEWS"
  • updated maintainer email address

Changes in version 2.0.14:

  • fixed a bug, disguised as a memory leak, where map.kohonen attempted to calculate a distance between elements of a factor, rather than using the membership matrix. (Spotted by CRAN checks focusing on memory leaks, BDRipley)

Changes in version 2.0.13:

  • predict.kohonen choked on a matrix containing only one sample (Max Kuhn, who also provided the solution)
  • plotting for toroidal maps has been improved once again: not only are the bonudaries finally shown correctly, also the cluster boundaries at the edges of the map are shown twice when necessary for better interpretability.

Changes in version 2.0.12:

  • removed the restriction that a legend in a codes plot was only shown when the number of variables was smaller than 15 (segments)
  • fixed a bug in add.cluster.boundaries: now the correct boundaries are shown also for toroidal maps (Thomas Campagne)
  • added identify.kohonen
  • predict.kohonen now returns a factor if the training was also done with a factor; man page is updated so that the examples reflect this correctly. (Triggered by Ben Harrison)

Changes in version 2.0.11:

  • parameter contin was effectively not used in functions xyf, supersom and bdk (spotted by Andrey Ziyatdinov); now corrected
  • in supersom the original code to estimate contin was giving errors; now corrected
  • removed a bug in assigning column names to the codebook vectors in supersom
  • added topographic error measures (proposed independently by John Pearce and Marco Pomati)

Changes in version 2.0.10:

  • added xpd = NA to add.cluster.boundaries so that also segments outside the official plotting area are shown
  • added an example for component planes (suggestion by Andreas Henelius)
  • fixed a bug concerning maxlegendcols (spotted by Andreas Henelius)

Changes in version 2.0.9:

  • Added useDynLib directive to NAMESPACE for compatibility with older R versions

Changes in version 2.0.8:

  • changed contact email address once again to
  • added name space
  • compressed documentation file kohonen-manual.pdf

Changes in version 2.0.6:

  • Added as argument to plot.kohcode
  • Function-specific default palettes
  • Changed contact address to ron.wehrens at
  • Added visualisation of clustering of units (Leo Lopes)

Changes in version 2.0.5:

  • Changed title of the "quality" type plotting function to "Distance" rather than "Similarity"
  • Extra parameter "heatkeywidth" (thanks to Henning Rust) for control over the size of the heatkey. Useful for cases with multiple plots, and probably elsewhere too.
  • Added plotting type "dist.neighbours" which shows Ultsch' U-matrix: units are coloured according to the sum of the distances to their immediate neighbours. A high distance can be interpreted as being on a class boundary.
  • Plotting types "quality", "dist.neighbours" and "counts" now invisibly return what they show.
  • Added information on the wine varieties, as well as a reference to the original publication.
  • Cleaned up C-code and man pages to eliminate warnings in the development version of R 2.9.0

Changes in version 2.0.4:

  • introduction of tricolor function
  • prediction of multiple layers
  • Y in function xyf now can also be a factor
  • any element of the data argument in function supersom can now also be a factor
  • added new prediction examples for som, xyf and supersom
  • uniform output of som, xyf and supersom functions

Changes in version 2.0.3:

  • removed a bug in supersom, introduced in 2.0.1 when checking if the data are numeric.
  • added an example for supersom.
  • removed a bug in plot.kohcounts, arising when empty units are present
  • updated man files for methods 'plot', 'predict', and 'map'

Changes in version 2.0.2:

  • changed the 'predict.kohonen' function to take account of a change in the behaviour of 'aggregate' in R version 2.6.0 and later

Changes in version 2.0.1:

  • fixed a bug in unit.distances: for rectangular maps the toroidal argument was ignored. Updated the corresponding example as well.
  • added an argument 'n.hood' to functions som, xyf, bdk and supersom, so that the user can decide what distance measure to use when calculating distances between units. Useful to compare the results with other som implementations.
  • specification of radius as (start, stop), similar to alpha. This gives better control (suggestion of Hadley Wickham).
  • more sanity checking in functions som, xyf, bdk and supersom (suggestion of Hadley Wickham).
  • added the J. Stat. Softw. paper and a CITATION file to the distribution.

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.


3.0.8 by Ron Wehrens, a month ago

Browse source code at

Authors: Ron Wehrens and Johannes Kruisselbrink

Documentation:   PDF Manual  

Task views: Chemometrics and Computational Physics, Multivariate Statistics

GPL (>= 2) license

Imports MASS, Rcpp

Linking to Rcpp

Imported by ChemometricsWithR, GenomicMating, Rsomoclu, UNCLES, diceR, ggsom, multisom, optCluster, som.nn.

Suggested by RankAggreg, clValid, clusterfly, fscaret.

See at CRAN