High dimensional interaction search by brute force requires a quadratic computational cost in the number of variables. The xyz algorithm provably finds strong interactions in almost linear time. For details of the algorithm see: G. Thanei, N. Meinshausen and R. Shah (2016). The xyz algorithm for fast interaction search in high-dimensional data < https://arxiv.org/pdf/1610.05108v1.pdf>.
R package of the xyz algorithm for fast interaction search in high-dimensional data.
Finding interactions in high-dimensional data can be computationally very expensive. If the data set has p variables then naive search incurs a quadratic cost in p.
The xyz algorithm finds strong interaction in almost linear time.
A simple example in R:
library(xyz)n<-300p<-1000#construct a binary matrixX<-matrix(sample(c(-1,1),replace=TRUE,n*p),n,p)#set an interaction of the pair (1,2)Y<-X[,1]*X[,2]+rnorm(n)#run the interaction searchresult<-xyz_search(X,Y,L=10,N=10,binary=TRUE,negative=TRUE)#print the resultprint(result)
This package is only on github and not yet on CRAN. To install run the following code: