In the era of big data and hyperconnectivity, learning high-dimensional structures such as graphs from data has become a prominent task in machine learning and has found applications in many fields such as finance, health care, and networks. 'spectralGraphTopology' is an open source, documented, and well-tested R package for learning graphs from data. It provides implementations of state of the art algorithms such as Combinatorial Graph Laplacian Learning (CGL), Spectral Graph Learning (SGL), Graph Estimation based on Majorization-Minimization (GLE-MM), and Graph Estimation based on Alternating Direction Method of Multipliers (GLE-ADMM). In addition, graph learning has been widely employed for clustering, where specific algorithms are available in the literature. To this end, we provide an implementation of the Constrained Laplacian Rank (CLR) algorithm.
spectralGraphTopology provides estimators to learn k-component, bipartite, and k-component bipartite graphs from data by imposing spectral constraints on the eigenvalues and eigenvectors of the Laplacian and adjacency matrices. Those estimators leverages spectral properties of the graphical models as a prior information, which turn out to play key roles in unsupervised machine learning tasks such as community detection.
From inside an R session, type:
Alternatively, you can install the development version from GitHub:
$ git clone https://github.com/dppalomar/spectralGraphTopology.git
$ cd spectralGraphTopology
$ make build && make install
On MS Windows environments, make sure to install the most recent version of Rtools
.
We illustrate the usage of the package with simulated data, as follows:
library(spectralGraphTopology)library(clusterSim)library(igraph)set.seed(42) # generate graph and datan <- 50 # number of nodes per clustertwomoon <- clusterSim::shapes.two.moon(n) # generate datapointsk <- 2 # number of components # estimate underlying graphS <- crossprod(t(twomoon$data))graph <- learn_k_component_graph(S, k = k, beta = .5, verbose = FALSE, abstol = 1e-3) # plot# build networknet <- igraph::graph_from_adjacency_matrix(graph$Adjacency, mode = "undirected", weighted = TRUE)# colorify nodes and edgescolors <- c("#706FD3", "#FF5252")V(net)$cluster <- twomoon$clustersE(net)$color <- apply(as.data.frame(get.edgelist(net)), 1, function(x) ifelse(V(net)$cluster[x[1]] == V(net)$cluster[x[2]], colors[V(net)$cluster[x[1]]], '#000000'))V(net)$color <- colors[twomoon$clusters]# plot nodesplot(net, layout = twomoon$data, vertex.label = NA, vertex.size = 3)
For more examples, check out our gallery.
We welcome all sorts of contributions. Please feel free to open an issue to report a bug or discuss a feature request.
If you made use of this software please consider citing:
In case you made use of the function cluster_k_component_graph
, consider citing:
Package: GitHub
README file: GitHub-readme