A Tidy API for Graph Manipulation

A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, and provides tidy interfaces to many common graph algorithms.



This package provides a tidy API for graph/network manipulation. While network data itself is not tidy, it can be envisioned as two tidy tables, one for node data and one for edge data. tidygraph provides a way to switch between the two tables and provides dplyr verbs for manipulating them. Furthermore, it provides access to many graph algorithms with return values that facilitate their use in a tidy workflow.

An example

library(tidygraph)
 
play_erdos_renyi(10, 0.5) %>% 
  activate(nodes) %>% 
  mutate(degree = centrality_degree()) %>% 
  activate(edges) %>% 
  mutate(centrality = centrality_edge_betweenness()) %>% 
  arrange(centrality)
#> #
#> # A directed simple graph with 1 component
#> #
#> # Edge Data: 37 x 3 (active)
#>    from    to centrality
#>   <int> <int>      <dbl>
#> 1    10     3   1.500000
#> 2     5     6   1.500000
#> 3     2     7   1.500000
#> 4    10     9   1.500000
#> 5     8     7   1.833333
#> 6     5     8   1.833333
#> # ... with 31 more rows
#> #
#> # Node Data: 10 x 1
#>   degree
#>    <dbl>
#> 1      5
#> 2      3
#> 3      4
#> # ... with 7 more rows

Overview

tidygraph is a huge package that exports 280 different functions and methods. It more or less wraps the full functionality of igraph in a tidy API, giving you access to almost all of the dplyr verbs plus a few more developed for use with relational data.

More verbs

tidygraph adds some extra verbs for specific use in network analysis and manipulation. activate() defines whether one is manipulating node or edge data at the moment, as shown in the example above. bind_edges(), bind_nodes(), and bind_graphs() let you expand the graph structure you're working with, while graph_join() lets you merge two graphs on some node identifier. reroute(), on the other hand, lets you change the terminal nodes of the edges in the graph.
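As a rough illustration of these verbs, the sketch below builds a tiny directed graph and applies each of them once. The node names and the graphs g and h are made up for the example, and the edge tables use integer node indices in from and to.

```r
library(tidygraph)

# A small directed graph: a -> b -> c (names are illustrative only)
g <- tbl_graph(
  nodes = data.frame(name = c("a", "b", "c"), stringsAsFactors = FALSE),
  edges = data.frame(from = c(1L, 2L), to = c(2L, 3L))
)

# bind_nodes() appends a row to the node table,
# bind_edges() appends a row to the edge table (here c -> d)
expanded <- g %>%
  bind_nodes(data.frame(name = "d", stringsAsFactors = FALSE)) %>%
  bind_edges(data.frame(from = 3L, to = 4L))

# graph_join() merges two graphs, matching nodes on the shared "name" column
h <- tbl_graph(
  nodes = data.frame(name = c("c", "e"), stringsAsFactors = FALSE),
  edges = data.frame(from = 1L, to = 2L)
)
joined <- graph_join(g, h, by = "name")

# reroute() rewrites edge terminals and needs the edges to be active;
# swapping from and to reverses every edge
reversed <- g %>%
  activate(edges) %>%
  reroute(from = to, to = from)
```

Printing expanded, joined, or reversed shows both the node and the edge table, which is an easy way to check what each verb changed.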

More algorithms

tidygraph wraps almost all of igraph's graph algorithms and provides a consistent interface and output that always matches the sequence of nodes and edges. All tidygraph algorithm wrappers are intended for use inside verbs where they know the context they are being called in. In the example above it is not necessary to supply the graph nor the node/edge ids to centrality_degree() and centrality_edge_betweenness() as they are aware of that already. This leads to much clearer code and less typing.
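A minimal sketch of this context awareness, using a deterministic star graph instead of a random one: neither algorithm call is given the graph or any ids, and each returns one value per row of the currently active table. The variable names are illustrative.

```r
library(tidygraph)

# An undirected star with 5 nodes; node 1 is the centre
star <- create_star(5)

scored <- star %>%
  activate(nodes) %>%
  mutate(degree = centrality_degree()) %>%               # one value per node
  activate(edges) %>%
  mutate(betweenness = centrality_edge_betweenness())    # one value per edge
```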

More maps

tidygraph goes beyond dplyr and also implements graph-centric versions of the purrr map functions. You can now call a function on the nodes in the order of a breadth-first or depth-first search while getting access to the results of the previous calls.
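The following sketch, modelled on the pattern in the package documentation for map_bfs(), accumulates a node value along the search path from the root. The column names value and value_acc are illustrative; path is the table of nodes leading to the current node that tidygraph passes to the mapped function.

```r
library(tidygraph)

# A directed binary tree with 7 nodes; every node carries the value 1
tree <- create_tree(7, children = 2, directed = TRUE) %>%
  mutate(value = 1) %>%
  mutate(value_acc = map_bfs_dbl(node_is_root(), .f = function(node, path, ...) {
    # Sum the values of the current node and of all nodes on the path to it
    sum(.N()$value[c(node, path$node)])
  }))
```

With a constant value of 1, value_acc ends up as the depth of each node plus one, which makes the result easy to check by eye.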

More morphs

tidygraph lets you temporarily change the representation of your graph, do some manipulation of the node and edge data, and then change back to the original graph with the changes being merged in automatically. This is powered by the new morph()/unmorph() verbs that let you e.g. contract nodes, work on the linegraph representation, split communities into separate graphs, etc. If you wish to continue with the morphed version, the crystallise() verb lets you freeze the temporary representation into a proper tbl_graph.
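A small sketch of the morph workflow, assuming a graph made of two disconnected pieces: the graph is split into its components, each node is annotated with the size of its own component, and the annotation is merged back with unmorph(). The names sized, pieces, and comp_size are illustrative.

```r
library(tidygraph)

# Two disconnected pieces: a 3-node ring and a 2-node path
g <- bind_graphs(create_ring(3), create_path(2))

# Work on each component separately, then merge the new column back
sized <- g %>%
  activate(nodes) %>%
  morph(to_components) %>%
  mutate(comp_size = graph_order()) %>%
  unmorph()

# Keep the morphed representation instead: one row per component
pieces <- g %>%
  morph(to_components) %>%
  crystallise()
```

Here sized is still the original five-node graph with an extra node column, while pieces holds the two component graphs as separate objects.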

More data structure support

While tidygraph is powered by igraph underneath, it wants everyone to join the fun. The as_tbl_graph() function can easily convert relational data from all your favourite objects, such as network, phylo, dendrogram, data.tree, graph, etc. More conversions will be added as I become aware of them.
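As a hedged example of conversion, the sketch below turns a plain edge-list data frame and a dendrogram from base R's stats package into tbl_graph objects. The edge list and its weight column are invented for illustration; USArrests ships with R.

```r
library(tidygraph)

# An edge list with from/to columns; extra columns become edge attributes
edge_list <- data.frame(
  from = c("a", "b", "c"),
  to = c("b", "c", "a"),
  weight = c(1, 2, 3),
  stringsAsFactors = FALSE
)
g <- as_tbl_graph(edge_list)

# A dendrogram built with base R, converted to a tree-shaped tbl_graph
dend <- as.dendrogram(hclust(dist(USArrests[1:5, ])))
tree <- as_tbl_graph(dend)
```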

Visualisation

tidygraph itself does not provide any means of visualisation, but it works flawlessly with ggraph. This division makes it easy to develop the visualisation and manipulation code at different speeds depending on where the needs arise.

Installation

tidygraph is available on CRAN and can be installed with install.packages('tidygraph'). For the development version available on GitHub, use the devtools package for installation:

devtools::install_github('thomasp85/tidygraph')

Thanks

tidygraph stands on the shoulders of particularly the igraph and dplyr/tidyverse teams. It would not have happened without them, so thanks so much to them.

News

tidygraph 1.1.2

  • Compatibility with dplyr 0.8

tidygraph 1.1.1

  • Better conversion of network objects. Old conversion could mess up edge attributes.
  • Changes to anticipate new version of tibble and dplyr
  • tibble-like dimming of non-data text in printing
  • Edge-length is now preserved when converting from phylo
  • Added to_subcomponent morpher to work with a single component containing a specified node
  • Morphers that reference nodes now correctly tidy eval the node argument
  • Add node_is_adjacent to query which nodes are directly connected to a set of nodes
  • Add fortify method for tbl_graph object for plotting as regular data with ggplot2

tidygraph 1.1.0

  • Fix bug when coercing to tbl_graph from an adjacency list containing NULL or NA elements.
  • Change license to MIT
  • Add convert verb to perform both morph and crystallise in one go, returning a single tbl_graph
  • When collapsing edges or nodes during morph the original data will be stored in .orig_data instead of .data to avoid conflicts with .data argument in many tidyverse verbs (BREAKING)
  • as_tbl_graph.data.frame now recognises set tables (each column gives each row's membership of that set)
  • Add with_graph to allow computation of algorithms outside of verbs
  • A graph_is_* set of querying functions has been added that all return logical scalars.
  • Add %N>% and %E>% for activating nodes and edges respectively as part of the piping.
  • mutate now lets you reference created columns in graph algorithms so it behaves in line with expected mutate behaviour. This has led to a slight performance decrease (millisecond scale). The old behaviour can be accessed using mutate_as_tbl where the graph will only get updated in the end.
  • When using to_subgraph with edges, isolated nodes are no longer deleted
  • bind_graphs now works with a single tbl_graph
  • Added .register_graph_context to allow the use of tidygraph algorithms in external functions.
  • Added to_unfolded_tree, to_directed, and to_undirected morphers
  • Add the node_rank_* family of algorithms for seriation of nodes
  • Added to_hierarchical_clusters morpher to work with hierarchical representations of community detection algorithms.
  • All group_* algorithms now ensure that the groups are enumerated in descending order based on size, i.e. members of the largest group/community will always have 1, etc.
  • Fix a bug when filtering all nodes or edges where no nodes/edges would be removed (#42)
  • Added interface to netrankr resulting in 19 new centrality scores and a manual mode for composing new centrality scores
  • Added edge_is_[from|to|between|incident]() to help find edges related to certain nodes
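Several of the 1.1.0 additions listed above can be tried together. This is only a sketch on a small star graph with illustrative variable names: with_graph() evaluates an algorithm outside a verb, %N>% and %E>% activate nodes and edges inside the pipe, and convert() performs morph and crystallise in one go.

```r
library(tidygraph)

# An undirected star with 5 nodes; node 1 is the centre
star <- create_star(5)

# graph_is_* functions return a logical scalar; with_graph() supplies the context
connected <- with_graph(star, graph_is_connected())

# %N>% and %E>% switch the active table as part of the pipe
annotated <- star %N>%
  mutate(degree = centrality_degree()) %E>%
  mutate(at_centre = edge_is_incident(1))

# convert() = morph() + crystallise(), returning a single tbl_graph;
# here it keeps only the leaf nodes (degree 1)
leaves <- annotated %N>%
  convert(to_subgraph, degree == 1, subset_by = "nodes")
```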


install.packages("tidygraph")

1.1.2 by Thomas Lin Pedersen, 8 months ago


https://github.com/thomasp85/tidygraph


Report a bug at https://github.com/thomasp85/tidygraph/issues


Browse source code at https://github.com/cran/tidygraph


Authors: Thomas Lin Pedersen [cre, aut]




MIT + file LICENSE license


Imports tibble, dplyr, igraph, magrittr, utils, rlang, R6, Rcpp, tools, stats, tidyr, pillar

Suggests network, data.tree, ape, graph, methods, testthat, covr, seriation, netrankr, influenceR, NetSwan

Linking to Rcpp


Imported by clustree, clustringr, egor, fastnet, ggdag, ggraph, particles, scholar, stminsights.

Suggested by dodgr, rmangal, see, visNetwork.

