Python Data Analysis Cookbook

Visualizing network graphs with hive plots

A hive plot is a visualization technique for network graphs. In a hive plot, we group nodes by some property and place them on radial axes, and we draw edges as curved lines between those axes. NetworkX is one of the best-known Python network graph libraries; however, as of July 2015, it doesn't support hive plots. Luckily, several libraries specialize in hive plots. We will also use an API to partition the graph of Facebook users available from the Stanford Network Analysis Project (SNAP). SNAP has a Python API of its own; unfortunately, that API doesn't support Python 3 yet.
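To make the layout idea concrete, here is a minimal sketch of how nodes could be placed on radial axes. It is not how any particular hive plot library works internally; the hive_layout function and its inner_radius and spacing parameters are hypothetical names introduced only for illustration.

```python
import math


def hive_layout(groups, inner_radius=1.0, spacing=1.0):
    """Place each node on the radial axis of its group.

    groups maps a group label to a list of nodes.
    Returns a dict mapping node -> (x, y).
    """
    positions = {}
    n_axes = len(groups)
    for axis_index, label in enumerate(sorted(groups)):
        # Spread the axes evenly around the circle.
        angle = 2 * math.pi * axis_index / n_axes
        for rank, node in enumerate(groups[label]):
            # Later nodes in a group sit further from the center.
            radius = inner_radius + rank * spacing
            positions[node] = (radius * math.cos(angle),
                               radius * math.sin(angle))
    return positions


# Hypothetical toy grouping: three groups, hence three axes.
groups = {0: ['a', 'b'], 1: ['c'], 2: ['d', 'e', 'f']}
positions = hive_layout(groups)
```

With three groups, the axes end up 120 degrees apart. An edge between two nodes would then be drawn as a curve between their positions on neighboring axes.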

Getting ready

I have NetworkX 1.9.1 via Anaconda. Besides NetworkX, this recipe needs the community package (imported as community in the code) and the hiveplot package hosted at https://github.com/ericmjl/hiveplot (retrieved July 2015). Install hiveplot as follows:

$ [sudo] pip install hiveplot

I wrote the code with hiveplot 0.1.7.4.
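If you want to compare your environment with the versions mentioned here, a small check along the following lines may help. This is only a sketch: it uses the standard library importlib.metadata module (Python 3.8 or later), and the distribution names in the loop are assumptions. In particular, I assume the community package is distributed as python-louvain; adjust the names if your setup differs.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# The distribution names below are assumptions; adapt them as needed.
for dist_name in ('networkx', 'python-louvain', 'hiveplot'):
    print(dist_name, installed_version(dist_name) or 'not installed')
```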

How to do it...

  1. The imports are as follows:
    import networkx as nx
    import community
    import matplotlib.pyplot as plt
    from hiveplot import HivePlot
    from collections import defaultdict
    from dautil import plotting
    from dautil import data
  2. Load the data and create a NetworkX Graph object:
    fb_file = data.SPANFB().load()
    G = nx.read_edgelist(fb_file,
                         create_using=nx.Graph(),
                         nodetype=int)
    print(nx.info(G))
  3. Partition the graph and create a nodes dictionary as follows:
    parts = community.best_partition(G)
    nodes = defaultdict(list)
    
    for n, d in parts.items():
        nodes[d].append(n)
  4. The graph is pretty big, so we will just create three groups of edges:
    edges = defaultdict(list)
    
    # Collect the edges touching the nodes of the first three groups
    # and tag each edge with its group index.
    for group in range(3):
        for u, v in nx.edges(G, nodes[group]):
            edges[group].append((u, v, group))
  5. Plotting will take about six minutes:
    %matplotlib inline
    cmap = plotting.sample_hex_cmap(name='hot', ncolors=len(nodes))
    h = HivePlot(nodes, edges, cmap, cmap)
    h.draw()
    plt.title('Facebook Network Hive Plot')

After the waiting period, we get the following plot:

The code is in the hive_plot.ipynb file in this book's code bundle.
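Since plotting the full graph takes several minutes, it can help to try the grouping logic of steps 3 and 4 on a tiny graph first. The following sketch uses only the standard library. The parts dictionary and edge_list are made-up toy data standing in for the output of community.best_partition() and the edges of G, and the inner loop only approximates nx.edges(G, nodes[group]) by collecting every edge that touches a group.

```python
from collections import defaultdict

# Hypothetical toy partition: node -> community label.
parts = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
# Hypothetical toy edges of an undirected graph.
edge_list = [(1, 2), (1, 3), (3, 4), (4, 5), (2, 5)]

# Step 3: invert the partition into community -> list of nodes.
nodes = defaultdict(list)

for n, d in parts.items():
    nodes[d].append(n)

# Step 4: collect the edges touching each community and tag
# every edge with the index of that community.
edges = defaultdict(list)

for u, v in edge_list:
    for group in sorted({parts[u], parts[v]}):
        edges[group].append((u, v, group))

print(dict(nodes))
print(dict(edges))
```

An edge that connects two communities shows up in both groups, once per community it touches. This is one reason why the edge lists grow quickly on the full Facebook graph.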