Beautiful Moments in Information Architecture II
« May 2007 | Main | July 2007 »
My network code is evolving. Today, I was able to look at the phase change phenomenon in a randomly linked network.
To build a random network choose two nodes at random and connect them together. As one continues to add links to the system, some nodes by chance end up with more links than others. The number of nodes directly linked (the nearest neighbors) to a chosen node is called the “degree”. The distribution of degree for a 1,000 node system is shown for various numbers of total links in the system in my post from a couple of days ago.
As more links are added to the system, more and more nodes are connected together. Some nodes get connected to nodes that already have links, making larger groups of nodes linked together. As one adds links to the system, new groups of 2 form as well as groups becoming linked together to form larger groups.
In the system simulated here, I have 1,000 nodes. If I add 500 random links to the system, this would be enough to link every node once. But because links are added randomly, some nodes are highly linked, and others aren’t linked at all. One might expect the groups to get bigger gradually until there are so many links in the system, that nearly all the nodes are in one big group.
Instead, something else interesting happens as we add enough links in the system to theoretically connect every node (500 in my simulation).
If we look at the groups that form, the system goes through an abrupt transition from the unlinked phase to the linked phase when the (links in system)/(nodes in system) ~ ½. See Figure 1 below. The largest group rapidly grows from containing a tiny fraction of the nodes to containing nearly all of them. This is the phase transition in random networks depicted in Watts’ book Six Degrees on pg 45 (hardback edition). The value Average Links/Node = 0.5 is known as the critical point. This was explained in 1959 by Erdos and Renyi (without the benefit of a straight-forward simulation like the one here).
Figure 1. Phase Transition Diagram
What is the distribution of group size at Average Links/Node = 0.5? Figure 2 shows a histogram the group sizes for one simulation run.
Figure 2. Distribution of group sizes at the Critical Point. The largest groups are too
far off to the right to show on the histogram. The largest group in the system
at the critical point has 118 nodes, or about 12% of the nodes.
I had a good trail run with Hal (from HalsHoP) today. We did a slightly adjusted version of Steven Bragg's #21 Golden Gate Canyon Loop (from his guidebook Run the Rockies). Because we were coming from Boulder, we approached via Hwy 72 and started from the Buffalo Trail Head. This means we started with a steep descent before the climb up to the pass near Windy Peak. We did the loop in about 90 minutes.
www.flickr.com
|
For the last week or so I have been revising some code I wrote a couple of years ago when I first read Barabasi's excellent book, Linked. The refactored code is a better object-oriented design and my use of Python has improved. I will make the code available here soon.
To get to some results quickly, I started with a simple case: a network in which the nodes are randomly linked. For a network (graph) where the nodes are linked (edges) randomly, some basic results are fairly easy to get to and straight forward reproduce.
The node rank (or degree) is the number of links a given node has to other nodes. If one starts with 100s of nodes, but only a few links, it is easy to see that the node rank of most of the nodes will be zero. As the number of links in the network increases, the rank distribution evolves. What is the distribution of node rank? How does it depend on the ratio of links to nodes?
Below are three examples cases from the network simulations. The first figure shows the distribution of rank for the case when there are half as many links as nodes (l = .5 n), the second figure shows the distributino when the number of links equals the number nodes (l = n), and the third figure shows the case when links are about 2.5 times the number of nodes (l = 2.5 n). The distributions are said to be Poisson which the results here resemble--will have to test for that later.
Figure 1. What I labeled "Node Rank" is also called "Degree" in the literature and is the number of nodes one link away from a given node.
Figure 2
This trail is a lot of fun. I have hiked this trail many times and usually start near the hospital, crossing over the ridge and making a loop up the south west side, then back down the valley trail. The views are great and the trail challenging and fun. There are a lot of birds along the creek in the valley, so take your binoculars if you are interested.
This is trail #11 in Steven Bragg's Run the Rockies. Today I walked the trail in about 90 minutes.
| www.flickr.com |