This review was first posted at ScraperWiki.
I am on a bit of a graph theory binge, it started with an attempt to learn about Gephi, the graph visualisation software, which developed into reading a proper grown up book on graph theory. I then learnt a little more about practicalities on reading Seven Databases in Seven Weeks, which included a section on Neo4J – a graph database. Now I move on to Linked by Albert-László Barabási, this is a popular account of the rise of the analysis of complex networks in the late nineties. A short subtitle used on earlier editions was “The New Science of Networks”. The rather lengthy subtitle on this edition is “How Everything Is Connected to Everything Else and What It Means for Business, Science, and Everyday Life”.
In mathematical terms a graph is an abstract collection of nodes linked by edges. My social network is a graph comprised of people, the nodes, and their interactions such as friendships, which are the edges. The internet is a graph, comprising routers at the nodes and the links between them are edges. “Network” is a less formal term often used synonymously with graph, “complex” is more a matter of taste but it implies large and with a structure which cannot be trivially described i.e. each node has four edges is not a complex network.
The models used for the complex networks discussed in this book are the descendants of the random networks first constructed by Erdős and Rényi. They imagined a simple scheme whereby nodes in a network were randomly connected with some fixed probability. This generates a particular type of random network which do not replicate real-world networks such as social networks or the internet. The innovations introduced by Barabási and others are in the measurement of real world networks and new methods of construction which produce small-world and scale-free network models. Small-world networks are characterised by clusters of tightly interconnected nodes with a few links between those clusters, they describe social networks. Scale-free networks contain nodes with any number of connections but where nodes with larger numbers of connections are less common than those with a small number. For example on the web there are many web pages (nodes) with a few links (edges) but there exist some web pages with thousands and thousands of links, and all values in between.
I’ve long been aware of Barabási’s work, dating back to my time as an academic where I worked in the area of soft condensed matter. The study of complex networks was becoming a thing at the time, and of all the areas of physics soft condensed matter was closest to it. Barabási’s work was one of the sparks that set the area going. The connection with physics is around so-called power laws which are found in a wide range of physical systems. The networks that Barabási is so interested in show power law behaviour in the number of connections a node has. This has implications for a wide range of properties of the system such as robustness to the removal of nodes, transport properties and so forth. The book starts with some historical vignettes on the origins of graph theory, with Euler and the bridges of Königsberg
problem. It then goes on to discuss various complex networks with some coverage of the origins of their study and the work that Barabási has done in the area. As such it is a pretty personal review. Barabási also recounts some of the history of “six degrees of separation
”, the idea that everyone is linked to everyone else by only six links. This idea had its traceable origins back in the early years of the 20th century in Budapest.
Graph theory has been around for a long while, and the study of random networks for 50 years or so. Why the sudden surge in interest? It boils down to a couple of factors, the first is the internet which provides a complex network of physical connections on which a further complex network of connections sit in the form of the web. The graph structure of this infrastructure is relatively easy to explore using automatic tools, you can build a map of millions of nodes with relative ease compared to networks in the “real” world. Furthermore, this complex network infrastructure and the rise of automated experiments has improved our ability to explore and disseminate information on physical networks. For example, the network of chemical interactions in a cell, the network of actors in movies, our social interactions, the spread of disease and so forth. In the past getting such detailed information on large networks was tiresome and the distribution mechanisms for such data slow and inconvenient.
For a book written a few short years ago, Linked can feel strangely dated. It discusses Apple’s failure in the handheld computing market with the Newton palm top device, and the success of Palm with their subsequent range. Names of long forgotten internet companies float by, although even at the time of writing Google was beginning its dominance.
If you are new to graph theory and want an unchallenging introduction then Linked is a good place to start. It’s readable and has a whole load of interesting examples of scale free networks in the wild. Whilst not the whole of graph theory, this is where interesting new things are happening.