Albert-Laszlo Barabasi’s Linked: A Summary, Part II

The first major concept in chapter ten is innovation and innovators. These are the people, usually a small group, willing to take risks with brand new ideas and products (127). We then learn that innovations spread from innovators to hubs, which then send the information out along their numerous links.

Barabasi defines hubs, in the context of social media and human networks, as the statistically rare, highly connected individuals who keep social networks together (129). He restates the definition: often referred to as “opinion leaders”, “power users”, or “influencers” in marketing terms, human hubs are people who communicate with more people about a given product than the average person does (129). Because of their numerous contacts, they are among the first to notice and use the experience of innovators (130).

In the second major concept of chapter ten, Barabasi asks why some inventions, rumors, and viruses take over the globe, while others spread only partially or disappear (131). This is where the reader is introduced to the threshold model, in which a threshold is assigned to each individual quantifying the likelihood that he or she will adopt an innovation (131).
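As a rough sketch of how such a threshold model plays out (the network, names, and threshold values below are invented for illustration, not taken from the book): each person adopts the innovation once the fraction of adopting neighbors reaches his or her threshold.

```python
def run_cascade(neighbors, thresholds, seeds):
    """Spread an innovation: a person adopts once the fraction of
    adopting neighbors reaches that person's threshold."""
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for person, friends in neighbors.items():
            if person in adopted or not friends:
                continue
            adopting_fraction = sum(f in adopted for f in friends) / len(friends)
            if adopting_fraction >= thresholds[person]:
                adopted.add(person)
                changed = True
    return adopted

# A toy chain of acquaintances; A seeds the innovation.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
low = {"A": 0.0, "B": 0.5, "C": 0.5, "D": 0.5}   # easy to convince
high = {"A": 0.0, "B": 0.9, "C": 0.9, "D": 0.9}  # hard to convince

print(sorted(run_cascade(neighbors, low, ["A"])))   # ['A', 'B', 'C', 'D']
print(sorted(run_cascade(neighbors, high, ["A"])))  # ['A'] -- it stalls
```

The same innovation, seeded the same way, either sweeps the whole network or dies immediately depending on the individual thresholds it meets.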

We then learn about the spreading rate, which is the likelihood that an innovation will be adopted by a person introduced to it. But, the spreading rate by itself is not enough to decide the fate of an innovation, Barabasi tells us. For that, the critical threshold has to be known.

Critical threshold – a quantity determined by the properties of the network in which the innovation spreads. If the spreading rate of the innovation is less than the critical threshold, it will die out shortly. If it is over the threshold, the number of people adopting it will increase until everybody who could use it does (131).

Barabasi writes that for decades nobody questioned the spreading rate and critical threshold paradigm, but that “recently” (circa 2002) it had been learned that some viruses and innovations were oblivious to it. This is because in scale-free networks, with their uneven topology like that of the Internet, the critical threshold disappears, making viruses traveling on them “practically unstoppable” (135). An infected hub will pass the virus to all the other computers linked to it, and in scale-free networks those infected computers have a good chance of surviving the virus while still passing it on. Barabasi writes that these results are not limited to computer viruses (135).
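The standard mean-field estimate from the epidemic-modeling literature (not a formula Barabasi spells out in the book) puts the critical threshold at the average degree divided by the average squared degree. A quick sketch shows why hubs make the threshold vanish:

```python
def critical_threshold(degrees):
    """Mean-field estimate of the epidemic threshold: <k> / <k^2>.
    Hubs inflate <k^2>, driving the threshold toward zero."""
    n = len(degrees)
    k = sum(degrees) / n
    k2 = sum(d * d for d in degrees) / n
    return k / k2

# Homogeneous network: everyone has about 6 links.
uniform = [6] * 1000

# Scale-free-like degree sequence: a few huge hubs, many tiny nodes.
scale_free = [max(2, 1000 // rank) for rank in range(1, 1001)]

print(critical_threshold(uniform))     # 1/6, about 0.167
print(critical_threshold(scale_free))  # far smaller -- hubs erase the bar
```

With the same number of nodes, the hub-dominated degree sequence yields a threshold dozens of times lower, so even feeble viruses spread.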

We move on to learn about Paul Baran who, in 1964 while employed at RAND Corporation, suggested three possible architectures for the Internet – centralized, decentralized, and distributed – when tasked to develop a communication system that would survive a nuclear attack:

    • Centralized – Topology is star-like; along with the decentralized network dominated the structure of communications systems of the time.
    • Decentralized – Topology is sets of stars connected to form one large star; along with the centralized network dominated the structure of communications systems of the time.
    • Distributed – Topology is mesh-like; redundant so if some nodes fail other paths maintain the connection between the remaining nodes.

According to Baran, the only topology that could survive a nuclear attack was the distributed model. However, Barabasi states that Baran’s distributed network could only have worked and become a reality “if the Internet had continued to be regulated and maintained by the military.” (147)

Next, we are introduced to Internet mapping and given brief introductions to a couple of men and an organization who pioneered the field. Bill Cheswick and Hal Burch produced a map called the millennium map depicting the Internet’s topology on January 1, 2000. The Cooperative Association for Internet Data Analysis (CAIDA) monitors everything about the Internet, from “traffic to topology”.

Barabasi writes that there are “important practical reasons” for needing a global Internet map. He asserts that without knowing the Internet’s topology, it is not possible to design better tools and services to use it. He also asserts that the people who designed the Internet’s basic structure, still in place today, never imagined today’s uses of it, such as email or the World Wide Web (149).

According to Barabasi, the World Wide Web is an example of a success disaster, and he claims that had its original creators foreseen how it would be used, they would have designed a different infrastructure, resulting in a smoother experience (149).

Success disaster – When the design of a new function escapes into the real world and multiplies at an unforeseen rate before the design is fully in place.

We go on to learn that even though it was designed by humans, the Internet has “all the characteristics of a complex evolving system, making it more similar to a cell than a computer chip.” (149)

Chapter twelve opens by telling us of the bold claims made by Internet search engines in the early days of the Web. The leading search engines of the day boasted that they covered the entire Web. But, a research paper, titled Searching the World Wide Web (PDF), published in the journal Science in April 1998 “undermined” the search engines’ claims (162).

Lee Giles and Steve Lawrence built a meta-search robot that visited search engines and asked them to fetch documents containing the word “crystal”. HotBot returned the largest number of documents, but even it covered only 34 percent of the Web. AltaVista turned out to cover only 28 percent, and Lycos a mere 2 percent (163). Lawrence and Giles’ study awakened the public to the search engines’ embellished claims, but it also revealed that in 1999 a huge swath of the Web went unseen – a full 60 percent (164). However, Barabasi writes that the topology of the Web limits our ability to see everything on it anyway, because the Web is made up of four continents.

The first continent is called the central core. It contains about 25 percent of all Webpages and is home to all major sites. It’s easy to navigate because there is a path between any two documents housed on it (167). The second continent is called IN. It is as large as the central core but not as easy to navigate. The central core can be reached from it, but there is no way back to IN from there.

The third continent is called OUT. It, too, is as large as the central core but, like IN, is harder to navigate. It can be reached from outside with no problem, but once you get in you cannot get out. The fourth continent comprises tendrils and disconnected islands. These are isolated groups of pages, linked to each other, that cannot be reached from the central core and do not have links coming back to it. This continent holds about 25 percent of all Web documents as well, and some of the isolated groups can contain thousands of documents (168).
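Given link directions, these continents can be recovered mechanically. A sketch on a hypothetical six-page web, assuming we already know one page that sits in the central core:

```python
def reachable(links, start):
    """All pages reachable from `start` by following links forward."""
    seen, stack = {start}, [start]
    while stack:
        page = stack.pop()
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def continents(links, core_page):
    """Split pages into core, IN, OUT, and tendrils/islands."""
    pages = set(links) | {p for ps in links.values() for p in ps}
    reverse = {p: [] for p in pages}
    for src, dsts in links.items():
        for dst in dsts:
            reverse[dst].append(src)
    downstream = reachable(links, core_page)  # core_page can reach these
    upstream = reachable(reverse, core_page)  # these can reach core_page
    core = downstream & upstream              # mutually reachable
    out = downstream - core                   # reachable, no way back
    in_ = upstream - core                     # feeds in, unreachable from core
    islands = pages - core - out - in_
    return core, in_, out, islands

# Toy web: a <-> b form the core; i feeds in; o only receives; x, y are an island.
links = {"i": ["a"], "a": ["b", "o"], "b": ["a"], "x": ["y"], "y": ["x"]}
core, in_, out, islands = continents(links, "a")
print(core, in_, out, islands)
```

Starting from any page in IN you can surf into the core and onward to OUT, but never back, which is why a crawler’s starting point limits what it can see.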

The major concept of chapter thirteen is the network of the human cell. Barabasi teaches us that there are three network types – all scale-free, of course – in a human cell and two gene functions (182-183).

Three Types of Human Cell Networks:

    • Metabolic – A web of hundreds of multistep intracellular biochemical reactions. Nodes can be simple chemicals or more complex molecules made of dozens of atoms. Links are the biochemical reactions that take place between these molecules.
    • Regulatory – Controls everything within a cell, from metabolism to cell death. Nodes are the genes and proteins encoded by the DNA molecule. Links are the biochemical reactions between these components.
    • Cellular – A sum of all cellular components such as genes, proteins, and other molecules connected by all physiologically relevant interactions, ranging from biochemical reactions to physical links. Contains all metabolic, protein-protein, and protein-DNA interactions present in the cell.

Barabasi explains, on page 183, that genes play two roles in the cellular network: structural and functional. In their structural role, they determine the scope and make of proteins and pass that information on to future generations (heredity). A gene’s structural role can be “unearthed” from its sequence. The functional role of genes, according to Barabasi, is apparent only in the dynamic context in which a gene interacts with many other components of a cell.

In chapter fourteen, the most important thing we learn is that the networks behind all twentieth century corporations have the same structure: a tree. Barabasi writes that at the root is the CEO and the branches are the lower-level managers and workers, and that despite its pervasiveness there are many problems with the corporate tree (201). First, he says, information must be carefully filtered as it rises in the hierarchy; if it is not filtered well, the overload by the time it reaches the top of the tree can be huge.

Secondly, he maintains that integration leads to unexpected organizational rigidity and uses Ford’s car factories as a practical example, as it was one of the first manufacturing plants to fully implement the hierarchical organization. According to the author, Ford’s assembly lines became so tightly integrated and optimized that even small changes in the design of a car required shutting down the factory for weeks or months. Optimization, Barabasi writes, leads to Byzantine monoliths, which are organizations that are so over-organized that they are inflexible and unable to respond to changes in the business environment (201).

Barabasi sums up the tree model as being best suited for mass production, which he says was the way of economic success up until recently (2002), but that now economic value was found in ideas and information, a paradigm change dubbed the information economy (201). He then tells us that the most visible element of this remaking is a shift from a tree to a web or network organization, flat and with lots of cross-links between the nodes; and that companies wanting to compete in a fast-moving marketplace are shifting from a static and optimized tree into a dynamic and evolving web, offering a more flexible command structure (202).

Barabasi ends Linked in chapter fifteen by telling us one important thing: network thinking is poised to invade all domains of human activity and most fields of human inquiry. He tells us that it is not just a helpful point of view or tool, but that networks are by their very nature the fabric of most complex systems, and that nodes and links deeply infuse all strategies aimed at approaching our interlocked universe (222).


Albert-Laszlo Barabasi’s Linked: A Summary, Part I

Some call it the circle of life and others call it cosmic consciousness. No matter what anybody calls it, we all acknowledge the same universal truth, that we are all connected, even if we do not know exactly how.  In his book Linked, author Albert-Laszlo Barabasi explores the how.

Barabasi states that his book has a simple aim: to get us to think networks – they are present everywhere and we just need an eye for them he says (7). Linked is supposed to help us develop this eye, but be forewarned: this book is not an easy read. Barabasi is a physicist, and despite trying hard to make this book user-friendly, the subject matter just doesn’t lend itself to the task.

The first chapter, the Introduction, opens with Barabasi linking a twenty-first century teenage computer hacker named MafiaBoy to the first century Apostle Paul, and asserting that both were masters of the network. From his bedroom, the teen hacker orchestrated a distributed denial-of-service (DDoS) attack that crashed the websites of some of the biggest names in e-commerce back in the year 2000. Nearly 2,000 years before what was then the biggest denial-of-service attack on the Internet, a reformed persecutor of Christians had a conversion experience and afterward walked nearly 10,000 miles, over twelve years, spreading the message and faith of a man whom he’d never met.

From this reach of an opening, we jump to the book’s first scientific concept – reductionism. Barabasi writes that reductionism tells us that to comprehend nature, we first must decipher its components, assuming that once we understand the parts it will be easy to understand the whole (6). He then tells us a secret: we’ve been doing it wrong.

Barabasi believes that reductionism, which according to him was the driving force behind much of the twentieth century’s scientific research, is the wrong approach and has resulted in us taking apart the universe and having no idea how to put it back together. He goes on to say that after spending trillions of research dollars to disassemble nature in the last century, we are just now acknowledging that we have no clue how to continue except to take it apart further (6) because putting it back together turned out to be harder than scientists thought it would be.

So, why would it be harder to reassemble the universe than take it apart? Barabasi answers that question with one word – complexity. He writes that nature is not a well-designed puzzle with only one way to put it back together and that in complex systems the components can fit in so many different ways that it would take us billions of years to try all the combinations (6). Well, then, how did nature do it? According to the author, nature exploits the laws of self-organization whose roots, he writes, are still largely a mystery.

In chapter two, Barabasi introduces the reader to the random universe. He tells us rather more than we need to know about Leonhard Euler, who in 1736 introduced the idea of graphs and unintentionally created a branch of mathematics known as graph theory, which today is the basis for our thinking about networks.

We then learn about Paul Erdos and Alfred Renyi, who together in 1959 introduced the random network theory model. Random network theory says that nodes in a network connect to each other randomly and, according to Barabasi, has dominated scientific thinking about networks since being introduced in 1959 (23).

Chapter three of Linked introduces us to something we have all probably heard of, just not like this: the six degrees of separation. Although it would not be given its catchy title until more than sixty years later, the concept was first published in 1929 by Hungarian writer Frigyes Karinthy in his short story “Lancszemek” (PDF) or, in English, “Chains” (26). A character in the story bets his companions that they can name any person on earth and that, through at most five acquaintances, one of whom he knows personally, he can link himself to the chosen person – and he does.

In 1967, Harvard professor Stanley Milgram rediscovered Karinthy’s concept and turned it into a much celebrated and groundbreaking study of our interconnectivity (27). The actual term “six degrees of separation” was not coined until John Guare’s 1991 stage play of the same title (29).

Chapter four opens by introducing us to Mark Granovetter and his paper The Strength of Weak Ties (PDF), which Barabasi writes is one of the most influential, and most cited, sociology papers ever written (42). In the paper, Granovetter proposed that when it comes to finding a job, getting news, launching a restaurant, or spreading the latest fad, our weak social ties are more important than our strong friendships (42).

The thinking behind this is that the people we are already close to are of limited help to our job search because they move in the same circles we do and have access to pretty much the same information that we have access to. It is by accessing our weak ties, the people with whom we are only acquainted, that we gain access to new information and/or opportunities that we didn’t have before. Our weak ties do not move in the same circles we move in, so they will have access to different information that may be of more help to us in the job search.

After learning about Granovetter’s weak ties, we are introduced to Duncan Watts. Watts was working on his PhD in applied mathematics when he was asked to investigate how crickets synchronize their chirping. While doing so, he kept coming up with more and more questions about how the crickets’ network (there are different types of networks) affected their synchronized chirping, and he approached his advisor Steven Strogatz for assistance (46).

The result was that Watts and Strogatz introduced a quantity called the clustering coefficient, which is obtained by dividing the number of actual links among a node’s neighbors by the number of possible links among them. A number close to 1.0 means that nearly all of a node’s neighbors are also linked to one another (47). This is now known as the Watts-Strogatz model. Barabasi uses the clustering coefficient to introduce readers to the Erdos number.
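In code, the clustering coefficient Watts and Strogatz describe looks like this (the friendship graph is a made-up example):

```python
def clustering(node, neighbors):
    """Actual links among a node's neighbors divided by possible links."""
    nbrs = neighbors[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    possible = k * (k - 1) / 2
    actual = sum(
        1
        for i, a in enumerate(nbrs)
        for b in nbrs[i + 1:]
        if b in neighbors[a]
    )
    return actual / possible

# A's three friends: B and C know each other, D knows neither.
friends = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A"],
}
print(clustering("A", friends))  # 1 actual link / 3 possible = 0.333...
```

If D befriended both B and C, A’s coefficient would rise to 1.0, meaning A’s circle of friends is fully interlinked.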

Paul Erdos published over 1,500 papers with 507 coauthors. According to Barabasi, it is an “unparalleled honor” to be counted among his hundreds of coauthors, and short of that it is a great distinction to be only two links from him. Barabasi writes that to keep track of their distance from Erdos, mathematicians introduced the Erdos number. Erdos has Erdos number zero, coauthors one, those who wrote a paper with an Erdos coauthor two, and so on (47).
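Computing Erdos numbers is just a breadth-first search outward from Erdos over the coauthorship graph; each coauthorship link adds one to the distance. A sketch with invented names:

```python
from collections import deque

def erdos_numbers(coauthors, root="Erdos"):
    """Breadth-first search from Erdos: each hop is one coauthorship
    link, so the distance found is exactly the Erdos number."""
    number = {root: 0}
    queue = deque([root])
    while queue:
        author = queue.popleft()
        for coauthor in coauthors.get(author, []):
            if coauthor not in number:
                number[coauthor] = number[author] + 1
                queue.append(coauthor)
    return number

# A hypothetical coauthorship web (the names are illustrative).
coauthors = {
    "Erdos": ["Alice", "Bob"],
    "Alice": ["Erdos", "Carol"],
    "Bob": ["Erdos"],
    "Carol": ["Alice", "Dan"],
    "Dan": ["Carol"],
}
print(erdos_numbers(coauthors))
# {'Erdos': 0, 'Alice': 1, 'Bob': 1, 'Carol': 2, 'Dan': 3}
```

Anyone absent from the returned dictionary has no chain of coauthors leading back to Erdos and so, by convention, has an infinite Erdos number.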

Barabasi writes that most mathematicians turn out to have rather small Erdos numbers, being typically two to five steps from Erdos, although his influence reaches well beyond his immediate field (48). He continues, writing that Erdos numbers demonstrate how the scientific community forms a highly interconnected network, and the smallness of most Erdos numbers indicates that “this web of science” truly is a small world and is a small-scale example of our social network (48).

Having familiarized us with how to measure clustering using the clustering coefficient, and shown how it could be adapted for real world use with the Erdos number, Barabasi next introduces us to the concept of clustering, itself. He begins by telling us that Watts and Strogatz’s most important discovery is that clustering does not stop at the boundary of social networks (50).

Due to the work of Watts and Strogatz, Barabasi writes that we now know that clustering is present on the Web; we have spotted it in the physical lines that connect computers on the Internet; economists have detected it in the network describing how companies are linked by joint ownership; ecologists see it in food webs that quantify how species feed on each other in ecosystems; and cell biologists have learned that it characterizes the fragile network of molecules packed within a cell (51).

From the ubiquity of clusters, Barabasi moves us on to the concept of connectors. He quotes Malcolm Gladwell’s book The Tipping Point: “Sprinkled among every walk of life… are a handful of people with a truly extraordinary knack of making friends and acquaintances. They are connectors.” (55) In scientific terms, connectors are nodes with an “anomalously large” number of links (56), and according to the author, a random universe of the kind Erdos and Renyi believed in does not support them (62).

Barabasi then introduces us to hubs. He writes that the architecture of the World Wide Web is dominated by a few highly connected nodes, or hubs (ex: Yahoo!), that are extremely visible (58). He goes on to say that the discovery that on the Web a few hubs grab most of the links initiated a frantic search for hubs in many areas, with startling results: Hollywood, the Web, and society are not unique. Hubs surface in the cell, and exist on the molecular level, among many other places.

In chapter six, we are introduced to power laws. Barabasi writes that most quantities in nature follow a bell curve, but on occasion nature generates quantities that follow a power law distribution instead of a bell curve distribution (67). The distinguishing feature of a power law is not only that there are many small events but that they coexist with a few very large ones (67). We are then introduced to the concept of scale-free networks, which are networks with power law degree distributions (70).

Power law distributions predict that each scale-free network will have several large hubs that define the network’s topology; that is, how the network is organized and laid out (68). Barabasi writes that with the realization that most complex networks in nature have a power law degree distribution, the term “scale-free networks” was quickly adopted by most academic disciplines that dealt with complex webs (70).
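A quick, seeded sampling experiment (my illustration, not the book’s) shows the contrast between the two distributions: a bell curve’s largest value sits a few standard deviations from the mean, while a power law routinely produces outliers orders of magnitude above the typical value.

```python
import random

rng = random.Random(42)
n = 100_000

# Bell curve: values cluster tightly around the mean of 10.
bell = [rng.gauss(10, 2) for _ in range(n)]

# Power law with exponent 3 (inverse-transform sampling, minimum value 2):
# many small values coexisting with a few enormous ones.
power = [2 * (1 - rng.random()) ** -0.5 for _ in range(n)]

print(max(bell))   # a few standard deviations above the mean
print(max(power))  # orders of magnitude above the typical value of ~4
```

In network terms, the bell curve describes a world where every node has roughly the same number of links; the power law describes one where hubs exist.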

Chapter seven begins with Barabasi telling us that the Erdos-Renyi and Watts-Strogatz network models assumed that there were a fixed number of nodes that remained unchanged for the life of the network, thus making that network static (83). He goes on to explain that real networks, as opposed to simulated ones, are not static and that growth should be factored into network models (83).

A growing network starts from a tiny core and nodes are added one after another. But, the links that connect to the nodes are not all equivalent to each other. Barabasi writes that there are clear winners and losers, with the oldest nodes having the advantage due to having had the most time to collect links (83).

He goes on to explain that the Webpages we prefer to link to are not ordinary nodes, but hubs – the better known they are the more links point to them, and the more links they attract the easier it is to find them on the Web, and the more familiar we are with them. Thus, Barabasi introduces the reader to the concept of preferential attachment. When deciding which Webpages to link to, chances are that we will link to the ones we know and/or the most well-connected pages (85).

Barabasi states that preferential attachment helps the more connected (popular) nodes (Webpages) get a disproportionately large number of links, attracting new links at a rate proportional to the number of links they already have (88). New Webpages are at a disadvantage because they have not existed long enough to attract links pointing to them. The chapter ends by asking how the newer Webpages, which he calls “latecomers”, gain popularity in a system where the more established and popular sites flourish and continue to grow. Restated, the question is how do newer nodes become connected in a system where the winner takes all?
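The growth-plus-preferential-attachment recipe is easy to sketch (a simplified take on the model Barabasi describes; the attachment pool below is one common implementation shortcut, not the book’s code):

```python
import random

def preferential_attachment(n, m=2, seed=7):
    """Grow a network one node at a time; each newcomer links to m
    existing nodes picked with probability proportional to degree."""
    rng = random.Random(seed)
    adj = {node: set() for node in range(n)}
    pool = list(range(m))  # each appearance = one unit of attachment odds
    for new in range(m, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(pool))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            pool.extend([new, t])  # degree-proportional bookkeeping
    return adj

adj = preferential_attachment(500)
degree = {node: len(links) for node, links in adj.items()}
oldest = sum(degree[node] for node in range(5)) / 5
newest = sum(degree[node] for node in range(495, 500)) / 5
print(oldest, newest)  # the earliest nodes have grown into hubs
```

Running it shows the first-mover advantage directly: the earliest nodes accumulate many times the links of the latecomers, who are stuck near the minimum of m.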

In chapter eight, Barabasi addresses the process that separates the winners from the losers: competition in complex systems (95). He introduces us to the fitness model – in a competitive environment each node has a certain fitness. To explain the concept of fitness he gives a practical example: fitness is your ability to make friends relative to everybody else in your neighborhood.

Fitness is a quantitative measure of a node’s ability to stay in front of the competition (95). Nodes with higher fitness are linked to more often, and independent of when a node joins the network, a fit node will soon leave behind all nodes with smaller fitness (97). Barabasi writes that all networks fall into two fitness categories: fit-get-rich or winner-takes-all.

In the fit-get-rich model, the fittest node will grow to become the biggest hub, but its lead will never be significant as it will be followed closely by a smaller node that has almost as many links. In the winner-takes-all model, the fittest node grabs all of the links, leaving little, if any, for the rest of the nodes. Barabasi writes that when the winner takes all, there is no room for a potential challenger (103).
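A minimal sketch of the fitness idea (the degree-plus-one smoothing and all numbers here are my assumptions, not the book’s exact model): newcomers attach with probability proportional to degree times fitness, so a fit latecomer can overtake older but less fit nodes.

```python
import random

def fitness_network(n, fitness, m=2, seed=11):
    """Preferential attachment weighted by fitness: newcomers link to
    existing nodes with probability proportional to (degree + 1) * fitness.
    The +1 keeps freshly arrived, zero-degree nodes attachable."""
    rng = random.Random(seed)
    adj = {node: set() for node in range(n)}
    for new in range(m, n):
        existing = list(range(new))
        weights = [(len(adj[node]) + 1) * fitness[node] for node in existing]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(existing, weights)[0])
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj

# Node 100 arrives late but with ten times everyone else's fitness.
fitness = [1.0] * 400
fitness[100] = 10.0
adj = fitness_network(400, fitness)
print(len(adj[100]), len(adj[50]))  # the fit latecomer outgrows its elder
```

With equal fitness this reduces to plain preferential attachment; give one node a large fitness edge and it leaves behind earlier arrivals, which is the fit-get-rich behavior in miniature.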

In chapter nine, Barabasi seeks to answer the question: How long will it take a network to break into pieces once we randomly remove nodes? Restated, how many routers must be removed from the Internet to break it into isolated computers that cannot communicate with each other? (112)

Barabasi and his team of graduate students conducted computer simulations on scale-free networks and found that a significant fraction of nodes can be randomly removed from any scale-free network (i.e. World Wide Web, cells, social networking)  without breaking it apart (113). These results indicate that scale-free networks’ resilience to errors is an inherent (built-in) property of their topology. This resilience is termed topological robustness.

Next, Barabasi’s team set out to find the source of scale-free networks’ topological robustness. Further simulations and experimentation revealed that topological robustness is rooted in the “structural unevenness” of scale-free networks because failures disproportionately affect small nodes since there are so many more of them. However, despite their higher numbers, small nodes contribute little to a network’s integrity (114).

Barabasi’s team discovered that scale-free networks break down only after all nodes have been removed, which he says for all practical purposes is never (115). In layman’s terms, this means that for the World Wide Web to fail completely at random, every single Internet router in the world would have to malfunction at the same time. The odds of that happening? Never. Unless, of course, you live in the make-believe world of the NBC television show Revolution. But, I digress.

Barabasi’s team next turned to a new set of experiments. No longer selecting nodes randomly, the team set out to find out what would happen if the Internet was attacked in a systematic and coordinated way. They set up computer simulations that directly targeted not the nodes, but the hubs (116).

The simulations took out the largest hubs first, one after the other. It was soon evident that while the remaining hubs compensated for the first, and largest, hubs’ demise and kept the Internet running, they were no longer able to do so after several hubs were taken out. Barabasi’s team observed that, in their simulation, large chunks of nodes were falling off the network, becoming disconnected from the main cluster.

As the team pushed forward with the simulation, they observed the network’s “spectacular” collapse (116). The team also observed that the “critical point” (where a scale-free network started breaking), which was absent under random failures, was present during targeted attacks, and that removal of only a few hubs during attack broke the Internet into tiny, hopelessly isolated pieces.
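Both experiments are easy to reproduce in miniature (a sketch of the team’s method, with a small preferential-attachment network standing in for the Internet): compare the largest surviving cluster after random removals versus hub-targeted removals.

```python
import random
from collections import deque

def grow_scale_free(n, m=2, seed=3):
    """Preferential-attachment growth, so the network develops hubs."""
    rng = random.Random(seed)
    adj = {node: set() for node in range(n)}
    pool = list(range(m))
    for new in range(m, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(pool))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            pool.extend([new, t])
    return adj

def giant_component(adj, removed):
    """Size of the largest connected cluster among surviving nodes."""
    alive = set(adj) - removed
    best, seen = 0, set()
    for start in alive:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr in alive and nbr not in comp:
                    comp.add(nbr)
                    queue.append(nbr)
        seen |= comp
        best = max(best, len(comp))
    return best

adj = grow_scale_free(1000)
rng = random.Random(0)
hubs = sorted(adj, key=lambda node: len(adj[node]), reverse=True)[:30]
randoms = rng.sample(sorted(adj), 30)
print(giant_component(adj, set(randoms)))  # random failures: barely dented
print(giant_component(adj, set(hubs)))     # targeted attack: falls apart faster
```

Removing thirty random nodes leaves the giant cluster nearly intact, while removing the thirty biggest hubs strips away a visibly larger chunk of the network, the asymmetry the simulations revealed.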

Barabasi’s team had learned that vulnerability to attack is an inherent property of scale-free networks, the proverbial “cost” to be paid in exchange for the resilience they display (117). Following one logical conclusion to the next, through their simulations and experimentation, the team had discovered scale-free networks’ Achilles’ heel, thus the aptly applied title of chapter nine.