How simulating social networks revealed why I have no friends, and also no free time

The friendship paradox is an observed social phenomenon that most people have fewer friends than their friends have, on average. Sometimes it is stated more strongly that most people have fewer friends than *most* of their friends. It’s not clear from the popular articles about the topic whether the latter statement is generally true. Let’s investigate!

I would have thought this would be difficult to determine, and perhaps this is why it took until 1991 for someone to discover it. In Scott Feld’s original paper, Why your friends have more friends than you do, he suggested that this might be a source of feelings of inadequacy. But, I mean, it’s not like people keep tallies of each of their friend’s friends, let alone a list of their own friends, do they? This was probably true in 1991, but now in the age of Facebook we can do this with ease — and probably a lot of people do. 

How many faces are in my book

In fact, why not, I’ll have a go. I have 374 friends on Facebook. I’m not sure why. Anyway, let’s take 10 random friends and count the number of friends they have. Here’s the tally:

Friend 1 - 522
Friend 2 - 451 
Friend 3 - 735
Friend 4 - 397
Friend 5 - 2074
Friend 6 - 534
Friend 7 - 3607
Friend 8 - 237
Friend 9 - 1171
Friend 10 - 690

The average number of friends these friends have is 1042. So I have fewer friends than my friends have, on average. Also, of these 10, I have fewer friends than all but one of them — poor friend number 8. 

Okay, but this is not a paradox — especially if you know me. How about we look at someone else in my network, say Friend 2. Friend 2 has 451 friends, and because Facebook is such a secure website that naturally preserves the privacy of its users, we can click on profiles of the friends of Friend 2 to find out how many friends they have. The numbers of friends that 10 random friends of Friend 2 have are: 790, 928, 383, 73, 827, 1633, 202, 457, 860, and 121. On average, Friend 2’s friends have 627 friends, which is more than 451. Also, six of these friends have more than 451 friends and so Friend 2 has fewer friends than most of their friends as well. If we repeated this exercise for all of my friends, we would still find that most of them have fewer friends than their friends even though most already have more friends than me!

You probably have a gut feeling that this is paradoxical. My intuition for why it feels paradoxical comes from the following analogy. Consider your height. Perhaps you are shorter than average. Maybe you have some friends that are also shorter than average. But, you expect to find at least half the people out that are actually taller than average. That’s kind of the intuition of “average” after all. Why isn’t the same true for friends, then? That is, even though myself and Friend 2 have fewer friends than our friends, surely those people out there with lots of friends have more friends than their friends. This is true, but these popular people are extremely rare, and that is the difference between friend numbers and height, for example. 

In a typical social network there are lots of people with few friends and few people with lots of friends. The very popular people are counted in many people’s friend circles. And, the unpopular people are not counted in many friend circles. Take my Facebook friendship as an example again. I have, apparently, few Facebook friends. So, it would seem that many people have more friends than me. But — and here is the most important point — those people are unlikely to count me and my low number of friends in their friend circles. This is evidenced by the fact that Friend 2 and I only have 2 friends in common. So all of those friends of Friend 2, when they go to count the number of friends their own friends have, aren’t going to have me and my low friend count skewing the distribution in their favor. 

Is this why the “karate kid” was such a loner?

Let’s look at some smaller examples where we can count everyone and their friends in the network. The most famous social network (at least to network theorists) is Zachary’s karate club. It looks like this.

Zachary’s karate club. Node colors guide the eye and are marked based on the number of friends each person has.

(At this point I’d like to quickly mention that this post is duplicated in a Jupyter notebook which you can play with on Google Colab.)

This shows 34 people in a karate club (circa 1970) and who they interacted with outside class. Colors simply guide the eye to how many friends each member has. You can see one person only has 1 friend and another has a whopping 17! Already this graph is too big to count things by hand, but a computer can step through each node in the network and count the friends and the friends of friends. In Zachary’s karate club, then, we have,

The fraction of people with fewer friends than their friends have on average is 85.29%.

The fraction of people with fewer friends than most of their friends is 70.59%.
Histogram of friends and friends of friends in Zachary’s karate club. The data is collected by looking at the friends of each friend for every person in the network. The distribution of those friends can be summarized by the mean and median. For each friend of each person, those numbers are tabulated and plotted in the lower row.

The above histograms show the data in detail. You can see the key feature again of a few people with lots of friends. Obviously those few people are going to have more friends than their friends have. But that’s the point — there are only a few such people! Once we look at the full distribution of “friends of friends” we find it flattens out significantly, and it is more likely to find a large number of friends. The lower two histograms show the mean and median number of friends of each person in the network. Note that in either case, each person has friends with roughly 9 friends. This much more than the vast majority of people in the network! Indeed, only 4 people out of 34 have more than 9 friends. 

Go big or go… actually it doesn’t seem to matter how big we go

Now let’s go back to Facebook. This time we’ll use data collected from ten individuals, anonymized and made public. It includes roughly 4000 people connected to these 10 individuals in various social circles. The social network looks like this.

The Facebook Ego Graph, showing community structure and clustering around popular individuals.

We can do the counting of friends and friends of friends for this network as well. The details look like this.

Histogram of friends and friends of friends in the Facebook Ego Graph. The data is collected by looking at the friends of each friend for every person in the network. The distribution of those friends can be summarized by the mean and median. For each friend of each person, those numbers are tabulated and plotted in the lower row.

In summary, we find,

The fraction of people with fewer friends than their friends have on average is 87.47%.

The fraction of people with fewer friends than most of their friends is 71.92%.

Very similar! But perhaps this is just a coincidence. It’s only two data points after all. How do we test the idea for any social network? Simulation!

Fake it ’til you make it

Simulation is a tool to understand something by way of studying scale models of it. A model airplane wing in a wind tunnel is a classic example. Today, many simulations are done entirely on computers. In the context of social networks there are many models, but out of sheer laziness we will choose the so-called Barabási–Albert model because it is already implement in NetworkX, the computer package I’m using. If we create a mock social network of 34 people (same number as the karate club), it will look something like this.

An instance of the Barabási–Albert model on 34 nodes. It is constructed node by node, connecting each new node to two previous nodes, with a preference for highly connected nodes. This is a example social network.

It doesn’t look all that much different from the karate club, does it? The numerical data are similar as well.

Histogram of friends and friends of friends in the example Barabási–Albert model graph above. The data is collected by looking at the friends of each friend for every person in the network. The distribution of those friends can be summarized by the mean and median. For each friend of each person, those numbers are tabulated and plotted in the lower row.

In this example, the numbers of interest are,

The fraction of people with fewer friends than their friends have on average is 79.41%.

The fraction of people with fewer friends than most of their friends is 70.59%.

This is nice, but the real beauty of simulation is the ability to rapidly test multiple examples. The above is just one mock social network. To get full confidence in our conclusion (relative to the assumptions of the model of course), we need to perform many simulations over randomly generated social networks.

Simulate all the graphs!

If we repeat the above exercise over 10,000 random social networks with 34 individuals, we find,

The fraction of people with fewer friends than their friends have on average is 79.31%.

The fraction of people with fewer friends than most of their friends is 65.31%.

So now we can say with some confidence that the friend paradox persists in any social network — at least one that has the same characteristics as the Barabási–Albert model with 34 people. Another thing we can easily do though is change the number of people in the network to see if the trend continues for large networks. Indeed, things get much worse if we increase the number of people. As the number of people in the social network increases, the fraction of people with fewer friends than their friends (in majority or on average) also increases.

The friendship paradox demonstrated over 100 randomly chosen social networks created with the Barabási–Albert model, for increasing network size.

Cool, cool. Anything less depressing to tell us?

It’s been shown that this paradox goes beyond friends as well. You are also probably deficient, when comparing yourself to your friends, in income, Twitter followers, and how happy you are — and this fact is probably not helping. But, okay, enough of this crap — we all feel terrible now! Surely there is something positive we can glean from all this, right? Yes! 

Since your friends have more contacts than you, they will probably contract a virus — or anything else transmitted through communities — before you. Indeed, researchers showed that instead of tracking randomly chosen people to gauge the spread of a disease, it was much more efficient to ask those random people to name a friend, and track that friend instead! In the study, the group of friends got sick an average 2 weeks before the originally chosen people. There are probably lots of other applications out there waiting to be found for the friendship paradox. 

Before we end, though, I want to ask one last question: does the friendship paradox have to happen in any social network? The answer to this is yes and no. For the statement of the paradox that uses averages the answer is yes and this can be proven mathematically. That is, the statement on average, people have less than or equal to the average number of friends of their friends is true for any social network you can conceive of. For the statement which uses majority (most of your friends have strictly more friends than you), the answer is no. The key ingredient is the existence — or non-existence — of popular individuals. We can even create completely random social networks for which the paradox does not hold. Consider the following network, again over 34 people.

This network (coming from the Erdős–Rényi model) is constructed by considering every person is a friend of every other person with some fixed probability (in this case 75%).

For this social network we have,

The fraction of people with fewer friends than their friends have on average is 44.12%.

The fraction of people with fewer friends than most of their friends is 44.12%.

You can see from the detailed distribution of friends the difference more clearly. The number of friends is evenly spread around the average. This is similar to looking at the height of people. So, roughly half the people have fewer friends than their friends, another rough half has more friends than their friends, and the rest have exactly the same number as the average of their friends. 

Histogram of friends and friends of friends in the example Erdős–Rényi model graph above. The data is collected by looking at the friends of each friend for every person in the network. The distribution of those friends can be summarized by the mean and median. For each friend of each person, those numbers are tabulated and plotted in the lower row.

Conclusion

Of course, it should be somewhat obvious that if everyone had exactly the same number of friends, then they would have the same number of friends as their friends! But this is also true when the distribution of friends is more evenly spread. What does this suggest? I guess it lends credence to claims of favorability of egalitarian communities. But it seems to be the case that more hierarchical networks (be they in friendships, corporations, Twitter followers, etc.) grow naturally as a way to accommodate the complexity arising from maintaining large numbers of connections. But that is the topic for another blog post. In the meantime,

Please don’t share this. Instead, tell a friend to share it.