We Need to Let Go of the Bell Curve
Most human activities as well as many disciplines — from physics and biology to linguistics, finance, and computer science — follow a Pareto distribution instead of a “normal” Gaussian curve. In Pareto distributions, a small change in one variable is associated with a large change in another, because it reflects variables multiplied with each other rather than added to each other, as in the normal distribution. This is also referred to as a “power law.” This isn’t an obscure intellectual point, but instead carries serious practical consequences. Because of this error, our approach to most problems is, at best, suboptimal. What does this mean for business leaders? The author presents three practical implications for innovation, risk management, and people.
I recently had a conversation with one of our senior managers about our company’s new banking division; he told me that only 21% of our cardholders account for 80% of spending. That skewed situation worried him greatly, and he wondered what we could do to spread our lending book more evenly. I’ve had similar conversations with the fundraising manager of a nonprofit I chair: The bulk of the funding comes from some 20 donors, which she tells me is unsustainable. The way she sees it, the organization is heading toward a cliff. Both of these reactions reveal a common cognitive error that has profound implications for leadership.
Like the banking manager and the fundraiser, most of us view the world as largely Gaussian, which means we believe most things are distributed, or should be distributed, according to bell curves. In this world, most cardholders and donors, for example, would spend or contribute close to the average, and the remaining people would fan out symmetrically on each side of that average amount of money. The mean, median, and mode would all coincide; half the people would fall below the average, and the other half above. In this world, variables are independent and do not influence each other.
Why do we think that way? First, our brains are hardwired to find fairness rewarding and uplifting and are averse to inequality. A Gaussian world, with most people clustered around a stable average, feels fair and predictable. We also find symmetry particularly pleasing, whether in faces, art, or statistics. Moreover, most of our schooling is still based on “normal” distributions and Newtonian thinking, which breaks down reality into independent variables and cause and effect. This view of the world has permeated multiple disciplines, from medicine to statistics and management. Finally, there are indeed phenomena that follow Gaussian distributions. Take test scores, for example. The variables measured (the test scores) are the outcome of additive processes (the sum of the scores on each question).
Even though I learned about a multitude of other statistical distributions when I studied statistics and probability theory, I too held the intuitive view that most things follow a bell-shaped distribution.
Yet they don’t. Let me tell you why, and why it matters greatly.
About 10 years ago, after reading about cognitive biases, I was surprised to find out that most human activities, as well as many disciplines — from physics and biology to linguistics, finance, and computer science — follow a Pareto distribution instead of a “normal” Gaussian curve.
In Pareto distributions (named after economist Vilfredo Pareto, who in the early 20th century observed that 20% of people in Italy owned 80% of the land), a small change in one variable is associated with a large change in another, because the outcome reflects variables multiplied with each other rather than added to each other, as in the normal distribution. This is also referred to as a “power law.” Instead of a symmetric bell curve, the distribution of observations or outcomes looks like a hockey stick with a long tail, as shown in the figure below. There are many observations of low values, and a small number of high values, or outliers.
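To make the additive-versus-multiplicative distinction concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a model of any real dataset: the function names, the 50 steps, and the random factors between 0.5 and 1.5 are all hypothetical choices. Strictly speaking, multiplying independent random factors yields a log-normal distribution, which is heavily skewed but not an exact power law; it is used here only to show how multiplication produces a long tail where addition produces a bell curve.

```python
import math
import random
import statistics


def simulate(n_people=10_000, n_steps=50, seed=0):
    """Give every simulated person the same kind of random factors,
    then combine them two ways: by adding and by multiplying.
    All parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    additive, multiplicative = [], []
    for _ in range(n_people):
        factors = [rng.uniform(0.5, 1.5) for _ in range(n_steps)]
        additive.append(sum(factors))              # bell-curve-like outcome
        multiplicative.append(math.prod(factors))  # long-tailed outcome
    return additive, multiplicative


def top_share(values, fraction=0.2):
    """Share of the total held by the top `fraction` of observations."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


if __name__ == "__main__":
    additive, multiplicative = simulate()
    for name, data in (("additive", additive), ("multiplicative", multiplicative)):
        print(
            f"{name:>14}: mean={statistics.mean(data):.3g} "
            f"median={statistics.median(data):.3g} "
            f"top 20% share={top_share(data):.0%}"
        )
```

In the additive case the mean and median sit close together and the top 20% hold only slightly more than 20% of the total. In the multiplicative case the mean is pulled far above the median and a small group holds most of the total.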
Once you start looking, you’ll see that pattern almost everywhere, almost all the time. The frequency of words we use when we speak, the magnitude of earthquakes and hurricanes, the size of companies and cities, book sales, and the pattern of countries winning Olympic medals all follow power laws. Social media is no exception: A U.S. study showed that the most active 25% of Twitter users accounted for 97% of tweets. In our short-term insurance business, Discovery Insure, the worst 30% of drivers account for 60% of serious accidents. Covid-19 also spreads in a Pareto fashion: In two Indian states, 60% of new infections were found to be caused by fewer than 10% of people carrying the disease (a few “super-spreaders”), whereas 71% did not infect anyone at all. (This transmission pattern has been observed in other countries as well.)
Why is the Pareto distribution the actual norm, instead of the “normal” Gaussian distribution? Perhaps even more importantly, why is it becoming even more so? Most things follow power laws because this is how interconnected complex systems behave. And power laws are becoming ever more ubiquitous because our world operates in increasingly interconnected complex systems. The more interconnected the complex systems, the more pronounced the power law.
Economies, supply chains, trade, and markets have become more intertwined and global. Information technology and transport have exponentially deepened the interconnection of the multiple systems we’re part of. In these networks, variables are not additive but instead influence each other, creating dynamic, reinforcing, and cascading processes that are nonlinear, multiplicative, and far less predictable. In fact, these systems are capable of “black swan” behavior, because in most Pareto distributions (unlike in Gaussian ones), the variance — which measures how dispersed the data points are around the mean in a distribution — is not well defined.
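For readers who want the formula behind the remark about variance, the textbook Pareto (Type I) distribution gives a compact statement. The symbols are introduced here for illustration and do not appear in the article: x_m is the smallest possible value, alpha is the shape parameter, and S(p) is the share of the total held by the top fraction p.

```latex
\[
\mathbb{E}[X] = \frac{\alpha\, x_m}{\alpha - 1} \quad (\alpha > 1),
\qquad
\operatorname{Var}(X) = \frac{\alpha\, x_m^{2}}{(\alpha - 1)^{2}(\alpha - 2)} \quad (\alpha > 2).
\]
\[
S(p) = p^{\,1 - 1/\alpha} \quad (\alpha > 1),
\qquad
S(0.2) = 0.8 \;\Longrightarrow\; \alpha = \log_{4} 5 \approx 1.16 .
\]
```

When alpha is 2 or less, the variance is infinite. An exact 80/20 split corresponds to an alpha of roughly 1.16, well inside that range, which is one way to see why averages and standard deviations describe such systems so poorly.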
Besides being pervasive, these power laws are also remarkably stubborn. Regardless of what we do, a small number of data points — people, decisions, or other observations — still account for most of the results. Political systems designed to produce greater income equality, for instance, struggle to shift Pareto out of the way. Take China, which, despite its greater focus on income equality than most other countries, has a higher Gini coefficient than Germany and the United Kingdom. Such distributions also repeat themselves like Russian dolls, as we see in our health insurance business: The sickest 20% of people generate 79% of health care costs, and the same skewed distribution can be found within that 20% group (with the sickest 20% within that group responsible for nearly 60% of health care costs). If you keep drilling down into the numbers, you’ll keep finding that a relatively small number of people account for most of the costs.
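The “Russian doll” pattern described above can be checked on any list of per-person costs with a few lines of Python. The sketch below is only an illustration: the function names are hypothetical, and the data are synthetic draws from a Pareto distribution with an assumed shape of 1.16 (the value that corresponds to an exact 80/20 split), not Discovery’s figures. An exact Pareto distribution would show the same share at every level; real data, like the 79% and nearly 60% cited above, are only approximately self-similar.

```python
import random


def top_share(values, fraction=0.2):
    """Return the top `fraction` of values and their share of the total."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    top = ordered[:k]
    return top, sum(top) / sum(ordered)


def nested_top_shares(values, fraction=0.2, depth=3):
    """Repeatedly drill into the top group and record its share each time."""
    shares = []
    group = list(values)
    for _ in range(depth):
        group, share = top_share(group, fraction)
        shares.append(share)
    return shares


def synthetic_costs(n=100_000, alpha=1.16, seed=1):
    """Synthetic, made-up cost data; alpha is an assumed shape parameter."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]


if __name__ == "__main__":
    for level, share in enumerate(nested_top_shares(synthetic_costs()), start=1):
        print(f"Level {level}: top 20% account for {share:.0%} of that group's costs")
```

Because the tail is so heavy, the shares in any single synthetic sample fluctuate noticeably around the theoretical value, which is itself a small demonstration of how unstable averages are in a Pareto world.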
This disconnect between our Gaussian perception and the Pareto reality is not an obscure intellectual point, but instead carries serious practical consequences. Because of this error, our approach to most problems is, at best, suboptimal. Malcolm Gladwell, for example, has written about how the typical solutions meant to address homelessness — shelters and soup kitchens — have been ineffective because they’re based on the mistaken assumption that the majority of homeless people follow the average: average number of days without a roof, average cost per person to the public purse, or average reasons for being homeless. Yet on all these dimensions, homelessness follows a power law, too. In the words of Nobel laureate physicist Philip Anderson, we need to free ourselves from “average” thinking, or focusing on the mean, which, in most cases, is misleading. The joke that when Bill Gates walks into a bar, everyone in that bar becomes a millionaire on average, illustrates the point. Outliers and tails are dismissed as aberrations, when in fact they have the most impact — good and bad. A small viral event, for example, snowballs into a global coronavirus pandemic and economic disaster.
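The Bill Gates joke above is easy to verify with a few lines of arithmetic. The numbers in this sketch are invented for illustration (50 patrons worth $100,000 each and a newcomer worth $100 billion), and the function name is hypothetical.

```python
import statistics


def bar_averages(net_worths):
    """Return (mean, median) of the net worths of everyone in the bar."""
    return statistics.mean(net_worths), statistics.median(net_worths)


# Hypothetical figures, in dollars.
patrons = [100_000] * 50
billionaire = 100_000_000_000

if __name__ == "__main__":
    mean_before, median_before = bar_averages(patrons)
    mean_after, median_after = bar_averages(patrons + [billionaire])
    print(f"Before: mean=${mean_before:,.0f}  median=${median_before:,.0f}")
    print(f"After:  mean=${mean_after:,.0f}  median=${median_after:,.0f}")
```

With these made-up figures, the mean jumps from $100,000 to about $1.96 billion, while the median stays at $100,000. Nobody’s circumstances changed except the average, which is why the mean says so little about a typical member of a long-tailed population.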
The realization that we live in a largely Pareto world — inherently unfair, asymmetric, and unpredictable — may feel unpleasant at first. The upside, however, is that systemic change in such a world is much easier and faster. In a Gaussian world, all elements within a system must shift for the entire construction to change, which is laborious, time consuming, and often impossible. In a Pareto world, on the other hand, a change in the tail shifts the entire system — for better or worse.
What does this all mean for business leaders? Here are three practical implications for innovation, risk management, and people.
Focus on bold decisions in the tail, rather than incremental change.
In a Pareto world, seemingly intractable problems become solvable through a positive change in the tail. This is how some cities like Denver have been able to make inroads on homelessness. They designed specific interventions focused on the chronically homeless, who account for most of the social services and health care costs and are the hardest cases to solve, but who make up a tiny fraction of the entire homeless population.
In a Pareto system, one individual or one decision can make an enormous difference. The vision and leadership of one entrepreneur like Steve Jobs, for example, can end up shaping an entire industry. To stretch this illustration further, imagine what impact Apple would have had if Jobs had started the company in Johannesburg instead of Silicon Valley; the company’s gross income today is equivalent to roughly half of South Africa’s GDP.
Similarly, the kind of radical innovation that can transform entire companies and industries happens in the tail. This is why the creation of the Vitality program was a game changer for Discovery: One outlier decision had, and continues to have, an enormous impact on our entire business. A few years after we launched our insurance business, an initial conversation with a chain of gyms about cross-selling sparked an entirely different idea: What if we created a program that rewarded people for doing healthy things? What if members who were part of this program could go to the gym for free? I still remember vividly the 10 minutes it took for that idea to take shape. The decision to implement it profoundly transformed our business. It laid the foundation for a new insurance model based on behavioral economics and shared value — a business model in which it makes good sense for Discovery, its customers, its suppliers, and its local community to give Vitality members the knowledge, tools, and incentives to live healthier lives.
A lot of smaller-tail decisions have contributed to turning the initial idea into what it is today. But Discovery’s success can be traced back to that initial tail decision, which remains the core of our identity and growth and has had an enormous impact through creating multiplicative shared-value benefits for members and for shareholders. It was the root of a system that connected behavior change with risk and reward.
I’m not saying that incremental improvements aren’t important; they are. I’m saying that radical innovation that can result in systemic and profound change starts with bold decisions in the tail, and that is therefore where leaders should focus their time and attention.
Embrace rather than fight the power law and identify tail problems early.
The shift of perspective toward a Pareto world also has implications for how we deal with risk and uncertainty. We spend enormous time and energy trying to “correct” nonlinear phenomena — like asymmetric fundraising and bankcard lending — that we perceive as abnormal and risky. There may be moral and fairness considerations to account for, but since power law distributions are the rule rather than the exception, and since they’re remarkably stubborn, they call for different solutions to deal with risk and a focus on the dynamics in the tail.
Just as innovation in the tail can shift entire systems for the better, a negative tail decision can bring down an entire system — a scenario that keeps me awake at night. So how do we deal with a Pareto world’s uncertainty and chaos? How do we make decisions? How do we avoid bad tail decisions, or correct them quickly before they become catastrophic?
One promising avenue is to combine extreme outcomes and the plausibility of various future scenarios — economist George Shackle’s Potential Surprise Theory — to help grapple with decisions in extreme uncertainty. Using traditional probabilities is problematic because they rely on predefined and mutually exclusive outcomes that are supposed to cover all possible scenarios — which, of course, no one can predict. A decision maker using Potential Surprise Theory, on the other hand, chooses among various possible scenarios based on a combination of the degree of disbelief or implausibility of possible outcomes and the expected potential gains and losses associated with each. This approach, unlike traditional probabilities, leaves room for surprises and new possibilities. So, think about plausibility and consequences, rather than probability. Think in a much broader way about the bad things that could potentially happen.
We indeed cannot predict or prevent black swan (i.e., bad tail) events. But learning to recognize and contain them early can avoid a disaster. When Covid-19 first broke out in South Korea, for instance, authorities’ early and decisive test-trace-isolate reaction contained the spread of the disease far more effectively than the responses in countries like the United States or Brazil. Similarly, we can learn to recognize and reverse dangerous tail decisions. Imagine what could have been avoided if Lehman Brothers, for example, had identified as a tail decision its expansion of the subprime mortgage and mortgage-backed derivatives business to the point that a small decline in real estate value could wipe out its capital.
Create an A+ team to leverage A+ players’ impact.
People’s performance is still often measured using a Gaussian curve. In reality, a small number of outperformers consistently account for most of the impact. The implication is twofold.
First, recruit and retain the best possible people across the board. The stars in the tail will still account for the largest impact on results — remember that power laws are stubborn — but this doesn’t mean that all recruitment and retention efforts should focus on them. Consistently attracting and retaining exceptional talent across the entire organization will lift the entire talent curve, with profound impacts on results. So, although a small percentage of people still account for a relatively large proportion of results, having better talent across the board can lift results in absolute terms.
Second, focus on improving team environments and dynamics at all levels. When the complex system that is a company gets more and better interconnected, the multiplicative impact of the star performers in the tail becomes amplified throughout the entire system.
. . .
Recalibrating our perspective from Gaussian to Pareto might sound arcane, but the practical implications are profound. Shifting the lens through which we understand the world impacts how we approach systemic change, how we make decisions and handle risk, and how we lead. And because our lives are made up of intricate and complex webs of human connections, from families and professional networks to the communities we live in, our entire lives follow power laws: A few key decisions — from whom we marry, to what careers we choose — end up having an enormous impact on our future. I believe our thinking follows the same power law. By correcting a handful of cognitive errors, starting with this one, we can radically transform our performance, our impact, and our entire life.