Survivorship Bias | Examples & Definition
Survivorship bias is a type of selection bias. It occurs when a data set is skewed because the selection process has excluded certain data points. When the surviving data are treated as the only relevant examples, the analysis produces inaccurate results because relevant data have been left out.
What is survivorship bias?
When selection criteria are used in a study, there is always the risk that the subjects selected (the “survivors”) will not tell the whole story. Because certain subjects have been excluded, the study fails to consider the whole range of possible outcomes or causes.
For example, if successful firms in a sector of the economy are studied, then those firms that fail or struggle will be excluded from the analysis. As a result, the study will not be able to analyze possible causes of failure or struggle.
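To make the effect concrete, here is a minimal sketch in Python. The firm labels and growth figures are invented for illustration and do not come from any real study; the point is only to show how an average can shift once failed firms are dropped from the sample.

```python
from statistics import mean

# Hypothetical five-year revenue growth (%) for firms in one sector.
# A value of -100 means the firm failed and closed. All numbers are invented.
firms = {
    "A": 40,
    "B": 25,
    "C": -100,
    "D": 10,
    "E": -100,
    "F": 35,
}

# Full population: every firm that started, including the failures.
all_growth = list(firms.values())

# "Survivors": only the firms still operating at the end of the period.
survivor_growth = [g for g in all_growth if g > -100]

mean_all = mean(all_growth)
mean_survivors = mean(survivor_growth)

print(f"Mean growth, all firms:      {mean_all:.1f}%")
print(f"Mean growth, survivors only: {mean_survivors:.1f}%")
```

With these made-up figures, the survivors-only average is positive while the average over all firms is negative, which is the kind of gap a study of successful firms alone would never see.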
Is survivorship bias important?
Any bias that undermines the reliability of research or analysis is important. The consequences of examining only part of the data include:
- Overoptimism. If a study only examines successful people or entities, it can easily be overoptimistic about the chances of success.
- Misinterpretation. It is easy to make false assumptions about cause and effect when excluding some data. For example, several well-known tech industry founders dropped out of college, but looking only at them ignores the many dropouts who did not succeed. Their stories do not show that dropping out causes success.
- Incomplete decision-making. Interviewing gold medal-winning athletes might lead you to conclude that “focus” and “visualization” are what lead to success in athletics. However, the runner who finished last in the heats was probably focusing and visualizing success too.
Survivorship bias is not only a problem in academic research; it can also color our judgment in everyday life.
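The athletics example can be sketched with a few lines of Python. The counts below are hypothetical and assume that winners and non-winners practice visualization at the same rate. Looking only at winners makes the habit seem decisive, while comparing win rates across both groups shows no difference.

```python
# Hypothetical counts of athletes, split by outcome and by whether they
# practiced visualization. The numbers are invented for illustration.
athletes = {
    ("winner", True): 9,        # won a medal, used visualization
    ("winner", False): 1,       # won a medal, did not
    ("non_winner", True): 81,   # no medal, used visualization
    ("non_winner", False): 9,   # no medal, did not
}


def share_of_winners_with_trait(counts):
    """What a survivors-only study sees: P(trait | winner)."""
    winners = counts[("winner", True)] + counts[("winner", False)]
    return counts[("winner", True)] / winners


def win_rate(counts, has_trait):
    """What the full data shows: P(winner | trait status)."""
    won = counts[("winner", has_trait)]
    total = won + counts[("non_winner", has_trait)]
    return won / total


print(f"Winners who visualize:       {share_of_winners_with_trait(athletes):.0%}")
print(f"Win rate with visualization: {win_rate(athletes, True):.0%}")
print(f"Win rate without it:         {win_rate(athletes, False):.0%}")
```

In this invented data set, most winners visualize, yet the win rate is the same with or without the habit. Only the second comparison, which needs the non-winners, says anything about cause and effect.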
Preventing survivorship bias
A good research design will help you avoid potential problems with survivorship bias (and other research pitfalls). There are several steps you can take to reduce the risk of survivorship bias occurring:
- Choose your data sources carefully, making sure that they include, for example, studies that report a whole range of outcomes.
- Reflect on what might be missing from your data. Studying the main injuries suffered by marathon runners by interviewing race finishers misses the important data set of those who didn’t finish at all.
- Be careful with data “cleaning.” Removing outliers might make sense in general terms, but always consider that outliers might be showing something important and worth considering.
As with many research problems, being aware of survivorship bias can help you to avoid falling into the trap.
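One way to act on the second step above is to compare everyone who entered a study with the subset that ended up in the analysis. The sketch below assumes hypothetical runner records with a `finished` flag; the field names and data are illustrative only.

```python
from collections import Counter

# Hypothetical marathon records; ids, fields, and values are invented.
runners = [
    {"id": 1, "finished": True,  "injury": "blister"},
    {"id": 2, "finished": True,  "injury": None},
    {"id": 3, "finished": False, "injury": "stress fracture"},
    {"id": 4, "finished": True,  "injury": "blister"},
    {"id": 5, "finished": False, "injury": "heat exhaustion"},
]


def injury_counts(records):
    """Count reported injuries, ignoring runners with none."""
    return Counter(r["injury"] for r in records if r["injury"] is not None)


def excluded_share(records, included):
    """Fraction of all starters missing from the analyzed sample."""
    return 1 - len(included) / len(records)


# The sample a finish-line interview would capture.
finishers = [r for r in runners if r["finished"]]

print("Finishers only:", injury_counts(finishers))
print("All starters:  ", injury_counts(runners))
print(f"Share of starters excluded: {excluded_share(runners, finishers):.0%}")
```

Reporting the excluded share alongside the results is a simple habit that makes the gap visible, even when the missing records cannot be recovered.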
Frequently asked questions about survivorship bias
- What are some examples of selection bias?
There are many types of selection bias, including:
- Attrition bias
- Sampling bias
- Survivorship bias
- Self-selection bias
- Undercoverage bias
- Non-response bias
- What is a historical example of survivorship bias?
During World War II, early studies of damage inflicted on US bombers focused on the damage sustained by planes that made it back to their bases. The decision was made to reinforce the areas most often damaged by enemy fire.
It was soon realized, however, that this excluded the most important source of data: the planes that never made it back to base. The most important places to reinforce were the areas where the returning planes had not been hit, because the planes that were hit there had not returned.
This is an excellent historical example of survivorship bias: the planes studied were literally the survivors, and the sample they formed left out the most important data.
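The reasoning can also be sketched numerically. The sections, hit counts, and return probabilities below are assumptions chosen for illustration, not historical figures. Hits are spread evenly across sections, but hits to some sections are far more likely to bring a plane down, so those hits rarely appear on returning aircraft.

```python
# Assumed number of planes hit in each section (one hit per plane) and the
# assumed probability of returning to base after such a hit.
# These values are illustrative, not historical data.
hits_per_section = {"wings": 1000, "fuselage": 1000, "tail": 1000, "engines": 1000}
return_probability = {"wings": 0.9, "fuselage": 0.85, "tail": 0.8, "engines": 0.3}

# Expected hits visible on planes that made it back to base.
observed = {
    section: hits_per_section[section] * return_probability[section]
    for section in hits_per_section
}

# Expected hits on planes that were lost and therefore never examined.
lost = {
    section: hits_per_section[section] - observed[section]
    for section in hits_per_section
}

most_damaged_on_returners = max(observed, key=observed.get)
most_lethal_section = max(lost, key=lost.get)

print("Most-hit section among returning planes:", most_damaged_on_returners)
print("Section most often hit on lost planes:  ", most_lethal_section)
```

Under these assumptions, the section that looks most damaged on returning planes is the one a plane is most likely to survive being hit in, while the section with the fewest visible hits accounts for most of the losses.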