What Is External Validity? | Definition, Threats & Example

External validity refers to the extent to which the findings of a study can be generalized to other populations, settings, and contexts beyond the specific one in which the study was conducted. In other words, it’s about whether the results can be applied to other people, places, and situations.

External validity example
A researcher creates an experimental design to investigate the influence of alcohol consumption on sleep. They use systematic sampling to draw a sample of 100 law students from a local university who drink regularly.

They’re invited to attend a get-together where their alcohol consumption is moderated. They’re monitored while they sleep at the university’s laboratory to control for confounding variables and to reduce the risk of bias. At the end, they fill out a survey with multiple-choice questions about their quality of sleep. The results show that increased alcohol consumption correlates with a lower quality of sleep.

The external validity of this study is low because:

  • The study was conducted in an artificial environment (laboratory), which makes it difficult to generalize results to different settings.
  • The sample consisted of law students only, which makes it difficult to generalize the results to different groups.
  • The sample only included people who drink regularly, which makes the sample unrepresentative of the population (sampling bias).

External validity is important because researchers want to apply the results from their experimental designs (often conducted in laboratories or artificial environments) to the real world.

Internal vs external validity: Trade-off

Internal validity and external validity are two related forms of validity that are essential to assess when conducting scientific research.

  • Internal validity refers to the extent to which a study’s design and methods ensure that the observed effects in the dependent variable are due to the independent variable and not to other factors. In other words, internal validity addresses the question: “Are the results of the study due to the intervention or treatment being tested, or are they due to some other factor?”
  • External validity refers to the extent to which a study’s findings can be generalized to other people, settings, and situations beyond the specific context of the study. It addresses the question: “Can we apply the findings of this study to other populations, settings, and contexts?”

There is typically a trade-off between internal and external validity. To increase internal validity, you have to control for extraneous variables. This tends to make the setting less realistic, which lowers external validity.

Internal vs external validity example
A researcher conducts a study to investigate the effect of a new medication on blood sugar levels in patients with type 2 diabetes. All participants are randomly assigned to a treatment or control group. The study is conducted in a controlled laboratory setting, and the researchers carefully monitor the patients’ diet, exercise, and medication to ensure that the only variable being manipulated is the new medication.

The results show a significant decrease in blood sugar levels in patients taking the new medication compared to the control group taking a placebo.

  • The study has high internal validity because the researcher controlled for extraneous variables (diet, exercise, medication adherence, placebo effect) to isolate the effect of the new medication.
  • The study has low external validity because the study was conducted in an artificial environment that may not reflect real-world scenarios and the sample was limited to patients who were willing to participate (which may not be representative of all patients with type 2 diabetes).

What is external validity?

External validity refers to the extent to which research results can be generalized to other contexts beyond the specific context in which the study was conducted.

In other words, external validity indicates whether the results of a study can be applied to other people, places, and times. It questions whether the findings are representative of the “real world” or are limited to the specific conditions of the study.

There are two main types of external validity:

Ecological validity

Ecological validity refers to the extent to which a research study accurately reflects the real-world environment and behaviors of interest. It’s concerned with how well the study’s methods and procedures capture the naturalistic context in which the phenomenon being studied occurs.

In other words, ecological validity asks:

  • Do the study’s methods and procedures accurately reflect the way people behave and interact in their everyday lives?
  • Do the results accurately capture the complex interactions and relationships between variables in real-world settings?

Ecological validity example
A group of researchers is interested in the effects of physical activity on students’ mental health.

They recruit 100 college students through simple random sampling. All students are asked to wear a Fitbit that tracks their physical activity over a period of 4 weeks. The students complete short, daily surveys with Likert-scale questions to collect ordinal data about their mental health.

The study has high ecological validity because it’s conducted in a real-world setting where students go about their daily routines without interruption (unobtrusive data collection). The daily surveys are designed to capture participants’ mental health experiences in real-time rather than relying on retrospective self-reporting or artificial laboratory settings.

Lastly, the longitudinal design (four-week period) allows researchers to capture natural fluctuations in mental health and physical activity over time, providing a more comprehensive understanding of the relationship between the two variables.

High ecological validity is important because it means that the study’s findings are likely to be more generalizable and applicable to real-world situations. Conversely, low ecological validity can lead to findings that are artificial or misleading.

Population validity

Population validity refers to the extent to which the results of a study’s sample are generalizable to a larger population or group of interest. To achieve high population validity, the sample should be large enough and represent the target population and its characteristics accurately.

To ensure a representative sample, researchers should use probability sampling methods (e.g., stratified sampling, cluster sampling) instead of non-probability sampling methods (e.g., snowball sampling, purposive sampling, quota sampling).
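As a rough illustration of one probability method, the sketch below draws a proportionally stratified sample using only Python's standard library. The population, the group labels, and the helper name `stratified_sample` are hypothetical and chosen purely for illustration; a real study would draw from an actual sampling frame.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction from every stratum (proportional allocation)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))       # random draw without replacement
    return sample

# Hypothetical sampling frame: 600 women and 400 men.
population = [("woman", i) for i in range(600)] + [("man", i) for i in range(400)]
sample = stratified_sample(population, stratum_of=lambda person: person[0], fraction=0.1)
counts = {group: sum(1 for person in sample if person[0] == group) for group in ("woman", "man")}
print(counts)
```

Because each stratum contributes the same fraction, the sample keeps the 60/40 split of this hypothetical population, which is the kind of representativeness that population validity depends on.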

Population validity example
A researcher wants to study the effects of meditation on happiness levels at a large organization. They are interested in the influence of the type of meditation (nominal data) and the duration of the meditation (ratio data).

They recruit participants through volunteer sampling (or self-selection sampling) by posting flyers at the employee entrance, resulting in a sample of 50 employees who sign up for the study. The participants are mostly young, white women from higher socioeconomic backgrounds.

The sample is non-random because participants self-selected into the study. This could lead to an overrepresentation of certain groups (e.g., women) and an underrepresentation of others (e.g., men). Moreover, participants who have a prior interest in meditation might be more likely to report positive outcomes, which could skew the results.

This study has low population validity, which means the results from the sample likely cannot be generalized to the larger population (all employees at the large organization).

External validity in psychology

External validity is important in all research but especially in medical and psychological research. Psychological research often forms the basis for developing effective treatments for mental health problems. High external validity helps ensure that the findings can be applied in various contexts to various people. It wouldn’t be ethical to apply results with low external validity to other populations without further research.

Moreover, if a study’s findings can be generalized, it strengthens the underlying theory and increases our understanding of the topic that’s being studied.

External validity in psychology example
A researcher conducts a study on the effectiveness of a new cognitive-behavioral therapy (CBT) program for treating depression in a small, highly educated population of people between 20 and 30 years old. The study is conducted in a university setting where participants are recruited through social media and flyers posted on campus.

  • The study only includes individuals with a narrow age range and educational background, making it difficult to generalize the findings to other populations.
  • The sample lacks diversity in terms of ethnicity, socioeconomic status, and other demographic factors, making it difficult to generalize the findings to a more diverse population.
  • The study is conducted in a university setting, which may not be representative of real-world settings where CBT would typically be applied.

Threats to external validity

There are several threats to external validity, and it’s important to recognize and counter them in any research design.

External validity example: Threats
A researcher wants to investigate if people with sleep disorders can benefit from daily meditation. They recruit 100 participants between 55 and 65 years old who have been diagnosed with sleep disorders after trauma. They all receive treatment at the same facility.

Participants are randomly assigned to one of three groups:

  • They don’t take part in a meditation session (control group)
  • They take part in a daily 15-minute meditation session
  • They take part in a daily 30-minute meditation session

Their sleep quality is assessed at two points in time: before the intervention (pre-test) and after the intervention (post-test).
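The random assignment to three groups described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any actual study: the participant IDs, the group labels, and the helper name `randomly_assign` are all hypothetical.

```python
import random

def randomly_assign(participants, groups, seed=0):
    """Shuffle the participants, then deal them into the groups in turn."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, participant in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(participant)
    return assignment

# Hypothetical IDs for the 100 participants in the example.
participants = [f"P{n:03d}" for n in range(1, 101)]
groups = ["control", "meditation_15_min", "meditation_30_min"]
assignment = randomly_assign(participants, groups)
sizes = {group: len(members) for group, members in assignment.items()}
print(sizes)
```

Shuffling before dealing means group membership depends only on chance, so the groups differ in size by at most one participant and should be comparable on average.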

| Threat | Definition | Example |
| --- | --- | --- |
| History | Outcomes are influenced by an unrelated event that happened prior to the study. | Elections are scheduled right before the pre-test, resulting in a lower quality of sleep than usual. |
| Sampling bias | The sample doesn’t accurately represent the target population. | The sample only contains people who suffer from a sleep disorder after trauma. Their traits (e.g., they suffer from a lot of stress) might differ from other populations, like people who have sleep disorders because of irregular work hours. |
| Hawthorne effect | Participants change their behaviors because they know they’re being studied. | The participants actively engage in anxiety-reducing habits, such as reading before bedtime and listening to relaxing music, because they are aware they’re participating in the research. |
| Observer bias | The researchers’ behaviors or traits accidentally affect the results. | Since the study doesn’t have a single-blind or double-blind design, the researchers unconsciously motivate the participants who take part in the 30-minute meditation sessions more than the other participants. |
| Situation effect | The location, time of day, setting, etc., harm the generalizability of the results. | The first study was conducted during the winter. The study was replicated but during the summer, and there was no significant effect this time. |
| Testing effect | The use of a pretest or posttest affects the results. | The study suffers from recall bias due to participants’ increased familiarity with the test methods after the pretest. They report better sleep and less stress during the posttest. |
| Aptitude-treatment | The dependent variable is influenced by interactions between individual variables and the traits of the group. | The interaction between certain participant traits (e.g., suffering from trauma) and the meditation session (e.g., focus on relaxing) improves sleep quality. The findings couldn’t be reproduced for a second batch of participants, who have sleep disorders due to irregular working hours. |

How to increase external validity

It is important to find a balance between maintaining high internal and external validity. There are three main ways to increase external validity:

  1. Natural setting: You can counter situation effects or testing effects by conducting the research in natural environments (field experiments).
  2. Replication: Replicating your research counters most threats by investigating generalizability to other contexts (e.g., different population, setting, or conditions).
  3. Random, or probability, sampling: You can counter research biases such as sampling bias by using probability sampling methods, which give every member of the population a known, non-zero chance of ending up in the sample.
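To make the third point more concrete, here is a minimal sketch of two probability sampling methods, simple random sampling and systematic sampling, using only Python's standard library. The list of 1,000 students and both function names are hypothetical, and the systematic version assumes the sampling frame has no periodic ordering that lines up with the interval.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every unit has the same chance of selection; draws are without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=0):
    """Pick a random start, then take every k-th unit, with k = len(population) // n."""
    rng = random.Random(seed)
    k = len(population) // n   # sampling interval (assumes n <= len(population))
    start = rng.randrange(k)   # random starting point within the first interval
    return [population[start + i * k] for i in range(n)]

# Hypothetical sampling frame of 1,000 students.
students = [f"student_{i}" for i in range(1000)]
srs = simple_random_sample(students, 100)
sys_sample = systematic_sample(students, 100)
print(len(srs), len(sys_sample))
```

In both functions, chance alone decides who is selected, which is what separates probability sampling from volunteer or convenience sampling.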

External validity example

The level of external validity determines to what extent you can generalize your findings to other contexts.

High external validity example

A researcher conducts a randomized controlled trial (RCT) to evaluate the effectiveness of a new math curriculum in improving math skills among elementary school students.

The study is conducted in 20 public elementary schools across five different districts in a large urban area. The schools are randomly assigned to either the treatment group or the control group.

In the treatment group, the new math curriculum is implemented in all classrooms, while in the control group, the traditional math curriculum is continued as usual. The study measures math skills using a standardized test administered at the beginning and end of the school year.

The study has high external validity because:

  • The sample spans 20 public schools across five districts, which makes it more likely that the findings generalize to other schools in similar contexts.
  • The sample is large enough to yield reasonably precise estimates of the curriculum’s effect.
  • The study is conducted in real-world settings, with teachers and students going about their regular activities. This increases its ecological validity.

Low external validity example

Another researcher also conducts a study to evaluate the effectiveness of a new math curriculum in improving math skills among elementary school students.

The study is conducted in a single, elite private school with a small student body of 100 students, all from high-income families. The new math curriculum is implemented by two external math teachers in all classrooms with high math scores, whereas the other classrooms follow the traditional curriculum.

The study measures math skills using a non-standardized test administered at the beginning and end of the school year. The study is conducted by the principal, who has a close relationship with the students and teachers.

The study has low external validity because:

  • The sample is not representative of the entire population (elementary school students), with only 100 students from a single, elite private school.
  • The students were not randomly assigned to the new or traditional math curriculum, which means that any effects might only hold for students with high math scores.
  • The new math curriculum is taught by external teachers, disrupting the natural environment. This harms the ecological validity.

Frequently asked questions about external validity

Does random assignment increase external validity?

Random assignment can increase external validity, but it has a bigger impact on internal validity.

Random assignment helps to reduce confounding variables and ensures that the treatment and control groups are comparable in all aspects except for the independent variable.

This increases the confidence that any observed differences between the groups can be attributed to the treatment rather than other factors, which means an increase in internal validity.

It can also improve external validity because random assignment of participants prevents researchers from inadvertently selecting participants who may be more or less likely to respond to the treatment.

However, the external validity may still be limited by sampling bias if the participants are not representative of the target population, which is why choosing the appropriate sampling method is also important to ensure external validity.

A probability sampling method, such as simple random sampling, stratified sampling, cluster sampling, or systematic sampling, is generally the best choice.

What kind of sample is best for external validity?

To ensure high external validity, it’s important to draw a sample that’s representative of the population you want to generalize to. A probability sampling (also known as random sampling) method is generally the best choice for this.

The most common probability sampling methods are stratified sampling, systematic sampling, simple random sampling, and cluster sampling.

A probability sampling method also reduces sampling bias, which can support other types of validity.
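Of the methods named above, cluster sampling is perhaps the least intuitive, so a small Python sketch may help. It randomly selects whole groups and then includes everyone in them. The 20 schools of 30 students each and the helper name `cluster_sample` are hypothetical and only illustrate the idea.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters and include every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # draw cluster names at random
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

# Hypothetical sampling frame: 20 schools with 30 students each.
schools = {
    f"school_{s:02d}": [f"school_{s:02d}_student_{i}" for i in range(30)]
    for s in range(20)
}
chosen, sample = cluster_sample(schools, n_clusters=5)
print(chosen, len(sample))
```

Here the randomness operates at the level of schools rather than individual students, which is often more practical when no complete list of individuals exists.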

Julia Merkus, MA

Julia has a bachelor’s degree in Dutch language and culture and two master’s degrees, in Linguistics and in Language and Speech Pathology. After a few years as an editor, researcher, and teacher, she now writes articles about her specialist topics: grammar, linguistics, methodology, and statistics.