What Is Stratified Sampling? | Examples & Definition
Stratified sampling is a probability sampling method where researchers divide a population into homogeneous subpopulations (strata) based on specific characteristics, such as gender, age, or socioeconomic status. Every member of the population should be in precisely one stratum.
A sample is then drawn from each stratum using another probability sampling method (e.g., simple random sampling). This way, researchers can estimate statistics (e.g., averages) for each subpopulation.
Stratified sampling is used when the characteristics of a population vary and researchers need to make sure that the sample is representative of the entire population. This sampling method supports high external validity and generalizability and reduces the risk of some research biases.
When do you use stratified sampling?
If you want to use stratified sampling, you need to be able to assign each member of the population to exactly one stratum (subgroup). Your groups need to be mutually exclusive (no one fits into more than one subgroup) and exhaustive (every participant fits into one subgroup).
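The two requirements above can be checked mechanically before sampling. The following is a minimal sketch, assuming strata are defined as simple rules on one attribute; the function `check_strata`, the age bands, and the example ages are all hypothetical and only illustrate the idea.

```python
def check_strata(population, strata_rules):
    """Return (unassigned, overlapping) members for a set of stratum rules.

    strata_rules maps a stratum name to a function that returns True
    when a member belongs to that stratum.
    """
    unassigned, overlapping = [], []
    for member in population:
        matches = [name for name, rule in strata_rules.items() if rule(member)]
        if not matches:
            unassigned.append(member)   # strata are not exhaustive
        elif len(matches) > 1:
            overlapping.append(member)  # strata are not mutually exclusive
    return unassigned, overlapping


# Hypothetical population: ages of five participants.
ages = [18, 25, 30, 42, 67]

# Flawed rules: age 30 fits two strata and age 67 fits none.
bad_rules = {
    "18-30": lambda a: 18 <= a <= 30,
    "30-50": lambda a: 30 <= a <= 50,
}

# Repaired rules: every age of 18 or older fits exactly one stratum.
good_rules = {
    "18-29": lambda a: 18 <= a <= 29,
    "30-49": lambda a: 30 <= a <= 49,
    "50+": lambda a: a >= 50,
}

bad_result = check_strata(ages, bad_rules)
good_result = check_strata(ages, good_rules)
```

If both returned lists are empty, every member was assigned to exactly one stratum.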
A stratified sample is a good choice if you expect subgroups to have different mean values for the variables you’re interested in.
Stratified vs cluster sampling
Stratified sampling and cluster sampling are similar in that both divide the population into subgroups, but there are also some major differences.
- Stratified sampling is a sampling technique in which a population is split into strata (subgroups) based on a specific characteristic. Next, you choose members at random from every stratum for data collection. Units (e.g., people) in each stratum are similar to one another with respect to the variable of interest (e.g., they are of the same age). A stratum does not resemble a miniature version of the population.
- Cluster sampling is a sampling technique in which the population is divided into naturally occurring clusters (e.g., because of geographical differences between groups). You randomly select entire clusters, and data can be collected from each selected cluster as a whole without selecting participants based on a specific criterion. Ideally, each cluster is a miniature version of the population as a whole.
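The contrast between the two techniques can be sketched in a few lines. This is an illustrative example only: the helper names `stratified_sample` and `cluster_sample` and the toy groups are assumptions, not part of any library.

```python
import random


def stratified_sample(groups, per_group, rng):
    """Pick `per_group` units at random from EVERY group (stratum)."""
    return {name: rng.sample(units, per_group) for name, units in groups.items()}


def cluster_sample(groups, n_clusters, rng):
    """Pick `n_clusters` whole groups (clusters) at random and keep all their units."""
    chosen = rng.sample(sorted(groups), n_clusters)
    return {name: list(groups[name]) for name in chosen}


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population split into three groups of four units each.
groups = {
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["b1", "b2", "b3", "b4"],
    "C": ["c1", "c2", "c3", "c4"],
}

strat = stratified_sample(groups, per_group=2, rng=rng)  # some units from all groups
clust = cluster_sample(groups, n_clusters=2, rng=rng)    # all units from some groups
```

The stratified result contains every group but only part of each one, while the cluster result contains only some groups but each of them in full.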
Step-by-step guide to stratified sampling
Stratified sampling consists of four steps:
- Define the population and subgroups
- Divide the population into strata
- Determine the sample size for each stratum
- Draw a random sample from each stratum
Step 1: Define the population and subgroups
As with other random sampling methods, start by defining the population you are interested in.
You also need to decide on the characteristic you’re interested in because you’ll divide the population based on this trait. Assigning each participant to a subgroup should be a clear and unambiguous process because each member must be placed in exactly one stratum.
If you’re interested in multiple traits, you can stratify based on multiple characteristics, provided you can still unambiguously assign each participant to one subgroup. To calculate the total number of subgroups, multiply the number of levels for each trait.
Suppose you stratify based on age and gender, with three groups for the former and four for the latter. Then you would have a total of 3 x 4 = 12 subgroups.
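The multiplication above corresponds to taking every combination of levels. Here is a small sketch, assuming three age groups and four gender categories; the specific labels are hypothetical placeholders.

```python
from itertools import product

# Hypothetical levels for each trait (three and four, as in the example above).
age_groups = ["18-34", "35-54", "55+"]
gender_groups = ["woman", "man", "non-binary", "another identity"]

# Each stratum is one (age group, gender) combination.
strata = list(product(age_groups, gender_groups))
n_strata = len(strata)  # 3 x 4 = 12
```

Listing the combinations explicitly also makes it easy to spot strata that may end up with very few members.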
Step 2: Divide the population into strata
Next, create a list of every member of the population and assign each one to a stratum. In an example study of entrepreneurs, you would assign each entrepreneur to a stratum based on their age and gender identity.
You need to make sure that every member fits into exactly one stratum. This means the strata should cover the entire population (i.e., they should be exhaustive) without showing any overlap (i.e., they should be mutually exclusive).
If you multiply the number of levels of characteristic 1 (age) by the number of levels of characteristic 2 (gender identity), you get a total of nine groups in this example. Each entrepreneur should be assignable to exactly one group.
Trait | Strata | Groups |
---|---|---|
Age | | |
Gender identity | | |
Step 3: Determine the sample size for each stratum
You first need to decide whether your sample should be proportionate or disproportionate.
- A proportionate sample means that the sample size of each stratum is proportional to the share of the subgroup in the entire population. Subgroups that are less strongly represented in the larger population will therefore also be less strongly represented in the sample. For example, women generally make up a lower portion of the IT student population, so you also include fewer women in your sample.
- A disproportionate sample means the sample size of each stratum is not proportional to the share of the subgroups in the entire population. For example, men generally make up a lower portion of the nursing student population, but you still include an equal number of men, women, and people with another gender identity in your sample.
Researchers tend to use disproportionate samples when they want to draw statistical conclusions about an underrepresented or marginalized subgroup whose sample size would be too low if they used the actual proportions.
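The two allocation strategies can be sketched as follows. The stratum sizes (600, 300, and 100) and the total sample size of 90 are hypothetical, and equal allocation is only one possible form of disproportionate sampling.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to their population size.

    Uses largest-remainder rounding so the counts add up to total_sample.
    """
    population = sum(stratum_sizes.values())
    exact = {k: total_sample * v / population for k, v in stratum_sizes.items()}
    alloc = {k: int(x) for k, x in exact.items()}
    leftover = total_sample - sum(alloc.values())
    # Hand out the remaining units to the strata with the largest fractions.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


def equal_allocation(stratum_sizes, total_sample):
    """Disproportionate allocation: the same sample size for every stratum."""
    per_stratum = total_sample // len(stratum_sizes)
    # A stratum cannot contribute more members than it has.
    return {k: min(per_stratum, size) for k, size in stratum_sizes.items()}


# Hypothetical population counts per stratum.
sizes = {"women": 600, "men": 300, "another gender identity": 100}

proportional = proportional_allocation(sizes, total_sample=90)
equal = equal_allocation(sizes, total_sample=90)
```

With these assumed numbers, the proportionate design samples only 9 people from the smallest stratum, while the equal design samples 30, which illustrates why disproportionate samples are used for small subgroups.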
After deciding on the proportions, you can use a free online sample size calculator to determine the total sample size based on your estimated population size, chosen confidence level and margin of error, and estimated standard deviation. If your sample is too small, you can’t draw reliable statistical conclusions.
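To show how these four inputs interact, here is a rough sketch of one common textbook formula: the sample size for estimating a mean under simple random sampling, with a finite population correction. This is an assumption about what a calculator might do; online calculators can use different formulas, and the function name and input values are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(population_size, confidence, margin_of_error, std_dev):
    """Approximate total sample size needed to estimate a population mean."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = (z * std_dev / margin_of_error) ** 2
    # Finite population correction shrinks n0 for smaller populations.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


# Hypothetical inputs: 10,000 people, 95% confidence,
# a margin of error of 2 points, and a standard deviation of 15 points.
total_n = sample_size_for_mean(10_000, confidence=0.95, margin_of_error=2, std_dev=15)
```

The resulting total can then be divided across the strata according to the proportions chosen earlier in this step.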
Step 4: Draw a random sample from each stratum
For the final step, you use another probability sampling method to draw a sample from each stratum. Popular probability sampling methods for this step are simple random sampling and systematic sampling.
If you do this correctly, the random nature of these methods makes it likely that the sample drawn from each stratum is representative of that stratum.
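Putting the steps together, the drawing stage might look like the sketch below. It assumes simple random sampling without replacement inside each stratum; the function `draw_stratified_sample`, the toy population, and the allocation are hypothetical.

```python
import random
from collections import defaultdict


def draw_stratified_sample(population, stratum_of, allocation, seed=None):
    """Draw a simple random sample of the allocated size from each stratum.

    population: iterable of members
    stratum_of: function mapping a member to its stratum name
    allocation: dict mapping stratum name to the sample size for that stratum
    """
    rng = random.Random(seed)

    # Step 2: divide the population into strata.
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)

    # Step 4: sample without replacement inside each stratum.
    return {name: rng.sample(strata[name], n) for name, n in allocation.items()}


# Hypothetical population of 100 (id, stratum) pairs: 60 in A, 30 in B, 10 in C.
population = [(i, "A" if i < 60 else "B" if i < 90 else "C") for i in range(100)]
allocation = {"A": 6, "B": 3, "C": 1}  # proportionate 10% sample

sample = draw_stratified_sample(
    population, stratum_of=lambda m: m[1], allocation=allocation, seed=7
)
```

Because `random.sample` raises a `ValueError` when asked for more members than a stratum contains, an over-ambitious allocation fails loudly instead of silently returning a smaller sample.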
Stratified sampling advantages
A stratified sample has the following advantages:
- Guarantees a diverse sample
A stratified sample reflects the diversity of your population because it’s guaranteed to contain participants from every subgroup of the population. Other random sampling methods make this likely in large samples, but they don’t guarantee it.
- Allows similar variance between groups
If you want estimates for each stratum to have a similar level of variance, the sample size for each subgroup should be similar (e.g., equal numbers of women, men, and people with a different gender identity). Stratified sampling lets you set these sample sizes directly.
With other sampling methods, you may have many participants from one subgroup but almost no participants from another subgroup.
- Lowers the variance of your estimates
The total population may be relatively heterogeneous, but subgroups are likely more homogeneous in nature.
Suppose you are investigating how a new teaching method affects the reading test scores of adults learning a new language. There is a good chance that both the original test scores and the possibly changed test scores are strongly correlated with the number of hours they’ve studied.
A stratified sample would help to get a better picture of the variable you’re interested in because you can group participants by the number of hours they studied. This way, the variance within each subgroup is lower than in the population as a whole, which makes your overall estimates more precise.
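One way to see the effect on precision is to compare the approximate variance of the sample mean under simple random sampling with that under proportionate stratified sampling. The sketch below uses made-up test scores grouped by study hours and ignores the finite population correction, so it is an illustration under stated assumptions rather than a full analysis.

```python
from statistics import pvariance

# Hypothetical reading test scores, grouped by hours studied.
scores_by_hours = {
    "0-5 h": [52, 55, 58, 54, 56],
    "6-10 h": [66, 70, 68, 72, 69],
    "11+ h": [85, 88, 84, 90, 87],
}

all_scores = [s for group in scores_by_hours.values() for s in group]
N = len(all_scores)

# Variance of all scores together (within-group plus between-group spread).
total_var = pvariance(all_scores)

# Average within-stratum variance, weighted by each stratum's population share.
within_var = sum(len(g) / N * pvariance(g) for g in scores_by_hours.values())

n = 6  # assumed total sample size

# Approximate variance of the sample mean (finite population correction ignored).
var_srs = total_var / n          # simple random sample
var_stratified = within_var / n  # proportionate stratified sample
```

Because study hours explain most of the spread in these made-up scores, `var_stratified` is much smaller than `var_srs`. The gain shrinks when the stratifying trait is only weakly related to the variable of interest.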
- Allows for different data collection methods
In some cases, you’ll have to use different data collection methods for different subgroups.
For example, if you have limited time and money to conduct your survey research, it might be convenient to survey elderly participants door-to-door but younger participants by email.
Stratified sampling disadvantages
Stratified sampling also comes with some disadvantages:
- Risk of misclassification
Identifying appropriate strata and accurately classifying the population into these strata can be complex and time-consuming. If the strata are not well defined or are incorrectly identified, the sample may not accurately represent the population, leading to biased results.
Once strata are defined, it may be difficult to adapt the sampling method if new information suggests a different stratification would be more appropriate.
- Complex and time-consuming
Determining the right proportion of samples for each stratum and ensuring each is adequately represented can be challenging, particularly in large or heterogeneous populations.
In small populations, the benefits of stratified sampling may not outweigh the additional complexity, and other sampling methods might be more efficient.
- Need for detailed and up-to-date information
Stratified sampling requires detailed information about the population to create appropriate strata, which might not always be available or reliable. In dynamic populations where characteristics change frequently, maintaining up-to-date stratification data can be difficult.
Frequently asked questions about stratified sampling
- What’s the difference between stratified and systematic sampling?
Stratified sampling and systematic sampling are both probability sampling methods used to obtain representative samples from a population, but they differ significantly in their approach and execution.
- Stratified sampling involves dividing the population into distinct subgroups (strata) based on specific characteristics (e.g., age, gender, income level) and then randomly sampling from each stratum. It ensures representation of all subgroups within the population.
- Systematic sampling involves selecting elements from an ordered population at regular intervals, starting from a randomly chosen point. For example, you have a list of students from a school and you choose every fifth student on the list. This is a useful method when the population is homogeneous or when there is no clear stratification. It’s much easier to design and less complex than stratified sampling.
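The interval-based selection described for systematic sampling can be sketched as follows. The function name and the list of 50 students are hypothetical.

```python
import random


def systematic_sample(ordered_population, interval, seed=None):
    """Select every `interval`-th element, starting from a random position."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start within the first interval
    return ordered_population[start::interval]


# Hypothetical ordered list of 50 students, sampled at an interval of 5.
students = [f"student_{i:02d}" for i in range(1, 51)]
picked = systematic_sample(students, interval=5, seed=1)
```

With 50 students and an interval of 5, the sample always contains 10 students, whichever starting point is drawn.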
- What is disproportionate stratified sampling?
Disproportionate sampling in stratified sampling is a technique where the sample sizes for each stratum are not proportional to their sizes in the overall population.
Instead, the sample size for each stratum is determined based on specific research needs, such as ensuring sufficient representation of small subgroups to draw statistical conclusions.
For example, the population you’re interested in consists of approximately 60% women, 30% men, and 10% people with a different gender identity. With disproportionate sampling, your sample could instead consist of roughly one third women, one third men, and one third people with a different gender identity. The sample’s distribution does not match the population’s.
- What is proportionate stratified sampling?
Proportionate sampling in stratified sampling is a technique where the sample size from each stratum is proportional to the size of that stratum in the overall population.
This ensures that each stratum is represented in the sample in the same proportion as it is in the population, representing the population’s overall structure and diversity in the sample.
For example, the population you’re investigating consists of approximately 60% women, 30% men, and 10% people with a different gender identity. With proportionate sampling, your sample would have a similar distribution instead of equal parts.
- Is stratified sampling random?
Yes, stratified sampling is a random sampling method (also known as a probability sampling method). Within each stratum, a random sample is drawn, so each member of a stratum has an equal chance of being selected within that stratum.