In psychology, inter-rater reliability refers to the degree of agreement between different observers or raters who evaluate the same behavior, test, or phenomenon.
It ensures that measurements are consistent, objective, and not dependent on a single person’s judgment, which is especially important in research, clinical assessments, and behavioral studies.
High inter-rater reliability indicates that results are dependable and reproducible across different raters.
There isn’t just one formula for calculating inter-rater reliability. The right one depends on your data type (e.g., nominal data, ordinal data) and the number of raters.
Cohen’s kappa (κ) is commonly used for two raters.
Fleiss’ kappa is typically used for three or more raters.
The Intraclass Correlation Coefficient (ICC) is used for continuous data (interval or ratio). It is based on analysis of variance (ANOVA).
The most commonly used formula (for Cohen’s kappa) is:
$$\kappa = \frac{P_o - P_e}{1 - P_e}$$
where $P_o$ is the observed proportion of agreement and $P_e$ is the proportion of agreement expected by chance.
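To make the calculation concrete, here is a minimal Python sketch that computes Cohen’s kappa as (Po − Pe) / (1 − Pe) for two raters assigning nominal categories. The function name `cohens_kappa` and the yes/no ratings are hypothetical and chosen only for illustration; chance agreement is estimated from each rater’s own category proportions, which is the usual convention for Cohen’s kappa.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who rated the same items (nominal categories)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(rater_a)

    # Po: proportion of items on which the two raters gave the same rating.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Pe: agreement expected by chance, from each rater's category proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_e == 1:
        raise ValueError("Kappa is undefined when expected agreement equals 1.")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of 10 items by two observers (illustrative data only).
rater_a = ["yes", "yes", "yes", "yes", "yes", "yes", "no", "no", "no", "no"]
rater_b = ["yes", "yes", "yes", "yes", "yes", "no", "yes", "no", "no", "no"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))  # Po = 0.8, Pe = 0.52, so kappa is about 0.583
```

In this made-up example the raters agree on 8 of 10 items (Po = 0.8), but because both say "yes" 60% of the time, 52% agreement would be expected by chance alone, which is why kappa is noticeably lower than the raw agreement rate.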
Though it’s difficult to fully eliminate sampling bias, it can be minimized through careful research design and sampling methods.
Probability sampling methods (where every member of the population has a known chance of being selected) are less susceptible to sampling bias than non-probability methods.
Sampling bias occurs when the sample collected for a study systematically differs from the target population. Below are some common types of sampling bias:
Self-selection bias: People who choose to participate in a study differ from the general population in an important way (e.g., motivation, interest).
Nonresponse bias: Those who are unable or unwilling to respond often share key characteristics, and their absence may skew results.
Healthy user bias: Individuals who are able or willing to participate are often healthier or more health-conscious than nonparticipants.
Survivorship bias: Data are only available for individuals or outcomes that pass a certain filter (e.g., those who survive an event); those that didn’t are ignored.
Undercoverage bias: Certain subgroups are systematically excluded from the sample, leading to skewed representation.
Prescreening bias: Eligibility criteria (e.g., age, language) may unintentionally exclude relevant parts of the population.
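As a rough illustration of how one of these biases can distort an estimate, the following Python sketch simulates self-selection bias. Everything in it is an assumption made for the example: the population is synthetic, each person has a hypothetical "motivation" score between 0 and 1, and the chance of volunteering is set equal to that score.

```python
import random

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Hypothetical population: one motivation score per person, between 0 and 1.
population = [rng.random() for _ in range(10_000)]

# Self-selection: the more motivated a person is, the more likely they volunteer.
volunteers = [m for m in population if rng.random() < m]

# For comparison: a simple random sample of the same size.
random_sample = rng.sample(population, len(volunteers))


def mean(values):
    return sum(values) / len(values)


print("Population mean:   ", round(mean(population), 2))
print("Volunteer mean:    ", round(mean(volunteers), 2))
print("Random sample mean:", round(mean(random_sample), 2))
```

Under these assumptions, the volunteers' average motivation comes out clearly higher than the population average, while the random sample stays close to it. Real studies rarely know the participation mechanism this precisely, so treat the sketch as a way to build intuition, not as a model of any particular study.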
There’s not a universally agreed-upon distinction between sampling bias and selection bias, but sampling bias is often considered a subtype of selection bias.
Sampling bias occurs when a sample systematically differs from the target population (for example, because it was not randomly selected). It affects external validity: how well the results generalize from the sample to the population.
Selection bias, on the other hand, refers more broadly to bias introduced when selecting who to include in a study. It affects internal validity: whether your results can be explained by the independent variable you manipulated (and not by confounds).
Purposive sampling is a sampling method where the researcher intentionally selects individuals to study based on desired characteristics or experiences relevant to their research question.
There are several common approaches to purposive sampling:
Maximum variation (heterogeneous) sampling: includes individuals who differ from each other as much as possible to capture a range of experiences
Homogeneous sampling: includes individuals who are very similar to each other to enable a detailed exploration of a certain subgroup
Typical case sampling: includes individuals who best reflect the average or norm of a population
Extreme (deviant) case sampling: includes outliers who fall significantly above or below the norm
Critical case sampling: includes individuals whose results are likely to generalize—if it happens to them, it would probably happen to anyone
Expert sampling: includes individuals with specialized knowledge or expertise relevant to the research topic
Random sampling—or probability sampling—includes a range of sampling methods used to select a subgroup of individuals from a larger population. A key characteristic of random sampling is that every individual has a known chance of being selected.
Purposive sampling, on the other hand, is a non-probability sampling technique. In this method, not every individual has a known chance of being included in the sample. Instead, the researcher chooses who they include in their sample based on certain traits or experiences. This can be helpful if the researcher is very familiar with the population they are studying and wants to gain rich, targeted insight rather than generalizable information.
Purposive sampling is often easier and more efficient than random sampling. However, purposive sampling is highly susceptible to sampling bias. Random sampling is a better approach for obtaining a representative sample that reflects the broader population.
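The contrast between the two approaches can be sketched in a few lines of Python. The population of teachers, the `won_award` trait, and the sample size of 50 are all hypothetical and used only for illustration; `random.sample` stands in for simple random sampling, which is just one of several probability sampling methods.

```python
import random

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical population of 1,000 teachers; roughly 10% have won an award.
population = [
    {"id": i, "won_award": rng.random() < 0.1}
    for i in range(1000)
]

# Simple random sampling: every teacher has the same known chance (50 in 1,000).
random_sample = rng.sample(population, 50)

# Purposive sampling: the researcher hand-picks only teachers with the trait
# of interest, so teachers without an award can never be selected.
award_winners = [t for t in population if t["won_award"]]
purposive_sample = award_winners[:50]


def award_share(sample):
    return sum(t["won_award"] for t in sample) / len(sample)


print("Award share in population:      ", round(award_share(population), 2))
print("Award share in random sample:   ", round(award_share(random_sample), 2))
print("Award share in purposive sample:", round(award_share(purposive_sample), 2))
```

The purposive sample consists entirely of award winners by construction, which suits a study focused on that group but makes it unrepresentative of teachers in general. The random sample's award share will typically land near the population's, though it varies from draw to draw.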
Sampling methods are ways of choosing individuals from a population to study. Purposive sampling and convenience sampling are both non-probability sampling methods, meaning not every individual in the population has a known chance of being selected.
Purposive sampling is when a researcher hand-picks individuals because they possess specific traits or characteristics. For example, someone studying successful teaching techniques might only include teachers who have recently won awards in their sample.
On the other hand, convenience sampling involves selecting individuals simply because they are easily accessible. For example, a business might ask their social media followers to complete a survey.
Convenience sampling and purposive sampling are not mutually exclusive—a researcher might use some combination of both techniques when obtaining a sample for their study.
Both of these techniques are susceptible to sampling bias. Individuals who are readily accessible or who the researcher chooses to participate may not be fully representative of the broader population.
Purposive sampling, or judgment sampling, is a non-probability sampling method that involves hand-picking individuals to include in a study based on certain characteristics.
For example, a researcher studying how cancer patients cope with terminal illness may directly recruit several late-stage cancer patients who are receiving palliative care at their clinic.
Unlike in probability sampling, not every cancer patient has a known chance of being selected. Instead, the researcher can choose the cases they feel will be most informative.
Purposive sampling can be helpful when the researcher is very familiar with the population they are studying, as it allows them to select individuals who best represent this group. However, any biases the researcher holds may be reflected in the sample.