Saturation in Qualitative Research

Saturation is a fundamental yet unscientific aspect of qualitative research. In this post, I explain what saturation is and why it is often unscientifically derived.

Thematic analysis creates categories

Qualitative research produces reams of textual data from archives and interviews. Sociologists analyze these data and, using thematic analysis, sort them into categories. These categories, or themes, summarize the data at hand to some extent.

Not all data fit neatly into categories. Still, categories matter, so sociologists create a category only when they believe that it truly fits the data: that it is a real thing, and not a fiction dreamed up solely in the sociologist’s imagination.

At some point, the sociologist argues that the category reflects reality and that no subsequent data would expand the category. In other words, the category “exists” and any additional data would be redundant. In this case, the sociologist declares that the category is “saturated.”

Saturation is a common term in social science qualitative research

Definitions of saturation

The dictionary sense of “saturate” comes mainly from chemistry: to saturate something is to make it “thoroughly soaked.”

Over the years, sociologists have tried to define saturation. Here are a few examples:

Glaser and Strauss (1967: p. 61): “Saturation means that no additional data are being found whereby the sociologist can develop properties of the category. As he sees similar instances over and over again, the researcher becomes empirically confident that a category is saturated.”

Grady (1998: p. 26): “New data tend to be redundant of data already collected. In interviews, when the researcher begins to hear the same comments again and again, data saturation is being reached… It is then time to stop collecting information and to start analysing what has been collected.”

Urquhart (2013: p. 194), as quoted in Saunders et al (1895): “the point in coding when you find that no new codes occur in the data. There are mounting instances of the same codes, but no new ones.”

Guest et al (2020: 5): “For the purposes of our assessment, saturation refers to the point during data analysis at which incoming data points (interviews) produce little or no new useful information relative to the study objectives.”

Saturation occurs when additional data no longer yield new information. In other words, the data that the researcher collected have led to theoretical insights, and new data do not seem to illuminate anything further. The researcher has revealed all the “categories” that the data can support.
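To make this definition concrete, here is a minimal Python sketch that counts how many previously unseen codes each interview contributes. It assumes that every interview has already been coded and reduced to a set of code labels; the function name and the example data are hypothetical, made up for illustration only.

```python
def new_codes_per_interview(interviews):
    """Count how many previously unseen codes each interview contributes.

    `interviews` is a sequence of collections of code labels, in the
    order the interviews were analyzed.
    """
    seen = set()
    counts = []
    for codes in interviews:
        fresh = set(codes) - seen  # codes appearing for the first time
        counts.append(len(fresh))
        seen |= fresh
    return counts


# Hypothetical coded interviews (invented labels, not real data).
coded_interviews = [
    {"trust", "cost", "family"},
    {"trust", "stigma"},
    {"cost", "family"},
    {"stigma", "trust"},
]

print(new_codes_per_interview(coded_interviews))  # [3, 1, 0, 0]
```

A run of zeros at the end is what the definitions above describe as redundancy. The count says nothing about whether the codes themselves are sound, and the result depends on the order in which the interviews are analyzed.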

How do we know if the qualitative data are saturated?

Saturation can be applied to all qualitative data, but the big question is how we know when saturation exists. Or, as Saunders et al put it, “‘When and how?’—at what stage in the research is saturation sought, and how can we assess if it has been achieved?” (1899).

Saturation is not necessarily a fixed point; it may be one point on a spectrum, with further points beyond it (Saunders et al: 1900). “The question will then be ‘how much saturation is enough?’, rather than ‘has saturation occurred?’” (Saunders et al: 1901).

Because saturation is not fixed, researchers are often unclear as to when they have reached it.

For many qualitative researchers, “saturation” is a sense or feeling.

They “know” the data, or they are “satisfied” that nothing new will come. They collect a couple more data points “just to be sure.” This is scientifically and intellectually unsatisfying: we cannot know the end point, and thus we rely on “sense.” Saunders et al complain of the “uncertainty” of saturation: saturation may feel valid, but it is not reliable.

Can an endpoint, or “thoroughly soaked” point, be known?

Some attempt this through a priori reasoning based on the available information (e.g. prior studies, informants, a kind of logic, etc.). But since the whole point is to collect new information, and so to go beyond what is already available, the a priori approach is inherently suspect.

Here we can turn to Guest et al (2020), who propose “A simple method to assess and report thematic saturation in qualitative research.” Guest et al write, “Our method also specifically applies to contexts in which an inductive thematic analysis is used, where emergent themes are discovered in the data and then transformed into codes” (2).
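Guest et al’s paper should be consulted for their actual procedure. The sketch below only illustrates the general shape of a rule of this kind: fix a baseline of early interviews, then check whether a later run of interviews adds few new themes relative to that baseline. The function name, the parameters base_size, run_length, and threshold, their default values, and the data are all assumptions made for illustration.

```python
def first_saturated_interview(interviews, base_size=4, run_length=2, threshold=0.05):
    """Return the 1-based number of the interview that ends the first run
    whose new themes, as a share of the baseline themes, are at or below
    `threshold`. Return None if no run qualifies.

    All parameter names and defaults are illustrative assumptions.
    """
    interviews = [set(codes) for codes in interviews]
    base_themes = set().union(*interviews[:base_size])
    if not base_themes:
        return None

    seen = set(base_themes)
    new_counts = []  # new themes contributed by each post-baseline interview
    for i in range(base_size, len(interviews)):
        fresh = interviews[i] - seen
        new_counts.append(len(fresh))
        seen |= fresh
        if len(new_counts) >= run_length:
            new_in_run = sum(new_counts[-run_length:])
            if new_in_run / len(base_themes) <= threshold:
                return i + 1
    return None


# Hypothetical coded interviews (invented theme labels).
example = [
    {"a", "b", "c"}, {"b", "d"}, {"a", "e"}, {"c", "f"},  # baseline: 6 themes
    {"g", "a"},   # 1 new theme
    {"b"},        # 0 new; run of interviews 5-6 adds 1/6 of the baseline
    {"c", "d"},   # 0 new; run of interviews 6-7 adds nothing
    {"h"},
]

print(first_saturated_interview(example))  # 7
```

In this made-up example the rule stops at interview 7, yet interview 8 introduces a new theme. A computed stopping point makes the decision transparent and reportable, but it does not guarantee that nothing new lies beyond it.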

Guest et al attempt a computational approach. For example, in their review of the literature (2):

“Morgan et al. [16] conducted a pioneer methodological study using data collected on environmental risks. They found that the first five to six interviews produced the majority of new information in the dataset, and that little new information was gained as the sample size approached 20 interviews. Across four datasets, approximately 80% to 92% of all concepts identified within the dataset were noted within the first 10 interviews. Similarly, Guest et al. [9] conducted a stepwise inductive thematic analysis of 60 in-depth interviews among female sex workers in West Africa and discovered that 70% of all 114 identified themes turned up in the first six interviews, and 92% were identified within the first 12 interviews…”
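Percentages like those in this passage are cumulative shares: of all the themes eventually identified in a dataset, what fraction had appeared by a given interview? A minimal Python sketch of that calculation follows. The function name and the data are hypothetical, and the sketch is not a reproduction of either study’s analysis.

```python
def cumulative_theme_share(interviews):
    """For each interview, return the share of all themes in the dataset
    that have appeared by that point (interviews in analysis order)."""
    interviews = [set(codes) for codes in interviews]
    all_themes = set().union(*interviews)
    if not all_themes:
        return []

    seen = set()
    shares = []
    for codes in interviews:
        seen |= codes
        shares.append(len(seen) / len(all_themes))
    return shares


# Hypothetical coded interviews (invented theme labels).
example = [{"a", "b"}, {"b", "c"}, {"a"}, {"d"}, {"c"}]

print(cumulative_theme_share(example))  # [0.5, 0.75, 0.75, 1.0, 1.0]
```

Note that the denominator is the set of themes found in the full dataset, so the final share is 1.0 by construction. The curve describes how quickly the collected data were exhausted; it cannot say whether further interviews would have revealed themes outside that set.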

These studies attempt to estimate how large a sample is needed to achieve saturation in coding/themes. But how do we know whether the themes were coded correctly? Another problem is that Guest et al’s (2020) method assumes a single level of coding. They acknowledge this problem (14), but do not solve it:

“We tested our method on single-tier codebooks, but qualitative researchers often create hierarchical codebooks. A two-tier structure with primary (“parent”) codes and constituent secondary (“child”) codes is a common form, but researchers may also want to identify and look for higher-level, meta-themes (e.g., Hagaman and Wutich [19]). For any method of assessing saturation, including ours, researchers need to decide at which level they will identify and include themes/codes. For inductive thematic analyses this is a subjective decision that depends on the degree of coding granularity necessary for a particular analytic objective, and how the research team wants to discuss saturation when reporting study findings.”
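The point about coding levels can be illustrated with a small Python sketch: the same interviews can look saturated at the parent level while still producing new codes at the child level. The two-tier codebook, the code labels, and the function name below are all hypothetical.

```python
def new_codes_per_interview(interviews):
    """Count how many previously unseen codes each interview contributes."""
    seen = set()
    counts = []
    for codes in interviews:
        fresh = set(codes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


# Hypothetical two-tier codebook: each child code maps to one parent code.
parent_of = {
    "cost_transport": "cost",
    "cost_fees": "cost",
    "trust_doctor": "trust",
    "trust_state": "trust",
}

# Hypothetical interviews coded at the child level.
child_level = [
    {"cost_transport", "trust_doctor"},
    {"cost_fees"},
    {"trust_state"},
    {"cost_fees", "trust_doctor"},
]

# The same interviews, collapsed to parent codes.
parent_level = [{parent_of[code] for code in codes} for codes in child_level]

print(new_codes_per_interview(child_level))   # [2, 1, 1, 0]
print(new_codes_per_interview(parent_level))  # [2, 0, 0, 0]
```

At the parent level nothing new appears after the first interview; at the child level new codes keep arriving through the third. Which of these counts as “saturation” depends on the level the researcher chooses to report.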

Conclusion

Saturation is common to all thematic analysis of qualitative data. Thus, it is a major methodological issue in sociologists’ qualitative research. Yet, qualitative researchers do not use science to achieve it. Rather, they “feel it,” or “sense it,” based on “knowing” the data.

Qualitative researchers tend to work with small-N data and, because there are so few data points, they may be right that they have achieved saturation. But unless the scientific method is applied, saturation has temporarily left the realm of science to visit the home of the humanities.

Just because saturation is more of a humanistic than a scientific claim does not mean that the qualitative researcher cannot be a scientist. Indeed, many valid methodological practices rely on intuition and art more than on mechanical derivation. Methodology requires creativity. This creative art and humanistic practice can yield enduring insights, especially when the analysis is replicated by others. But we must be clear about what we mean when we use the term “saturation.”

References

Guest, Greg, Emily Namey, and Mario Chen. “A simple method to assess and report thematic saturation in qualitative research.” PLoS ONE 15, no. 5 (2020): e0232076.

Saunders, Benjamin, Julius Sim, Tom Kingstone, Shula Baker, Jackie Waterfield, Bernadette Bartlam, Heather Burroughs, and Clare Jinks. “Saturation in qualitative research: exploring its conceptualization and operationalization.” Quality & Quantity 52 (2018): 1893-1907.

See also

Bowen, G.A.: Naturalistic inquiry and the saturation concept: a research note. Qual. Res. 8(1), 137–152 (2008)

Morse, J.M.: Data were saturated…. Qual. Health Res. 25(5), 587–588 (2015)

Nelson, J.: Using conceptual depth criteria: addressing the challenge of reaching saturation in qualitative research. Qual. Res. (2016). doi:10.1177/1468794116679873

O’Reilly, M., Parker, N.: ‘Unsatisfactory saturation’: a critical exploration of the notion of saturated sample sizes in qualitative research. Qual. Res. 13(2), 190–197 (2013)

Joshua K. Dubrow holds a PhD from The Ohio State University and is a Professor of Sociology at the Polish Academy of Sciences.