How do we decide what ‘counts’ as a theme for our IPA research study?

Aka: the prevalence debate in IPA

A question that comes up repeatedly across both IPA Data Analysis workshops is how we might decide what ‘counts’ as a theme and should therefore be included in the analytic account.

This article will examine ‘the prevalence debate’ in IPA and the factors that we may need to consider when we are deciding what ‘counts’ as a theme for our IPA analysis and can therefore be included in the findings for our study.

When we get to the point of developing our final master table of Group Experiential Themes (GETs), it is well worth considering the prevalence or recurrence of our final GETs and themes/sub-themes across the group. This can help us decide what to include and what to discard (if you are brave enough!) before we start to write the narrative account.

In addition to making these considerations, it is also helpful to include some notation of this in the master table or list.

What do we mean by prevalence?

The prevalence (also termed frequency or recurrence) of themes and how many participants evidence them (i.e., are featured in them) is a key consideration when constructing the master table.

In this context, we are loosely referring to the quantity of participants (and therefore data or ‘evidence’) that we have contributing to each theme or GET.

Prevalence was originally a quantitative concept and is defined by NIMH as ‘the proportion of a population who have a specific characteristic in a given time period’. In other words, how many people have X or Y characteristic at a certain point in time.

The term prevalence has been used in qualitative research, but it has tended to be more applicable to content analysis or to what we might describe as more ‘superficial’ qualitative research. For example, it might be seen in coding reliability approaches within large, descriptive thematic analyses conducted by multiple researchers.

Where does it first show up in terms of IPA?

The notion of ‘recurrence’ or prevalence shows up first in the 2009 edition of the Smith, Flowers and Larkin IPA text (also fondly known as ‘THE book’ or ‘the IPA bible’).

In Smith et al. (2009, pp.106-107), the authors address ‘measuring recurrence across cases’ (p.106). They indicate that for a theme and/or SOT (Super-ordinate Theme, now known as GET) to be ‘classified as recurrent it must be present in at least a third, or half, or, most stringently, in all of the participant interviews’ (p.107).

In this edition they point to this practice of numerically indicating the frequency/recurrence/prevalence for a theme/SOT as a means to further enhance the validity of your findings if you have a larger sample size. They also give an example to illustrate how we might establish the prevalence or recurrence of a set of themes (see p.107).
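For those who like to see the arithmetic laid out, here is a minimal sketch of how recurrence across cases could be tallied. It is only an illustration: the theme names, participant labels, sample size and the function name recurrence_table are all hypothetical, and the threshold (a half here) is simply one of the options quoted above, not a rule taken from the book.

```python
from fractions import Fraction

# Hypothetical data: which participants contribute evidence to each theme.
theme_participants = {
    "Theme A": {"P1", "P2", "P3", "P4", "P5", "P6"},
    "Theme B": {"P1", "P3", "P5"},
    "Theme C": {"P2"},
}
sample_size = 6  # hypothetical number of participant interviews


def recurrence_table(theme_participants, sample_size, threshold=Fraction(1, 2)):
    """Return {theme: (count, proportion, is_recurrent)} for a chosen threshold.

    The threshold is the analyst's own criterion (e.g. a third, a half, or all).
    """
    table = {}
    for theme, participants in theme_participants.items():
        count = len(participants)
        proportion = Fraction(count, sample_size)
        table[theme] = (count, proportion, proportion >= threshold)
    return table


for theme, (count, proportion, recurrent) in recurrence_table(
    theme_participants, sample_size
).items():
    label = "recurrent" if recurrent else "not recurrent"
    print(f"{theme}: {count}/{sample_size} participants ({label})")
```

Changing the threshold argument to Fraction(1, 3) or Fraction(1, 1) shows how the same data can be classified differently depending on the criterion you choose, which is exactly why that criterion needs to be stated.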

Interestingly, they also note that decisions around prevalence or recurrence will be driven by several influences:

  • Analytically (e.g., the level of theming involved in your final analytic structure, the size of the sample and the level of commentary for each theme) AND
  • Pragmatically (e.g., the end product of a project, such as whether it is for D or M level, or a report to attract funding or inform policy)

In any eventuality, Smith et al. (2009) are clear that we must identify ‘a set of criteria which can be used to identify recurrent themes’ (p.107) and advise that we find a way of presenting the interconnections between these themes graphically or via some other means such as a table.

It is worth noting that this guidance has been updated in the second edition and that the concept of recurrence or prevalence for a theme can be useful to consider (and, in fact, is recommended these days – see below) whatever your sample size, although there are caveats (of course) that we will consider in a moment!

Full reference: Smith, J.A., Flowers, P., & Larkin, M. (2009). Interpretative Phenomenological Analysis: Theory, Method, Research. London: Sage.

How does this concept develop over time for IPA?

It next appears in the ‘IPA quality evaluation guide’ published by Prof. Smith on page 17 of the Health Psychology Review Special Edition that focused on ‘Advancing and extending qualitative research in health psychology’ (Volume 5, Issue 1, 2011).

The full reference for this seminal and not-to-be-missed paper is as follows:

Smith, J.A. (2011a). Evaluating the contribution of interpretative phenomenological analysis. Health Psychology Review, 5, 9-27.

This article penned by JS outlined what he considers ‘high quality IPA’ from a review of 51 published IPA papers concerned with ‘the illness experience’ (p. 14).

Notably, this selection of papers was derived from a wider search covering the years 1996 to 2008 that uncovered 293 empirical IPA studies in the scientific literature at that time.

In the review, rigour was aligned (among many other criteria) to the ability to show the ‘density of evidence for each theme’ (p.17), in other words, its prevalence, or how many participants contribute supporting ‘evidence’ (i.e., extracts/quotes) to that theme.

Later in the discussion section, Smith (2011a, p.24) elaborates further on this aspect of rigour in terms of the quality of your IPA by stating:

‘The paper should be rigorous. One should aim to give some measure of prevalence for a theme and the corpus should be well represented in the analysis. Extracts should be selected to give some indication of convergence and divergence, representativeness and variability. This way the reader gets to see the breadth and depth of the theme’ (p.24)

At this point in time, some pretty specific numbers were outlined on page 17 and again on page 24 in terms of the density of evidence required depending on your sample size. I attempt to summarise here:

  • n = 1–3: extracts from every participant to support each theme
  • n = 4–8: extracts from at least three participants, or from half the participants, to support each theme
  • n > 8 (i.e., larger sample sizes): extracts from at least three or four participants per theme plus a measure of prevalence of themes and/or some indication of how the prevalence of a theme is determined, or extracts from half the sample for each theme (Smith, 2011a, pp. 17 and 24; italics in the original on p. 17)
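Purely as a reading aid, the summary above can be sketched as a small lookup. This is my own hedged paraphrase and not Smith's wording: the function name evidence_options is hypothetical, and rounding ‘half’ up for odd-sized samples is an assumption of the sketch. The caveats discussed in the sections that follow apply to every number it returns.

```python
import math


def evidence_options(sample_size):
    """List candidate minimum participant counts per theme for a sample size.

    Mirrors the summary of Smith (2011a) above. Rounding "half" up for odd
    samples is an assumption of this sketch, not something the paper states.
    """
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    half = math.ceil(sample_size / 2)
    if sample_size <= 3:
        return [("every participant", sample_size)]
    if sample_size <= 8:
        return [
            ("at least three participants", 3),
            ("half the participants", half),
        ]
    return [
        ("three or four participants, plus a measure of prevalence", 3),
        ("half the sample", half),
    ]


# Print the options for three hypothetical sample sizes.
for n in (3, 6, 12):
    print(n, evidence_options(n))
```

The function deliberately returns all of the options without picking one, because which of them is appropriate remains a judgement for you and your supervisor.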

He concludes on page 24 as follows:

‘In other words, the evidence base, when assessed in the round, should not be drawn from just a small proportion of participants’ (2011a, p.24)

In those papers assessed as high-quality IPA, themes were well evidenced by enough extracts from enough participants – usually half (Smith, 2011a).

BUT there are (as always) caveats to this somewhat explicit and precise outline of actual numbers…

What are the caveats, Elena?

Firstly, in his later response to Kerry Chamberlain’s commentary in the special edition (Smith 2011b, see below), JS further develops the criteria for quality and usefully expands on the nuances around the numbers he cites.

He also reminds us that ‘guidelines are not prescriptions’ (Smith, 2011b, p. 57) and that they do not offer ‘a simple recipe’ to follow (p. 58).

Secondly, Prof Smith draws our attention to some caveats to his specifics regarding the required evidence base, the first of which is the caveat of compensation.


This is where the evidence base PLUS interest factors are considered together (2011a, p.17).

Thus, a theme or GET with particularly interesting data may gain compensation for a less than ideal evidence base because this interest offsets the need for a robust density of evidence in this case.

In other words, we might have a very interesting theme that we decide ‘counts’ as a theme and can be included in our analysis despite the fact that it has a less than ideal evidence base or a limited number of participants contributing to it.

This caveat or point is elaborated upon in the second edition of ‘THE book’ and I will go down this rabbit hole in just a moment.

What issues did Kerry Chamberlain’s commentary in the special edition raise?

This section will explore the following piece in Vol 5, Issue 1 of the Health Psychology Review:

Chamberlain, K. (2011). Troubling methodology. Health Psychology Review, 5(1), 48-54.

Now, I want to be clear that I am not going to review or address the earlier parts of Chamberlain’s commentary as this is a broader critique of IPA which requires its own article to fully unpack and contextualise, alongside the other critiques of IPA that are out there. This will no doubt take some time and is on a (long) list of other topics that I want to cover in my content eventually.

We ARE going to examine the material from page 52 onwards as this is relevant to the ‘IPA quality evaluation guide’ and, more specifically, to the notion of prevalence for your IPA themes and GETs.

Here Chamberlain expresses ‘some concerns around the ongoing codification of IPA practice’ (p.52).

More precisely, he questions the detailed numbers that are outlined regarding the density of evidence required for a theme.

He posits the critique that Prof Smith was ‘perpetuating an over-emphasis on the quantification of themes’ (Smith, 2011b, p.58) and that ‘offering these rules of thumb, expressed in such quantified terms, also serves to promote the idea that themes are only worthwhile (valid?) if they are quantifiably common in the data’ (Chamberlain, 2011, p.52).

He goes on to elaborate that ‘surely a single sensitive comment from a participant can provide the researcher with some valuable insights into meaning, which may be pursued in seeking an insightful interpretation’ (pp. 52-53).

And that ‘The story told, and the degree of insight offered to readers, are surely more important than the range and quantification of quotes offered’ (p. 53).

How did Prof Smith respond to this critique?

The reference for the response is as follows:

Smith, J.A. (2011b). Commentary. Evaluating the contribution of interpretative phenomenological analysis: a reply to the commentaries and further development of criteria. Health Psychology Review, 5(1), 55-61

In response to this critique (which, to be fair, has also been levelled at much big-Q qualitative research, in addition to IPA), Smith’s commentary (2011b, p.59) offers a considered reply to Chamberlain.

Firstly, he acknowledges that ‘the balancing act is delicate here’ (p.58) and encourages us not to see any guidelines as a formula to follow blindly, despite the fact that the aim is to try and make IPA more do-able for the novice by providing specific direction in these matters.

He writes that, in relation to the criteria surrounding prevalence, the intention was NOT to continue to stress the point that themes must or should be quantified in IPA. He adds that it is ironic that this is how that section of the paper came across, given IPA’s consistent emphasis on smaller sample sizes and case studies with a high level of interpretative depth.

He goes on to acknowledge that perhaps some of the criteria failed to emphasise the requirements for a good level of interpretative commentary alongside data extracts, as opposed to simply focusing on the thematic structure and how well this is evidenced.

Smith (2011b) states that perhaps ‘some of the expression of the criteria didn’t quite hit the mark’ (p.59) and that some of the nuance expected in good IPA ‘got lost in translation!’ (p.59) during the process of conducting the review and writing the 2011a paper.

Smith continues with some examples of exceptions that might occur along the lines of the compensation caveat, all the while emphasising the need for a good quality narrative (aka good quality writing and depth of interpretation).

He then clarifies and concludes the compensation discussion thus:

‘While a paper should present a considerable amount of evidence, not all extracts need carry the same weight and a single comment from a participant can have significant interpretative leverage for the corpus overall’ (p.59).

Arguably, this position is also outlined in the ‘diving for pearls’ paper that was published in the QMiP Bulletin the same year, 2011, and which you are also highly recommended to read and reflect upon:

Smith, J.A. (2011). ‘We could be diving for pearls’: The value of the gem in experiential qualitative psychology. QMiP Bulletin, Issue 12, October 2011

What is the most recent position on prevalence from ‘the dons’ of IPA?

The position on prevalence in IPA is further updated in the more recent second edition of ‘THE book’ where Smith, Flowers and Larkin (2022, p.104) go as far as to state:

‘As a rule of thumb, one can decide that in order to make a Group Experiential Theme plausible it must be inhabited by at least half the participants in the study. However, this is not a hard and fast rule’.

They do add some (inevitable) caveats to this, arguably encouraging flexibility in how these principles are applied to our analysis.

They note that important themes may have ‘smaller patterns of convergence’ (i.e., may be seen in fewer than 50% of participants) which they describe as sometimes due to ‘compensation’ (p.104).

This refers to the caveat mentioned above, which they elaborate upon, thus:

That a GET may ‘arise from the distinctive concerns of a small subset of participants’ and that this will be data driven and ‘influenced by the level of detail in the analysis’ (p.104).

In other words, they are encouraging us to hold the 50% notion lightly, and to consider ‘interest factors’ (p.104) alongside how many participants contribute evidence to a theme/GET when considering whether we can or should include it.

So, a GET/theme with a small number of participants contributing to it, or even a single participant, with particularly interesting data may gain compensation for a less than ideal evidence base.

In this case, the theme’s interest or contribution to the analysis offsets the need for an elevated density of evidence.

In conclusion

My reading of all this is that themes/GETs evidenced with data from a single participant, and/or with an uneven emphasis on a single or just a few participants, ARE acceptable if the account of the individual(s) concerned is sufficiently interesting and makes a sufficient contribution to our understanding of the analysis to warrant inclusion.

Smith, Flowers and Larkin (2022) go on to discuss the illustration of prevalence via evidence (i.e., data quotes plus commentary) still on page 104. Here they outline that:

GETs ‘expressed at a broad level’ are likely to have more illustrative instances than those ‘expressed at a more specific level’.

In other words, the broader your GET and the more participants illustrating your GET, the more examples/data quotes you are likely to have (NOTE: this is not rocket science!) BUT again, this will be data driven and likely to be different for all of us.

Of course, Smith et al. (2022, p.104) always urge that we are transparent about this type of research decision and the ‘processes and decisions being made and how the evidence is being presented’.

To this end, we must make sure that we provide a clear and unequivocal rationale and explanation for why we may have included a compensated theme in the manner described above.

So, it might be that you end up with a theme (or even GET) that is only illustrated by a single participant as it offers such strong ‘compensation’ that it warrants being left as such.

OR it may be that this participant’s theme could in fact be included in another theme as an example of divergence or a ‘disconfirming case’ that perhaps illuminates the rest of the accounts in that theme in some way, even if this is due to their experience being significantly different.

There is a good example of this in one of the exemplars for IPA from the appendix of the Smith and Nizza (2021) text:

Dwyer, A., Heary, C., Ward, M., & MacNeela, P. (2017). Adding insult to brain injury: young adults’ experiences of residing in nursing homes following acquired brain injury, Disability and Rehabilitation, DOI: 10.1080/09638288.2017.1370732

If you look at the GET ‘Institutional life: disempowerment and dehumanisation’ (page 6), it demonstrates a good example of polarisation with most of the ES/PETs presented being at odds with the single ‘disconfirming’ case of Sean.

Here we see that the contradictory aspect of Sean’s experience is highlighted through clustering with other participants’ experiences that are at odds with Sean’s. This illuminates the complexity of the experience and the idiosyncratic and unique ways that participants experience and live this theme.

Final words

I honestly believe that the moral of this story should be that quality cannot be achieved by simply following a checklist or abiding by numerical prevalence guidelines, and I encourage you all to develop your critical eye around this debate of prevalence and what can and should ‘count’ as a theme in your IPA.

To do this, I urge you to review the Health Psychology Review, Volume 5, Issue 1, reading the commentaries and critically reflecting upon the developments and considerations from 2011.

Plus, I suggest returning to the relevant section of the 2022 Smith, Flowers and Larkin text to ponder on this debate – it can be found on pages 104 to 105.

It can be very helpful to use the guidelines, both from 2011 and 2022, as a discussion point with your supervisor and study buddies as to how you may engage with them.

Remember that there are/have been concerns/critiques around the use of prevalence information and tables (in other words, codification) in ‘Big Q’ qualitative research for some time now.

So, like all research decisions, your inclusion/exclusion criteria for prevalence information and your rationale for why you have or have not chosen to include it must be explained and justified or accounted for in your thesis.

We will look at how to present this type of prevalence information in your thesis write-up in both the Introducing IPA Data Analysis and the Advanced IPA Data Analysis workshops. These cover, respectively, the mechanics of how to include this information in your master table of GETs and in your final write-up of the analysis/findings chapter.

Wishing you all the best for making these research decisions, Elena

Copyright © 2024 Dr Elena Gil-Rodriguez

These works are protected by copyright laws and treaties around the world. Dr Elena GR grants to you a worldwide, non-exclusive, royalty-free, revocable licence to view these works, to copy and store these works and to print pages of these works for your own personal and non-commercial use. You may not reproduce in any format any part of the works without my prior written consent. Distributing this material in any form without permission violates my rights – please respect them.

The information contained in this article or any other content on this website is provided for information and guidance purposes only and is based on Dr Elena GR’s experience in teaching, conducting, and supervising IPA research projects.
All such content is intended for information and guidance purposes only and is not meant to replace or supersede your supervisory advice/guidance or institutional and programme requirements, and is not intended to be the sole source of information or guidance upon which you rely for your research study.
You must obtain supervisory and institutional advice before taking, or refraining from, any action on the basis of my guidance and/or content and materials.
Dr Gil-Rodriguez disclaims all liability and responsibility arising from any reliance placed upon any of the contents of my website or associated content/materials.
Finally, please note that the use of my content/materials does not guarantee any particular grade for your work.
