What to do if you have too much data to do sufficient interpretative justice to your IPA analysis

Aka Help! I have too much data to do a decent IPA data analysis with enough depth. What should I do? Warning: long read alert!

What will this article cover?

Data analysis for your Interpretative Phenomenological Analysis (IPA) research study can be a tricky business, especially if you end up swamped with data. This article explores some strategies that you could employ if you have too much data for an in-depth IPA data analysis. You may end up with too much data if your sample size is too big for an IPA and this piece complements another article on ideal sample sizes for an IPA study that you can read here.

However, my main aim here today is to assist you in coming up with a strategy for managing an overwhelming amount of data if you have already recruited and collected your data using a sample that you now realise was too big.

In my experience, it is very uncommon for students to report that they do not have enough data for their IPA. In fact, more often than not the opposite is true, and students report feeling utterly overwhelmed with bucket loads of data spilling out of every research orifice willy-nilly.

This is frequently due to having a sample size that is too big, and/or participants who were particularly loquacious, coupled with a short timeframe for completion, meaning that they have insufficient time to do full interpretative justice to their IPA analysis of the full data set.

*Side note: What students often report is having data that they are concerned lacks depth and/or is ‘not rich enough’ but this issue is usually related to interview technique. We address this common issue robustly in my Supercharge Your Semi-Structured Interviewing workshop and on the relevant ‘How do I…?’ page here on my website.

As covered in my article on the mythical ideal IPA sample size, the main issue with having too much data for your IPA analysis is that you are likely to be totally overwhelmed and will struggle to manage the data itself and the process of analysis in a reasonable timeframe.

This tends to lead to a more superficial analysis than is recommended for a good quality IPA and, as a result, it will be liable to:

  • Lack sufficient synthesis
  • Lack sufficient interpretative depth
  • Focus more on the convergences within the data set and struggle to highlight the idiographic detail and divergences in participants’ experience of the phenomenon

As I hope you are already aware, this is definitely NOT what we want for an IPA study of decent quality.

What if I already have a sample that is too big?

If your sample size is too big, you may end up with a staggering amount of data, especially if your participants were particularly verbose. This, coupled with a lack of time, can give rise to significant overwhelm and a massive tailspin when it comes to the analytic process, for all the reasons outlined above.

Does this sound familiar?

There are several approaches that you could employ to find a way past this type of research dilemma:

Approach #1 Discard some data

Personally, I would hesitate to recommend this approach as it feels a bit disrespectful given that participants have given up their time and stories to us free of charge. It feels unsound to me to then choose to reject their data due to errors in design on our part (not to mention how on earth you would actually decide whose data to discard?!). Some might argue that it is even ethically questionable, although this has been hotly debated on the IPA discussion forum: see message #9653 on the IPA discussion group and the ensuing thread where various ‘big names’ weigh in.

NOTE: To access the archive, log into the group and go to the messages tab. Plug the #message number into the Msg# search box in the top right-hand corner just under the regular search box with the magnifying glass symbol. If you are still struggling with searching the IPA discussion forum for #/message/post numbers, please see my article ‘Making the most of the IPA discussion forum: using the archive and search functions effectively’. See the section titled ‘Searching by message number’ where I outline how to perform a search using #message numbers and provide screen shots to help you master this skill.

Furthermore, the volume of data to discard will no doubt bear on whether this is the best or most appropriate approach. It may feel more comfortable/possible/sound if it is just one or two participants, as opposed to a greater number. This will need to be evaluated on an individual case-by-case basis, and most definitely in conjunction with supervisory (and possibly ethics board) guidance.

Approach #2 Following on from the above, employ some of the data for a different, standalone, triangulation exercise of some nature

You could either include this in your thesis or publish it separately, thus honouring the data donation from your participants and doing something with it.

This idea was proffered by (you guessed it) Mike Larkin in his hugely helpful response to this question (see message #9656 in the abovementioned thread), where excess data was gathered amid recruitment confusion and some participants were discovered NOT to fit the inclusion criteria when they came to be interviewed. Hence, data was gathered, but some of the sample were inadvertently discovered to be non-homogeneous.

The question was raised on the forum of whether to discard the data or how best to include it.

Interestingly (and this is how I came upon this thread), Mike Larkin suggests developing a template from the main study and then applying this to the non-homogeneous data.

See below for more on combining IPA and template analysis.  

Approach #3 Do a preliminary ‘surface’ analysis using a different approach and identify a core sample suitable for IPA analysis

This approach has been used to explore larger data sets for both breadth and depth and the best example is that of Spiers and Riley (2019).

Reference: Spiers, J., & Riley, R. (2019). Analysing one dataset with two qualitative methods: The distress of general practitioners, a thematic and interpretative phenomenological analysis. Qualitative Research in Psychology, 16(2), 276-290. https://www.tandfonline.com/doi/abs/10.1080/14780887.2018.1543099

These authors interviewed 47 GPs (Yes, you heard that right!) and employed an initial inductive, explicitly critical realist-informed Thematic Analysis (TA) across the larger sample.

From this preliminary TA, the insights gained on the transcripts were used to decide which smaller subset was ripe for an IPA deep dive. A demographically homogeneous group of ten richer transcripts was selected for the IPA analysis.

This hugely informative and comprehensive methodological paper comes highly recommended if you think this could be the answer to your massive volume of data woes.

However, please do note that the Spiers and Riley (2019) study was designed to be conducted in this manner, and you may need to consider wider implications (e.g., methodological and ethical) if you are searching for a solution to the problem of too much data and did not design your study this way or plan to employ this approach from the get-go.

It goes without saying of course that discussion with supervisors is mandatory in this type of instance.

For further reading on Thematic Analysis (TA) and its use for qualitative data analysis, I refer you to the goddesses of all things TA, Braun and Clarke, who have written extensively and extremely eloquently on TA for many, many years.

Approach #4 Identify and select a small, core sample of transcripts for IPA analysis and then extend across the remainder of the sample using a template or framework approach

This approach was recently suggested by Mike Larkin on the IPA discussion forum – to read the original, please see message #11693 in the group.

It was posited by Mike as a potentially ‘ethically-appropriate’ solution to managing an excess of data that you will not be able to do sufficient interpretative justice to (for the reasons mentioned above) without discarding any.

I found five examples of this combination in the literature, some more useful than others.

I will walk you through them in order of usefulness after discussing some methodological papers that could be helpful should you consider this approach:

Methodological paper one:

Reynolds, F. (2003). Exploring the meanings of artistic occupation for women living with chronic illness: a comparison of template and interpretative phenomenological analysis approaches to analysis. British Journal of Occupational Therapy, 66(12), 551-558.

This paper is a super-useful, easy read and highly recommended. It explores a study using a template approach to analyse written narrative accounts and subsequently an IPA analysis on interview data.

Reynolds presents a thoughtful reflection on conducting both studies and makes extremely helpful comparisons between the two approaches. Reynolds also gives a discerning review of her take on the strengths and limitations of both methodologies.

Please do note, however, that it is not a combined study; nonetheless, the level of analytical comment in this article could provide a helpful scaffold from which to consider a combined approach.

Methodological paper two:

If you want a useful methodological read about template analysis, I recommend the following:

Brooks, J., McClusky, S., Turley, E., & King, N. (2015). The utility of template analysis in qualitative psychology research. Qualitative Research in Psychology, 12, 202-222.

This paper gives a valuable outline of the approach and its underpinning epistemology and then goes on to provide three handy case examples of research conducted employing a template approach.

The third case study outlined employs a combination of template analysis (TeA) and an interpretive phenomenological approach. Now, please bear in mind it is not an IPA approach, it is an interpretive phenomenological approach; however, there are many similarities and I felt it was nonetheless useful.

Examples in the literature that may help build an argument for employing this approach

As mentioned above, I found just five examples of this combination in the literature at the time of writing, some more useful than others. I will walk you through them in order of usefulness – from most to least useful, in my humble opinion, obvs!

The exemplar: the most useful example, with the best level of methodological detail

Bond, J., Robotham, D., Kenny, A., Pinfold, V., Kabir, T., Andleeb, H., Larkin, M., Martin, J.L., Brown, S., Bergin, A.D., Petit, A., Rosebrock, L., Lambe, S., Freeman, D., & Waite, F. (2021). Automated virtual reality cognitive therapy for people with psychosis: protocol for a qualitative investigation using peer research methods. JMIR Research Protocols, 10(10), e31742. doi: 10.2196/31742

This article is a protocol submission for a study taking place right now – you can read it here.

This particular study is a peer research study exploring the user experience of automated virtual reality (VR) therapy for anxious social avoidance in people with psychosis. The aim is to inform future implementation strategies for this novel therapy within mental health services.

The authors have conducted semi-structured interviews with 25 participants with psychosis. The analytic strategy is for six transcripts to be analysed initially by two separate researchers using IPA.

The selection criteria for these transcripts are loosely defined as: ‘based on the richness of the data and the chosen perspective’ (p.5), indicating that some researcher judgment will be employed to decide which of the transcripts meet these criteria.

NOTE: this is definitely something you may need to consider if you plan to employ this approach – how will you define the criteria for selecting your sub-set for the IPA? More on this below.

Once the six cases have been analysed and the resultant themes from both analysts reviewed and merged, a provisional template will be developed. Then the remaining data set will be grouped, and each grouping subjected to a TeA in turn.

As per King (2004), the template will be revised and updated (if necessary) following the analysis of each grouping. Once all transcripts have been processed, they will be reviewed and re-processed as required according to the template.

The end result will comprise the final, revised template (i.e., a set of themes) that covers the full data set, including the six cases analysed using IPA.

This article could be useful in arguing the case for this approach as it states that ‘Template analysis is a flexible approach that can be used in concert with IPA to enable larger samples to be analysed. The multiple perspectives design provides a structure for exploring both the individual case and considering cases of directly related groups.’ (2021, p.5).

The authors also outline the dual approach as being useful in lending the study both depth and breadth, especially given that it is a multi-perspectival design.

Side note: there is also a useful, albeit brief, discussion of the epistemological position adopted for this study and how this relates to the contextual information that is also supplied for the study.

Additional reference: King, N. (2004). Using templates in the thematic analysis of text. In C. Cassell & G. Symon (Eds.), Essential Guide to Qualitative Methods in Organizational Research (pp. 256-270). London, UK: Sage Publications.

Taking the silver medal as the original forerunner from way back in 2013

Dennis, N.L., Larkin, M., & Derbyshire, S.W.G. (2013). ‘A giant mess’ – making sense of complexity in the accounts of people with fibromyalgia. British Journal of Health Psychology, 18, 763-781.

This article was the first to be published employing the TeA and IPA combo – we could perhaps call it a forerunner.

However, the focus of the article is not methodological (hence sitting in second place, despite being published much earlier than any other), although a clear explanation is given of the procedure that was employed.

Nonetheless, it provides a shining example of what can be achieved using this combination for a larger than average IPA data set.

The study involved a large sample of twenty participants with a diagnosis of fibromyalgia taking part in asynchronous email interviews.

The two stages of analysis are described as linked and ‘complementary’ (p.766) with eight transcripts selected for the first phase of the analysis: the IPA. The criteria for selection involved transcripts that were the ‘closest to the median length’ of 12-15 pages (p.766).

The aim of this phase is described as developing a ‘core’ analysis that was ‘in-depth, bottom-up, [and] experientially focused’ (p.766).

A template was generated from this initial work and employed to interrogate the remaining twelve transcripts in line with TeA principles. As per King (2004) and the Bond et al. (2021) study described above, the analysis was extended and refined through the incorporation of new material from the residual data via the TeA.

What makes this paper stand out for me is the analytic account: it gave me a real sense of how it might be to live with the condition and brought the participants’ experience to life. I found it compelling, plus it was easy to follow and well organised. The discussion hosts a thoughtful reflection on the analysis and on email interviews (as opposed to face-to-face), which may be useful if relevant to your own study.

Finally, as the authors outline on page 764: ‘This study adopts reasonably novel methodological features, firstly to engage with participants who might find face-to-face interviews tiring and inconvenient, and secondly to combine in-depth, bottom-up qualitative analyses with more top-down processes, in order to increase credibility, coherence and transferability’ – I would agree wholeheartedly!

Taking the bronze medal for usefulness

Tour, S.K., Thompson, A., Howard, R.A., & Larkin, M. (2022). Experiences of blogging about visible and long-term skin conditions: interpretative phenomenological analysis. JMIR Dermatology, 5(2), e29980.

The tactic of combining an IPA with TeA was employed in this study, although the article’s focus is not the methodological issues involved in applying this approach to your IPA.

This study investigated how individuals with chronic, visible skin conditions experience the practice of blogging about their problem for the purpose of self-management.

The researchers employed two methods of data collection: email interviews with individuals and blog content freely available on the web.

Four blogger participants took part, completing email interviews of up to ten emails over a period of six weeks. A full IPA analysis was conducted on the transcripts from the four interviews.

This analysis was employed to create the initial template for the TeA of the first five and the most recent five blog posts of each blogger interviewed by email (i.e., ten posts per blogger – a significant amount of additional data to process).

The template was revised and updated across the analysis of the blog material.

Side note: Interestingly, participants were sent the preliminary themes and findings for confirmation of fit, however, only one responded to this attempt at member checking.

Coming in at fourth place for utility

Weber, L., Voldsgaard, N.H., Holm, N.J., Schou, L.H., Biering-Sørensen, F., & Møller, T. (2021). Exploring the contextual transition from spinal cord injury rehabilitation to the home environment: a qualitative study. Spinal Cord, 59, 336-346.

This Danish study looked at knowledge and skills transfer from a spinal cord injury (SCI) unit to the home environment. The authors conducted 14 semi-structured interviews with people with SCI and employed ‘two complementary methods, IPA and template analysis’ (p.337).

It is not altogether clear from the article write-up exactly how these complementary analyses were conducted.

It seems that an IPA was initially conducted in a bottom-up manner and then, at the point of the cross-case analysis, ‘transfer of training theory’ (p.337: derived from organisational and educational research) was employed as an interpretative lens (or template?) for the final stages of the IPA analysis.

The researchers describe it thus: ‘the results of the IPA were synthesized with the template analysis, leading to the emergence of the central themes’ (p.338). Hmmm, a bit vague, if you ask me….

As per King (2004), the final template was revised and updated to include all ‘meaningful entities from the data set’ (p.338).

While the details are a little woolly compared perhaps to the laser-sharp precision of Bond et al. (2021), who took first prize for usefulness, this is an interesting approach to working with a larger data set that incorporates a TeA approach alongside an IPA.

It appears to have facilitated some theoretical framing of the data via the TeA part of the analytic strategy once the bottom-up initial IPA had been executed – an interesting hybrid approach to incorporating top-down, a priori concepts into an IPA.

No disrespect intended, but I am rejecting this one in terms of usefulness for our purposes

Naidoo-Chetty, M., & du Plessis, M. (2021). Job demands and job resources of academics in higher education. Frontiers in Psychology, 12, 631171. 

This South African study interviewed 23 academics at a public university and applied IPA ‘in conjunction with’ (p.3) TeA.

The authors report completing the IPA analysis and then constructing the themes into a template which was then divided into various categories.

This study does not appear to have conducted a full TeA subsequent to an IPA on a sub-set of the sample, so it may be less useful for the purposes of arguing this type of approach to a larger dataset.

My other concern with this paper is that the authors refer to saturation in terms of their IPA! As we all know (I very much hope!), this is not a relevant concept for IPA, and this adds to my feeling that this paper is the least useful example unearthed for the purposes of arguing the toss for a combo of IPA and TeA. 

Methodological considerations: How do I choose a sub-set for my preliminary IPA?

Mike Larkin helpfully offers some guidance on possibilities in message #11693 on the forum.

His words of wisdom in outlining some options for deciding on the ‘core’ sample for the IPA part of this combo process are paraphrased and summarised here for you:

Option 1:

Choose the most homogeneous sub-sample, where homogeneity could be along any dimension – whichever important characteristic is shaping your study’s focus.

Option 2:

Choose the most heterogeneous sub-sample, as you may not want to lean too much in any one direction. Mike Larkin gives the example of including ‘as much within-sample heterogeneity as you can, so that you don’t – for example – overwrite women’s experiences with men’s’.

Option 3:

Choose the ‘richest’ sub-sample, in other words, the transcripts with the most phenomenological detail and description.

Option 4:

Choose a sample that is characterised by a specific relationship to the phenomenon of interest. Here Mike uses the example of the difference between learning to drive a manual car versus an automatic if we were looking at the experience of learning to drive and had interviewed participants who had learned either.

NOTE: Dr Larkin caveats that this list is not exhaustive and there may be alternatives that fit the framework and perspective that your study takes and suit its aims. Above all, he rounds off: ‘The key thing – as you can guess – is that it [your choice of sub-sample] makes sense in the context of your study, and for your research aims.’

Final thoughts

Of course, there is also the combo of IPA and framework analysis (Hurrah! More IPA nerdery! Who even knew there were so many options!), but I will cover this in another article as we are already approaching four thousand words here, and surely this is more than enough for anyone’s IPA excessive data dilemma!

In conclusion, however you decide to deal with your issue of having too much data to do sufficient justice to your IPA analysis, please, please, please always remember to give a rationale for your research decisions in your write-up and reflect upon how your choices may have shaped your study and potentially influenced your findings.

And finally, it would be remiss of me not to mention my IPA Data Analysis workshops here – there are two of them covering every step of the process that you need to follow to analyse your data for your IPA.

Wishing you all the best with manhandling your data into submission, Elena

Copyright © 2023-24 Dr Elena Gil-Rodriguez

These works are protected by copyright laws and treaties around the world. Dr Elena GR grants to you a worldwide, non-exclusive, royalty-free, revocable licence to view these works, to copy and store these works and to print pages of these works for your own personal and non-commercial use. You may not reproduce in any format any part of the works without my prior written consent. Distributing this material in any form without permission violates my rights – please respect them.


The information contained in this article or any other content on this website is provided for information and guidance purposes only and is based on Dr Elena GR’s experience in teaching, conducting, and supervising IPA research projects.
All such content is intended for information and guidance purposes only and is not meant to replace or supersede your supervisory advice/guidance or institutional and programme requirements, and are not intended to be the sole source of information or guidance upon which you rely for your research study.
You must obtain supervisory and institutional advice before taking, or refraining from, any action on the basis of my guidance and/or content and materials.
Dr Gil-Rodriguez disclaims all liability and responsibility arising from any reliance placed upon any of the contents of my website or associated content/materials.
Finally, please note that the use of my content/materials does not guarantee any particular grade for your work.
