A product manager asks: "We interviewed 40 customers about why they churned. How do I turn this into something actionable?"
A consultant asks: "I did 25 expert network calls on market dynamics for our due diligence. How do I extract the key themes?"
An HR analyst asks: "We surveyed 200 employees and got back really detailed answers to our open-ended questions. Where do I even start?"
The answer in all three cases is the same: thematic analysis.
But when most business people hear "thematic analysis," they think of academic research papers with dense methodology sections that smell like months of work. That's not what this is about. This is about the practical version - how to systematically find patterns in qualitative data when you have business decisions to make and deadlines to meet.
I spent 18 years at McKinsey, and while working there I gradually realised I was using thematic analysis in almost every project - from due diligence to strategy work to organizational change - without knowing it. I decided to get smarter and more systematic about how to do this, and talked with academics, including my now co-founder Henri, about what the method is actually about. What could I learn from the academic practice of thematic analysis and translate to the business world?
I wrote this blog post to demystify the concept and show how thematic analysis actually works.
What thematic analysis actually is
Let me start with a practical definition of thematic analysis. To strip away the jargon, thematic analysis is a systematic way to answer: "What are the main patterns in what people are telling us?" In other words, thematic analysis is a method for identifying, analyzing, and reporting patterns (themes) within qualitative data, including interview transcripts, reports, customer feedback, or any other sources of text.
In plain English: you read through interviews, documents, or survey responses, and you systematically identify the recurring ideas that matter.
It's not:
- Word frequency analysis (just lazily counting how many times "price" appears...!)
- Cherry-picking quotes that support what you already believe (good for half of the stuff, misses the critical other half...)
- Having ChatGPT summarise everything (a black box generic answer with zero client credibility...)
- Vague hip shot impressions from skimming the data (leave that to the senior VPs or Partners... but feed them processed good data first!)
It is:
- Systematic assignment of text to categories
- Grouping related categories into broader themes
- Identifying patterns, outliers, and relationships
- Building a defensible evidence base for your conclusions
The power comes from the systematic nature. When done properly, you can point to any insight and trace it back to the specific evidence. When a stakeholder asks "How do you know that?" you have an answer.
When to use thematic analysis in the realm of business?
You might not call it "thematic analysis" in your organisation, but you're doing some version of it whenever you:
- Analyse customer interviews to understand pain points and feature requests
- Process expert network calls for due diligence or market research
- Review employee feedback from engagement survey open text responses or exit interviews
- Synthesize stakeholder input for strategy or change initiatives
- Evaluate competitive intelligence from customer interviews mentioning competitors
- Study customer support tickets to identify product issues
- Assess policy consultation responses for government or regulatory work
In all these cases, you have large amounts of unstructured qualitative data and you need to extract patterns that inform decisions.
What you are doing is thematic analysis. It's your choice whether you do it properly, or just cobble something together and hope for the best.
The 6 phases of thematic analysis: business edition
If you study the academic side of thematic analysis, you will find references to Braun & Clarke's 2006 article, where they defined six phases of thematic analysis for research. That is the approach to take when writing something to publish in top journals after months of rigorous study. What I'm going to give you is the business-focused version - 80/20 in terms of rigor, using business language and making it practical.
Phase 1: familiarise yourself with the data
What it means: Read everything without trying to analyse it yet.
This sounds obvious, but most people skip it. They want to jump straight to "finding insights" because reading 30 interview transcripts feels inefficient. It's not. You need to understand the landscape before you start mapping it.
What this looks like in practice:
- Block some uninterrupted time (and hand your phone to a colleague if that is what it takes!)
- Read through all your data start to finish
- Take high-level notes on initial impressions
- Notice recurring phrases, surprising statements, or emotional reactions
- Resist the urge to start categorising yet
I covered why this matters in my guide to analysing interview transcripts. The short version: if you start coding on your first read-through, you'll miss the forest for the trees. Only in some cases (e.g., just having conducted the interviews with fresh data in your head) should you skip this step.
Example: In a customer churn project, your incoming hypothesis could be that the issue was pricing. Based on the read-through, you notice subtle references to "implementation complexity" and "lack of onboarding support" in nearly every interview. If you were looking for snippets to prove your hypothesis, you would end up building your entire analysis around the wrong theme...
Phase 2+3: generate codes and themes
What it means: Identify meaningful chunks of data and label them to create patterns.
In classic thematic analysis for academic purposes, phases 2 and 3 are kept separate. You are first (phase 2) supposed to create codes and classify individual paragraphs with one or more codes, and only then start to look for bigger themes (phase 3). For example, if someone says "The implementation took 6 months instead of 2 weeks," you might code that as [implementation timeline], [expectation mismatch], and [onboarding challenges], and only in phase 3 start to look at the bigger picture and the links between codes. This thorough coding can take hours per interview, and it's normally the step where business people give up in the face of time pressure: instead of doing anything systematic at all, they just shoot from the hip.
When doing thematic analysis for business, I would combine phases 2 and 3 and proceed as follows:
- Start with an initial list of insight types based on your research questions or topics of interest (e.g., "Reasons for churning", "Reasons for staying", ...)
- Break down the insight types into categories (e.g., "Pricing related churn", "Service related churn", ...) and sub-categories (e.g., "Price hike after discount ends")
- Start reading through the transcripts and map relevant quotes/paragraphs/statements/snippets to your category structure
- If the quote doesn't fit your structure, adapt the structure by creating a new category or sub-category
- If one sub-category starts to get too big (high # of quotes) or too loose (quotes feel they describe different things), consider splitting it further
Some good practices for this process:
- Map at the paragraph or thought level (not sentence-by-sentence)
- Use descriptive names: "pricing concerns" not "issue_3", and document what each category means (you'll forget in 3 days!)
- A single passage can go to multiple sub-categories (that's expected!)
- Start with a simple list and keep iterating it as you go. After about 5 interviews you typically have a stable category structure
- Don't ignore things just because they don't fit your schema - keep adding categories as you go for all relevant insights!
In terms of tooling, I've seen different "manual approaches" and they all have their pros and cons:
- Visual sorting. Some like to print the statements and physically sort them to different piles. This is fun to do, but often requires lots of work afterwards to replicate the physical structure to the actual digital data for sharing and deeper analysis.
- Systematic Excel. If you can get the data into a form where each row in Excel is a paragraph, you can have the categories as columns and put a simple X in each sub-category column where the text is relevant, and then use filtering and sorting to see what is in each bucket. Works for shorter transcripts, but getting the data into the sheets in usable form often causes headaches, and Excel isn't really made for text.
- Comment function in Word. Very easy to mark stuff, but hard to then find the quotes belonging to each category afterwards. I see this approach attempted a lot as Word is a familiar tool, but can't really recommend it to anyone :)
- Copy-pasting. Creating the category structure to one document, and then copying and pasting text from the source documents to it. Often some headaches in formatting and keeping track of the source, but effective in creating something close to the end-product quickly. To ensure you capture all the insights, consider using cutting instead of copying so you can check that the "residual documents" don't contain anything you missed.
- Dedicated tools from academia. There are specialised tools used by academics, like NVivo, Delve, ATLAS.ti and MAXQDA, that have a steep learning curve but are actually made for exactly this task. In addition to being hard to use (and quite expensive!), they also assume phases 2 and 3 are done separately, so you have to work against their built-in logic when using them for more practical business research.
When still doing this manually, I preferred a mix of copy-pasting (when doing it at the desk) and visual sorting (when doing it interactively e.g., in a workshop). During COVID-19 I also tried digital boards like Miro for the task, but found them a bit painful as you get the formatting hassle and delays from using digital tools but still end up with a mess that doesn't allow smart analysis afterwards.
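For readers who prefer a script to a spreadsheet, the "systematic Excel" layout described above can be sketched in a few lines of Python. This is only an illustration of the one-row-per-paragraph, one-column-per-sub-category idea; the paragraphs, source ids and sub-category names are all made up, and the CSV output is just one convenient way to get the result into a sheet.

```python
import csv
import io

# Hypothetical sub-categories and paragraphs, for illustration only.
SUB_CATEGORIES = ["Pricing related churn", "Service related churn", "Onboarding gaps"]

rows = [
    {"source": "interview_01",
     "text": "The price doubled after the discount ended.",
     "marks": {"Pricing related churn"}},
    {"source": "interview_01",
     "text": "Nobody helped us set it up, and support was slow.",
     "marks": {"Service related churn", "Onboarding gaps"}},
    {"source": "interview_02",
     "text": "We never got proper onboarding.",
     "marks": {"Onboarding gaps"}},
]


def to_csv(rows, sub_categories):
    """One row per paragraph, one column per sub-category, 'X' where relevant."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["source", "text", *sub_categories])
    for row in rows:
        marks = ["X" if name in row["marks"] else "" for name in sub_categories]
        writer.writerow([row["source"], row["text"], *marks])
    return buffer.getvalue()


def filter_by(rows, sub_category):
    """Equivalent of filtering the sheet on one column."""
    return [row for row in rows if sub_category in row["marks"]]


sheet = to_csv(rows, SUB_CATEGORIES)
onboarding = filter_by(rows, "Onboarding gaps")
print(sheet)
print(len(onboarding), "paragraphs marked 'Onboarding gaps'")
```

Keeping the source id on every row is what lets you trace a paragraph back to its interview later, which is the part that tends to get lost in the Word and copy-paste approaches.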
Concrete example from a market research project:
Customer quote: "We evaluated three vendors. All were within budget - $5K, $7K, and $6K annually. We went with the incumbent despite them being most expensive because switching would mean retraining 50 people during our busy season. The disruption cost wasn't worth the $2K savings."
This could go to multiple sub-categories:
- [Budget considerations] - price is mentioned
- [Competitive evaluation] - comparing vendors
- [Switching costs] - retraining mentioned as barrier
- [Seasonal timing] - busy season influences decision
- [Change management] - disruption concerns
That's five codes for one paragraph. That's appropriate because this quote speaks to multiple themes.
After this phase, your data should look as follows:
Insight type 1: Reasons for churn
Category 1.1: Poor support responsiveness
Sub-category 1.1.1: Support responses are slow
Sub-category 1.1.2: Can't reach account manager
Sub-category 1.1.3: No weekend support
Individual quotes for sub-category 1.1.1:
- "I called my rep, he said he was in a meeting and would get back to me asap... still waiting for the call 3 months later..."
- "Emails seem to take ages to be acknowledged, I can't believe I pay 300 EUR per month and treated like air"
- "The ticket system just shows my ticket as unopened for hours after sending despite marking it urgent"
- "I am not sure anyone is monitoring their emails at all, I get faster response asking online for advice"
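If you keep this structure in code rather than in a document, a nested mapping is one simple way to represent it. The sketch below is an assumption about how you might store it, not a description of any particular tool: the source ids are hypothetical, the quotes are abridged from the ones above, and `map_quote` is an illustrative helper name.

```python
from collections import defaultdict


def new_tree():
    """insight type -> category -> sub-category -> list of (source, quote)."""
    return defaultdict(lambda: defaultdict(lambda: defaultdict(list)))


def map_quote(tree, source, quote, placements):
    """File one quote under every place it speaks to.

    Branches that do not exist yet are created on the fly, which mirrors
    adapting the structure when a quote does not fit.
    """
    for insight_type, category, sub_category in placements:
        tree[insight_type][category][sub_category].append((source, quote))


def print_tree(tree):
    for insight_type, categories in tree.items():
        print(insight_type)
        for category, sub_categories in categories.items():
            print("  " + category)
            for sub_category, quotes in sub_categories.items():
                print(f"    {sub_category} ({len(quotes)} quotes)")
                for source, quote in quotes:
                    print(f'      - "{quote}" [{source}]')


tree = new_tree()
map_quote(
    tree,
    "interview_07",  # hypothetical source id
    "Emails seem to take ages to be acknowledged",
    [("Reasons for churn", "Poor support responsiveness", "Support responses are slow")],
)
map_quote(
    tree,
    "interview_12",  # hypothetical source id
    "I called my rep, he said he would get back to me asap",
    [  # one passage can go to several sub-categories
        ("Reasons for churn", "Poor support responsiveness", "Support responses are slow"),
        ("Reasons for churn", "Poor support responsiveness", "Can't reach account manager"),
    ],
)
print_tree(tree)
```

Storing the source next to every quote costs almost nothing at this stage and is what makes the later phases (coverage checks, traceability) possible.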
Phase 4: Review and refine themes
What it means: Check your themes actually work, then polish them.
This is the "Does this make sense?" phase. You go back to your tree and check:
Internal coherence: Do all the items in a sub-category and in a category actually belong together, or should you split?
External distinctiveness: Are your sub-categories meaningfully different from each other, or do some need merging?
Evidence quality: Do you have enough data to support this sub-category, or just 1-2 quotes?
Practical checks:
Read all quotes for a sub-category: Do they actually tell a coherent story? If half the quotes are about pricing and half are about features, you probably need to split the theme.
Compare sub-categories: If they're 50% overlapping, merge them. If the distinction feels forced, it probably is.
Look at coverage: If you have a sub-category supported by 2 quotes from 1 interview out of 40, that's not a theme - that's an outlier. Either drop it or reclassify it - or try to understand if there is something actually interesting in it that merits deeper study.
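The overlap and coverage checks above are easy to automate once every quote carries its source. The following is a minimal sketch under stated assumptions: the interview and quote ids are invented, overlap is measured here as the share of quotes two sub-categories have in common (a Jaccard index, which is one reasonable choice rather than the only one), and the two-interview threshold is a placeholder you should set for your own project.

```python
# Each sub-category maps to a list of (interview_id, quote_id) pairs.
# All ids are hypothetical.
coded = {
    "Support responses are slow": [(1, "q1"), (4, "q2"), (9, "q3"), (9, "q4")],
    "Can't reach account manager": [(4, "q2"), (9, "q3"), (11, "q5")],
    "No weekend support": [(7, "q6"), (7, "q7")],
}


def coverage(coded):
    """Number of quotes and distinct interviews behind each sub-category."""
    return {
        name: {
            "quotes": len(items),
            "interviews": len({interview for interview, _ in items}),
        }
        for name, items in coded.items()
    }


def overlap(coded, a, b):
    """Share of quotes that two sub-categories have in common (Jaccard index)."""
    quotes_a = {quote for _, quote in coded[a]}
    quotes_b = {quote for _, quote in coded[b]}
    return len(quotes_a & quotes_b) / len(quotes_a | quotes_b)


def thin_sub_categories(coded, min_interviews=2):
    """Sub-categories supported by fewer interviews than the chosen minimum."""
    return [
        name
        for name, stats in coverage(coded).items()
        if stats["interviews"] < min_interviews
    ]


shared = overlap(coded, "Support responses are slow", "Can't reach account manager")
thin = thin_sub_categories(coded)
print(f"Overlap: {shared:.0%}")
print("Possible outliers:", thin)
```

The numbers only tell you where to look. Whether a thin sub-category is noise or an early signal worth deeper study is still a judgment call.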
Example refinement from a strategy project:
Initial sub-category: "Digital transformation challenges" with a lot of quotes in it
After review, I realised this was actually three distinct sub-categories:
- Sub-category 1: "Legacy system integration barriers" (technical)
- Sub-category 2: "Workforce lacks digital skills" (capability)
- Sub-category 3: "Leadership doesn't understand digital" (strategic)
Each needed its own sub-category because they have different implications and solutions.
Phase 5: Define your categories and sub-categories
What it means: Write clear definitions and give categories names that make sense based on the evidence collected
This is where you make your themes presentable. Each theme needs:
A clear name: Avoid abstractions, overly fancy names, and jargon. Your theme names should tell someone what the theme is about without explanation.
Bad names:
- "Temporal uncertainty dynamics"
- "Resource allocation friction"
- "Stakeholder perception divergence"
- "Let's do it!"
Good names:
- "Timeline pressure from leadership reduces quality"
- "Budget constraints limit options for 2026"
- "Sales and Product teams have different views on pricing"
A 2-3 sentence summary: Spell out what this theme captures and why it matters.
Example theme definition from a due diligence project:
Sub-category name: "Customer concentration risk is higher than disclosed"
Summary: While the company reports that no single customer exceeds 10% of revenue (true), interviews reveal that their top 3 customers are all owned by the same private equity firm and are all evaluating the same competing product. Loss of the PE relationship could mean losing 25% of revenue simultaneously. This concentrated dependency risk is not visible in standard customer concentration metrics.
Supporting evidence: 6 of 12 customer interviews mentioned the PE connection; 3 mentioned active competitor evaluations.
Why this level of detail matters: Anyone reading your analysis should understand not just what the theme is, but why you created it and what evidence supports it.
I would do this step for all themes that even roughly fit your research questions or interest. In phase 6 you will narrow your insights down, but it's good to have a library of insights ready in case you need to revisit your data later.
Phase 6: Produce the end product
What it means: Translate your thematic analysis into the format stakeholders need.
This is not about writing an academic paper. It's about putting your themes into whatever format drives decisions in your context:
- Consulting deck: Each theme becomes a slide with key evidence
- Market research report: Each theme becomes a section with supporting quotes
- Product strategy doc: Themes become "What we learned" with implications
- Executive brief: Themes become bullet points with evidence and recommendations
Structure for business deliverables:
For each theme:
- State the theme (clear headline)
- Summary of findings (2-3 paragraphs)
- Provide evidence (3-5 representative quotes + other data)
- Add context (are there patterns by customer segment, geography, role?)
- Show implications (what does this mean for our decision?)
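The "add context" step is essentially a count of who raised the theme, broken down by segment. A small sketch of that breakdown follows; the interview ids and tiers are invented purely for illustration, and `mentions_by_segment` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical coded mentions for one theme: (interview_id, customer_tier).
mentions = [
    ("int_01", "SMB"),
    ("int_02", "SMB"),
    ("int_03", "Mid-market"),
    ("int_04", "SMB"),
    ("int_04", "SMB"),  # same interview raised the theme twice
    ("int_05", "Enterprise"),
    ("int_06", "SMB"),
]


def mentions_by_segment(mentions):
    """Count distinct interviews raising the theme in each segment."""
    distinct = set(mentions)  # count each interview once per segment
    return Counter(segment for _, segment in distinct)


by_tier = mentions_by_segment(mentions)
total = sum(by_tier.values())
for tier, count in by_tier.items():
    print(f"{tier}: {count} of {total} interviews")
```

Counting distinct interviews rather than raw quotes avoids one talkative customer making a pattern look stronger than it is.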
Example slide structure:
Slide title: Support Response Time Drives Churn
Summary of findings: Customers who churned cited support response time as primary factor, but the issue isn't speed alone - it's the perception that "low-value customers get slow support."
Evidence:
- "We pay $50/month and wait 48 hours for responses. Enterprise customers get same-day support." (Small business owner)
- "The message is clear: we're not important enough for fast support." (Mid-market customer)
- "Support was fine until we were late on payment once, then response time doubled." (Churned customer)
Pattern: Support speed complaints concentrated among customers in $25-100/month tier (12 of 15 mentions). Enterprise customers ($1K+/month) did not mention this.
Implication: Two-tier support structure is visible to customers and creates resentment. Either improve SMB response times or make support tiers explicit in pricing.
Critical point: Don't try to use every sub-category. Focus on the storyline-relevant ones. In a typical project, maybe 60% of sub-categories from phase 5 make it into the main deliverable. The other 40% are background context or interesting-but-not-critical findings that go into an appendix or get dropped.
Common Mistakes in Business Thematic Analysis
Mistake 1: Too many categories
If you have 20 categories, you haven't analysed - you've just reorganised your data.
The fix: Be ruthless about consolidation. Categories should represent patterns, not topics. If categories are very similar, merge them. Use sub-categories to add nuance where needed.
Mistake 2: Final analysis is about topics, not insights
Bad title: "Pricing" (that's a topic)
Good title: "Pricing increases outpacing perceived value delivery" (that's an insight)
Bad title: "Customer support"
Good title: "Support response time creates perception of second-class treatment"
The fix: Every category definition in phase 6 should contain a verb or imply a relationship. It should tell someone something, not just label a category.
Mistake 3: Confirmation bias
You had a hypothesis going in. Suddenly all your themes confirm it. Convenient!
The fix: Actively look for disconfirming evidence. If all your themes align perfectly with your initial assumptions, you're probably not listening to the data.
Mistake 4: Losing traceability
You have themes, but you can't easily trace them back to specific quotes or show which interviews contributed to each theme.
The fix: Maintain clear links between insight types → categories → quotes → source interview transcripts/audio. Tools help here (whether Excel, NVivo, or Skimle).
Mistake 5: Treating all themes equally
Not all themes are equally important for your decision. But you spend equal time on each.
The fix: Prioritise themes by decision-relevance, not just by frequency. A theme mentioned by 5 people that's critical to the business decision deserves more attention than a theme mentioned by 20 people that's interesting but not actionable.
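One way to make this prioritisation explicit is to rank themes by a relevance score first and use frequency only as a tie-breaker. The sketch below assumes a 1-5 relevance scale assigned by the analyst; the scale, the theme names and the mention counts are illustrative, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Theme:
    name: str
    mentions: int            # how many people raised it
    decision_relevance: int  # analyst's judgment, 1 (low) to 5 (critical); hypothetical scale


themes = [
    Theme("Interesting but not actionable", mentions=20, decision_relevance=2),
    Theme("Critical to the decision", mentions=5, decision_relevance=5),
    Theme("Moderately relevant", mentions=12, decision_relevance=3),
]

# Sort by relevance first; frequency only breaks ties between equally relevant themes.
ranked = sorted(
    themes,
    key=lambda theme: (theme.decision_relevance, theme.mentions),
    reverse=True,
)

for position, theme in enumerate(ranked, start=1):
    print(f"{position}. {theme.name} ({theme.mentions} mentions)")
```

The relevance score is a judgment call, and writing it down is the point: it forces the conversation about which themes actually matter for the decision.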
Manual vs AI-assisted thematic analysis
Let's be honest about the time investment
Manual coding of 30-40 interviews using the traditional approach:
- Familiarisation: 4 hours
- Initial coding and category development: 16 hours
- Review and refine: 6 hours
- Create deliverable: 8 hours
Total: 34 hours, i.e. roughly a full week of work
That's not counting the time spent learning tools like NVivo or MAXQDA if you go that route (or fighting with Word suddenly autoformatting your quotes...).
Against this backdrop, it's no wonder that many consultants, market researchers and other knowledge workers were lured by the promise of tools like ChatGPT when they first came out. Could the entire work be done in minutes by just uploading the transcripts to the black box and downloading a ready report?
Experts quickly learned that simplistic AI analysis tools (like ChatGPT, Claude, Gemini) do not actually analyse the data; rather, they produce reports that resemble analysis. The challenge is that there is no way to make sure each paragraph and insight was considered, no way to check that no quotes were hallucinated or misrepresented, and no way to challenge and further work with the data. And by taking the "magic AI route" and not immersing themselves in the data, experts risk being made irrelevant, as anyone can produce AI-slop-quality analysis.
That's why we developed Skimle. It is based on translating the 40 years of experience of our two co-founders (Henri from academia, Olli from consulting) on how to do thematic analysis into a multi-step, AI-assisted workflow.
What Skimle does:
- Reads each document systematically (like a human analyst would)
- Identifies meaningful passages and creates a category structure (equivalent to Phase 2+3 analysis)
- Links every theme to source quotes (maintains full traceability)
- Presents everything in spreadsheet view (for you to merge/split/edit/hide/rename as needed)
- Summarises findings to Excel, Word and PowerPoint exports (draft deliverable for you to review and improve)
AI-assisted rigorous approach (with Skimle):
- Upload & AI processing: 1 hour
- Review AI-suggested themes and source documents: 3 hours
- Refine and reorganise: 3 hours
- Refine autogenerated deliverable: 4 hours
Total: 11 hours
Our experience indicates that Skimle speeds up the overall timeline by 3x in terms of like-for-like outputs. You then have the choice of either using this as an efficiency gain (for example, one consulting client simply did not have enough capacity to handle all their analysis, so with Skimle they were able to deliver without hiring new people), or spending more time on insights (for example, refining the themes further, including more interviews in the analysis set, performing a mid-point analysis to guide the later batch of interviews etc.).
The AI helps with the mechanical work, but you still need to do the human judgment parts (familiarisation, refinement, interpretation).
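The two time budgets above can be compared with a few lines of arithmetic. The hours are the ones quoted in this post; swap in your own estimates to see what the ratio looks like for your project.

```python
# Hours per step, as quoted in the two breakdowns above.
manual_hours = {
    "Familiarisation": 4,
    "Initial coding and category development": 16,
    "Review and refine": 6,
    "Create deliverable": 8,
}
assisted_hours = {
    "Upload & AI processing": 1,
    "Review AI-suggested themes and source documents": 3,
    "Refine and reorganise": 3,
    "Refine autogenerated deliverable": 4,
}

manual_total = sum(manual_hours.values())
assisted_total = sum(assisted_hours.values())
speed_up = manual_total / assisted_total

print(f"{manual_total} h vs {assisted_total} h -> {speed_up:.1f}x")
# prints "34 h vs 11 h -> 3.1x"
```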
Who wins, carbon or silicon?
If you compare manual coding with a rigorous AI-assisted approach, each has different advantages:
Manual coding advantages:
- You develop deep familiarity with every sentence
- Coding choices reflect your expert judgment from the start
- Some find the manual process helps them think - kind of like knitting in a meeting to stay focused
AI-assisted advantages:
- More comprehensive coverage (AI doesn't get tired and miss things)
- Faster iteration (can try different theme structures quickly)
- Often catches patterns humans miss (minority perspectives, subtle themes)
- Maintains perfect traceability (every quote linked to themes)
For business work with deadlines, I believe AI-assisted is hard to beat - as long as you go for a rigorous approach instead of trying to pass AI slop to your boss. For academic work where you have months and the methodology must be detailed and conform to existing standards, manual can be defensible. But this is rapidly changing too.
If you want to try AI-native thematic analysis that maintains academic rigor, try Skimle for free.
Related Resources
- How to analyse interview transcripts - My 5-step framework for interview analysis (thematic analysis is Step 2)
- How to summarise expert interviews - Practical workflow from 30 expert calls to synthesis
- Qualitative analysis tools comparison - How NVivo, MAXQDA, and AI tools compare
- How to conduct effective business interviews - Get better source data through better interviewing
Olli from the Skimle team
