
AI Won't Replace Your Users (But It Will Make You a Better Researcher)

There is a conversation happening in UX right now that feels familiar. On one side, you have people claiming AI will render user research obsolete. On the other, you have researchers digging in, insisting nothing can replace talking to real humans. Both camps are missing the point.

AI should not be your only means of assessing the usability of a system. But used well, it can make you significantly better at doing so.

The New Smoke Test

Here is something I have been experimenting with: taking a tool like Claude Code or an OpenAI agent, pointing it at a browser, and asking it to complete a task on a website. Navigate to a product page, find the returns policy, complete a checkout flow. That sort of thing.

It is not a usability test. It does not tell you how a confused 58-year-old with bifocals and a dodgy trackpad will struggle with your date picker. But it does something genuinely useful. It acts as a first-stage proxy, a smoke test for your interface. If an AI agent cannot figure out how to complete a basic task on your website, you have a problem that does not require five recruited participants to identify. You have a problem that needs fixing before you get anywhere near a real user.

Think of it as the equivalent of walking around the building before the fire drill. You are not testing evacuation procedures. You are checking the doors are not locked.
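The loop behind this kind of smoke test is simple enough to sketch. Below is a minimal, hypothetical version in Python: the `decide` callable stands in for the LLM, and a dictionary of link labels stands in for the rendered site. A real version would drive an actual browser (for example via Playwright) and call a model API at each step; every name here is illustrative, not a real tool's API.

```python
# Minimal sketch of an agent-driven smoke test. The `decide` callable
# stands in for the LLM; a real version would drive a real browser and
# call a model API at each step. All names here are hypothetical.

def smoke_test(start_page, goal, links, decide, max_steps=10):
    """Follow the agent's link choices until it reaches `goal` or gives up.

    links: dict mapping page -> list of link labels (each label doubles
    as the name of the page it leads to, to keep the sketch simple).
    Returns (success, path_taken).
    """
    page, path = start_page, [start_page]
    for _ in range(max_steps):
        if page == goal:
            return True, path
        options = links.get(page, [])
        if not options:
            return False, path  # dead end: a finding in itself
        choice = decide(page, options, goal)
        if choice not in options:
            return False, path  # agent picked a link that is not there
        page = choice
        path.append(page)
    return page == goal, path


# Toy site: can the agent get from the home page to the returns policy?
SITE = {
    "home": ["support", "products"],
    "products": ["product-page", "home"],
    "support": ["returns-policy", "contact", "home"],
}

# Stub "model": picks the first option sharing a word with the goal,
# else the first link. A real agent would reason over the rendered page.
def stub_decide(page, options, goal):
    for opt in options:
        if set(opt.split("-")) & set(goal.split("-")):
            return opt
    return options[0]

ok, path = smoke_test("home", "returns-policy", SITE, stub_decide)
print(ok, " -> ".join(path))  # True home -> support -> returns-policy
```

The useful signal is the failure mode: if even a cooperative agent dead-ends or loops before reaching the goal, you have found a navigation problem without recruiting anyone.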

Synthetic Users: Useful, But Not What You Think

The idea of synthetic users has generated a lot of excitement and a fair amount of scepticism. The research tells a more nuanced story than either camp would like.

The Stanford and Google study from 2024 is probably the most impressive work here. Researchers built "digital twins" from two-hour qualitative interviews with over a thousand people. Those twins replicated survey responses with 85% accuracy, which is roughly comparable to how consistently the real participants replicated their own answers when retested two weeks later. That is remarkable.

But here is where it gets interesting, and where the hype merchants tend to go quiet. That accuracy was achieved because each synthetic user was built from a rich, detailed, two-hour qualitative interview. When researchers tried building synthetic users from simple demographic descriptions or brief persona paragraphs, the results were significantly worse. The Nielsen Norman Group's own evaluation found that synthetic user responses were "too shallow to be useful" for many research activities, and that the tools proved most valuable as a form of desk research rather than as a substitute for real participants.

So the best synthetic users are created after detailed user research. There is an irony in that worth sitting with for a moment. To build a good fake user, you first need to properly understand the real ones.

Where synthetic users genuinely shine is in hypothesis generation and research planning. Using them to generate initial hypotheses before fieldwork helps you ask better questions and design more focused studies. They are good at synthesising publicly available information about a user group quickly, helping you build contextual understanding before you engage real participants. They can help you prepare. They cannot help you conclude.

What AI Cannot Do

The limitations are structural, not temporary. They are not bugs that will be fixed in the next model release.

AI cannot discover the unexpected. Real users constantly surprise researchers with behaviours, workarounds, and needs that emerge from the messy complexity of actual life. Synthetic users can only interpolate from existing data. They cannot extrapolate beyond it. They will never tell you about the customer who holds their phone in their left hand because they are carrying a toddler, or the one who pastes their password from a sticky note photo on their camera roll.

Sycophancy is baked into the architecture. Large language models are trained to be agreeable. Anthropic's own research demonstrated that user suggestions of incorrect answers can reduce model accuracy by up to 27%. Your synthetic users will tell you what you want to hear. That is a feature of the tool, not a finding.

And then there is the bias problem. Models trained on internet text reflect the biases of that data. Multiple studies have found that synthetic responses skew towards educated, Western, liberal viewpoints. Performance is measurably worse for marginalised groups. If you are designing for everyone, synthetic users are only really representing some of them.

We Have Been Here Before

In 2000, Jakob Nielsen published his famous argument that you only need to test with five users to uncover 85% of a product's usability problems. The claim was based on a mathematical model he developed with Thomas Landauer in 1993, and it was genuinely revolutionary.

At the time, usability testing was seen as expensive, time-consuming, and the preserve of organisations with serious research budgets. Nielsen's insight, backed by the maths, was that you did not need a lab full of participants and a six-week timeline. Five people, tested iteratively across multiple rounds of design changes, would get you most of the way there. His recommendation was not to run one big study. It was to run three small ones with five participants each, fixing problems between rounds.
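The maths behind the claim is worth seeing. Nielsen and Landauer modelled the proportion of usability problems found by n users as 1 - (1 - L)^n, where L is the probability that a single user reveals any given problem; across the projects they measured, L averaged about 0.31. Plugging in n = 5 gives the famous figure:

```python
# Nielsen & Landauer (1993): proportion of usability problems found
# by n test users, where L is the probability that a single user
# reveals a given problem (measured average: L ≈ 0.31).
def problems_found(n, L=0.31):
    return 1 - (1 - L) ** n

for n in (1, 3, 5, 15):
    print(n, round(problems_found(n), 2))
# 1 user  -> 0.31
# 3 users -> 0.67
# 5 users -> 0.84  (the "roughly 85%" in the claim)
# 15 users -> ~1.0
```

The curve also explains the iteration advice: the sixth user mostly re-discovers what the first five already found, so the budget is better spent on a fresh round after the fixes.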

The impact was enormous. It made usability testing accessible. It made it affordable. It gave designers and product teams a credible argument for doing any testing at all, rather than none. It did not replace the need for larger quantitative studies when the stakes demanded them, but it democratised the practice of watching real people use your product.

AI is doing the same thing now, in a different way. Using AI as synthetic users or to smoke test a website does not replace real research any more than Nielsen's five-user approach replaced large-scale quantitative studies. What it does is lower the barrier to understanding your UX landscape. It gives you a faster, cheaper way to build a foundation of insight that you can then deepen with real human research.

The Multiplier Effect

Here is the thing that gets lost in the "will AI replace researchers" debate. In the hands of an experienced UX professional, these tools do not replace expertise. They multiply it.

An experienced researcher who uses an AI agent to smoke test a prototype before their first research session will walk into that session with better hypotheses, sharper tasks, and a clearer sense of where the problems are likely to be hiding. They will waste less time on the obvious issues and spend more time exploring the subtle ones that only emerge when you watch a real person interact with a real system in a real context.

An experienced researcher who uses synthetic users to pressure test their discussion guide will ask better questions. One who uses AI to rapidly synthesise competitor research or domain knowledge will have richer contextual understanding before they speak to their first participant.

The value is not in the AI output itself. The value is in what a skilled practitioner does with it.

Practical Guidance

If you are thinking about bringing AI into your research practice, here is where the evidence points.

  1. Use AI agents as a first pass. Point them at your interface and see if they can complete your core tasks. If they struggle, you have low-hanging fruit to address before you involve real participants. This is cheap, fast, and genuinely useful.
  2. Use synthetic users to prepare, not to conclude. They are most valuable before and between rounds of real research. Use them to generate hypotheses, build contextual understanding, and sharpen your research questions.
  3. Be deeply sceptical of positive feedback from synthetic users. Sycophancy is well documented. If your synthetic users love everything, that is a property of the model, not a property of your design.
  4. Feed them real data. If you are going to use synthetic users, enrich them with your own proprietary research data. Generic synthetic users based on demographics alone produce the weakest results. The richer the input, the more useful the output.
  5. Never use them for final decisions. Directional input, yes. Design validation or go/no-go decisions, absolutely not.

The Bottom Line

AI is not going to replace user research. But it is going to make the gap between "no research" and "some research" much easier to cross. And for most organisations, that gap is where the biggest problems live.

Twenty-five years ago, Nielsen showed us that five users and a willingness to iterate were enough to transform a product's usability. The tools have changed. The principle has not. Start with what you can do. Use AI to get smarter about what you need to learn. Then go and learn it from the people who actually matter: your users.
