The chasm between viral AI demos and successful real-world apps, and how to cross it
Skeuomorphism, the AI demo trap, and other learnings from building an enterprise-grade AI app
Remember the skeuomorphic design from early iOS versions that mimicked the physical world?
13 years since iOS 6, skeuomorphism is very much back, this time in the form of AI app design that mimics humans. Almost every week there’s a wildly impressive demo that goes viral on Twitter showing an AI app with human-like design doing human-like things.
And yet, most of these viral demos never become successful real-world applications. We learned this the hard way.
Our first version: a skeuomorphic failure
At Alpharun we allow companies to automatically run AI voice interviews with their customers to uncover the drivers of lost deals, user drop-off, and churn. The most important part of our application is the AI voice interview platform; our customers trust us to provide a great interview experience to their users and uncover meaningful insights and feedback in the process.
We started by building an interview experience that resembled a traditional research interview as much as possible; the user would start their interview, and then just respond out loud as an AI-generated voice asked them questions.
This design made for a very impressive demo when everything worked as expected. But we quickly realized we had a problem when we noticed phrases like “test test…is this recording?” appearing at the beginning of interview transcripts. Then we noticed in shadowing sessions that participants were actually a bit nervous as the audio recording started. And on top of that, the AI-generated voice was polarizing: users were impressed, but many found it unnecessary.
We had built a skeuomorphic failure: it loosely resembled a traditional customer interview but didn’t actually provide a high-quality interview experience.
Reimagining the interview experience from the ground up
Confronted with these results, we challenged ourselves to stop mimicking something familiar and instead reimagine the interview experience: preserve the great parts (smart clarifying questions, more detailed answers with voice rather than typing) and solve the not-so-great parts (need to schedule a call, large time commitment, can’t pause and resume, etc.).
This exercise led us to the simple yet highly effective design that powers Alpharun interviews at scale today:
There’s no bizarre human-like avatar, no artificial voice that asks questions out loud, and no attempt to trick the user into thinking they’re talking to a human. Participants answer out loud at their own pace and Alpharun generates dynamic clarifying questions along the way. With this simple user experience (and a million optimizations and polished details behind the scenes), users complete interviews in about 5 minutes and receive an optional gift card for their time.
Almost immediately after releasing this update, user confusion largely disappeared, average response length (one metric we use to assess interview quality) more than doubled, and interview completion rates dramatically improved as well.
We even started to see interview participants reach out to find ways to use Alpharun at their own companies:
I just took an interview with your product and I’m blown away. Amazing…can you please lmk your basic pricing model? Great work on the product.
- An Alpharun Interview Participant
What we learned
Our experience building Alpharun has shown us that demos reward human-like skeuomorphism, but real-world users of your application do not. Many of the things that make a demo flashy and exciting are actually the same things that will hold your application back as you scale:
Before AI, the hardest part about building a startup was finding a problem worth solving and getting people to actually care about your solution. Now there are exciting AI demos solving massive problems everywhere you look, but very few startups have achieved lasting adoption. This is the new chasm that AI startups have to cross:
3 tips for crossing the AI demo chasm
1. Reimagine rather than mimic
There’s a category of AI apps where “human-like” is part of the core value proposition (AI characters, therapy, AI-generated audio books, etc.). But for every other AI-enabled app, embrace the opportunity to build a better AI-first experience from the ground up, rather than mimicking the way it used to work. In B2B software in particular, apps deliver the most value when the AI-first user experience is a fundamental re-imagination of the human-led alternative, rather than a cheaper replica.
If you find yourself spending a lot of time building things that look impressively human-like rather than simple and reliable, it might be a sign you’re building a great demo but not a great app for the real world.
2. More transparency, less magic
It’s tempting to position AI like magic fairy dust throughout your product because it can generate initial excitement from customers and inflate the perceived value of your product. But as customers become increasingly skeptical of AI applications that over-promise, we’ve found that using AI in simple, transparent ways builds trust for the long run.
For example, one of our customers’ favorite features is an AI-generated sentiment score that gives them a quick indication of the satisfaction of a customer they interviewed with Alpharun. Rather than providing the score in isolation, we provide the exact positive and negative customer quotes that contributed to the score:
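To make the idea concrete, here is a minimal sketch of what a "score plus evidence" output shape could look like. All names (`Quote`, `SentimentResult`, the 1–10 scale) are hypothetical illustrations of the pattern, not Alpharun's actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Quote:
    text: str       # verbatim excerpt from the interview transcript
    polarity: str   # "positive" or "negative"

@dataclass
class SentimentResult:
    score: int                              # e.g. a 1-10 satisfaction score
    evidence: list[Quote] = field(default_factory=list)

    def summary(self) -> str:
        # Surface the score together with the quotes behind it,
        # so users can verify the AI's conclusion themselves.
        lines = [f"Sentiment: {self.score}/10"]
        for q in self.evidence:
            sign = "+" if q.polarity == "positive" else "-"
            lines.append(f'  [{sign}] "{q.text}"')
        return "\n".join(lines)

result = SentimentResult(
    score=8,
    evidence=[
        Quote("The onboarding was really smooth.", "positive"),
        Quote("Pricing felt a bit steep for our team size.", "negative"),
    ],
)
print(result.summary())
```

The design choice is that the score is never rendered without its `evidence` list; the evidence is part of the output contract, not an optional extra.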
3. Choose opinionated design over an open-ended chat interface
Possibly because of the meteoric success of ChatGPT, there are a lot of AI products that eschew more structured UX in favor of an open-ended chat interface. And while there are certainly some applications where the open-ended nature of chat is crucial, many apps would benefit from providing their users with more direction.
As flexible and powerful as a chat interface can be, it also has a number of shortcomings:
Language is not that efficient: if you wanted to see drafts in your email client, is it easier to click a button or to type out “Show me emails I haven’t sent yet”?
Language is not very precise: if you needed an accurate report on your annual recurring revenue, would you trust a chat interface to interpret a question like “What is our revenue?” and magically infer the timeframe, revenue categorization, and geographies that you had in mind?
Lack of direction: an empty text box with a blinking cursor puts the burden on users to figure out how they can get value out of your product
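The precision point can be sketched in code. An opinionated interface makes every ambiguous dimension an explicit, typed parameter instead of something inferred from free text. The function and parameter names below are hypothetical, purely for illustration:

```python
from datetime import date

# Opinionated API: timeframe, revenue categorization, and geography
# are all explicit parameters, so nothing is left for an AI to guess.
def revenue_report(start: date, end: date, category: str, region: str) -> str:
    # A real product would query a billing system here; this sketch
    # just echoes back the fully specified request.
    return f"{category} revenue in {region}, {start} to {end}"

report = revenue_report(
    date(2024, 1, 1), date(2024, 12, 31),
    category="recurring", region="EMEA",
)
print(report)

# Compare the chat version: "What is our revenue?" forces the system
# to silently infer all three of those dimensions.
```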
At Alpharun, we’ve tried to design our Insights functionality to immediately provide value, rather than leaving it up to our users to figure it out:
Wrapping up
It’s still early days in AI, and we’re excited to keep building and learning from our customers. I’d love to hear your thoughts and feedback (comment below or email me: paul@alpharun.com), and if you think Alpharun could be useful to your team, book a setup call with me here.