Investing in AI Voice Agents: Where Do We Go From Here?
In the past year, there’s been a notable influx of AI application-layer companies building with voice as the primary modality. This is driven in part by the rapid improvement of voice technology at the infrastructure layer, including the development of conversational models that work with audio directly, as opposed to the earlier speech-to-text approach in which models transcribed speech before generating a response. By responding to audio directly rather than transcribing it first, models can better pick up on pronunciation, tone, and other contextual cues that help them interpret the conversation and respond in a human-like way. The conversational quality of voice agents is now considered on par with, or even better than, that of call centers, and it will only get better.
In addition to technological tailwinds, strong behavioral tailwinds are driving the adoption of voice-first technology. Phone-based communication is still one of the most widely used and information-dense forms of communication across industries. It’s heavily relied on in both the back office and the front office to acquire new customers, work with existing customers, manage the workforce, order and track inventory, and more. For example, in the home services sector, it’s reported that 62% of customers called at least once during their purchasing journey (Invoca). Voice agents have the opportunity to disrupt entrenched work processes and replace labor-intensive manual workflows that software has yet to touch.
As voice technology improves and high-value use cases emerge, startups have begun to flood the market with specialized voice agents. Many are taking a vertical approach by building for one industry, such as Toma for automotive dealerships, Liberate for insurance providers, Sameday for home services, or Domu for financial services. Training agents on industry-specific data and use cases allows vertical voice AI startups to build more effective products that are tailored to the nuances of the industry and customers they serve.
Many startups are using voice agents as a “wedge product,” automating one key workflow, such as outbound sales or customer service, that provides immediate ROI and builds trust with the customer. The broader goal with a wedge product is to gradually build upon the relationship and take over more of the customer’s tech stack, eventually becoming a fully integrated system of record. This roadmap is great in theory, but execution involves building deep, trusted relationships with customers and consistently adding material value to the business. That is not so easy in reality, especially at a time when the barrier to building a useful wedge product has fallen significantly.
On top of the steady flow of new voice AI startups, incumbents like Google, Amazon, and OpenAI are also introducing voice applications to their networks of users. Despite tough competition, technological tailwinds and a diverse range of high-value wedge use cases still make voice AI a compelling investment opportunity. So as investors, which teams should we back? What is the most effective wedge product, and how do we know which companies will successfully expand their product offering into a lasting, venture-scale system of record?
Lacking consistent or holistic data on revenue traction or customer distribution, we took a deeper look at companies building voice agents that have raised a Series A or beyond, aiming to identify the attributes and company characteristics that could lead to durable, long-term success for voice agent startups. We looked at a cohort of 45 venture-backed companies that have raised financing rounds in the last five years and offer voice-based intelligence as a core or initial product. A few key differentiators stood out to us:
Team
71% of founding CEOs have a technical background with experience as software engineers, and many have a specific focus on AI. This doesn’t come as a surprise; with innovation moving so rapidly, it’s important to back a team with deep technical expertise that can stay flexible and continuously integrate new capabilities.
More surprising was that only 29% of companies had at least one co-founder with domain expertise in the industry for which they’re building. Given that the majority of companies in the cohort have a vertical-specific focus, we expected to see more evidence of industry expertise. While these results seem counterintuitive and warrant a deeper dive into how the vertical software founder profile may be evolving, at Interplay, we still believe domain expertise is an important part of a founder’s background. As the team at Euclid Ventures found in a past analysis, vertical software companies that have exited in recent years were disproportionately run by founders with deep industry experience.
Business Model
75% of the 45 companies are building front-office, customer-facing products as their initial wedge. Many of the more established companies have begun to diversify their product offering and now also offer back-office integrations, but the core value proposition is still geared toward customer acquisition or management. This trend reflects the difference between “revenue-driving” applications and those that are primarily cost-saving. Although increasing internal efficiency is critical to all businesses, it can be easier to sell customers on software that directly impacts top-line growth. Both may drive material value, but the optics matter for effective customer acquisition, especially when competition in the market is steep.
Industry or Use Case Focus
Voice AI has been particularly successful in industries where phone communication is the primary mode of interaction both internally and externally, especially in sectors with established call center operations. Voice agents can also have outsized value in industries with a high concentration of SMBs. As Bessemer Venture Partners cited in a recent piece on voice AI, SMBs miss 62% of their calls on average, losing critical opportunities for new customer acquisition or existing customer support. Industries where voice agents have gained quick traction include restaurants (Slang, ConverseNow, Hi Auto), healthcare (Hyro, Infinitus, Hippocratic, Suki), and recruiting (ConverzAI, Maki).
For voice agent use cases, it’s beneficial when calls follow a similar pattern in length, format, and outcome. Clear, consistent data trains the agents more effectively and makes results easier to measure. Use cases where voice agents have gained quick traction include customer support (Decagon, Sierra, Parloa, Cresta) and sales (Artisan, 11x).
Product Differentiation
Voice agents offer an alternative method of interacting with software. Unlike with traditional manual software, the user is not clicking around on a screen but is instead interacting through conversation and voice commands. Because of this shift, attention to user experience is critical to building trust and driving retention with customers. Effective ways we’ve seen voice startups improve UX include:
- Human-like interactions: 11x is developing an AI-driven workforce for sales and customer engagement. 11x emphasizes the human-like experience that its agents provide. One example is Julian, a sales representative who has personalized conversations and can reason, predict, and act in real time. Humanizing the agents makes customers more comfortable using and trusting the product.
- Multi-modal functionality: Regal is building sales agents that meet customers wherever they feel most comfortable communicating. Custom agents can be deployed across any channel, including voice, SMS, and email. Providing this optionality makes customers more willing to initially adopt new software.
- Integration with existing workflows: HappyRobot offers AI communication tools for logistics, automating inbound and outbound calls across operations. HappyRobot integrates with existing platforms, such as a company’s transportation management system and electronic logging devices, to access real-time data and ensure communication is accurate and efficient. Extensive integration improves agent training and intelligence, which in turn improves the user experience.
Dollars raised and company valuations are not a sure indicator of a business’s long-term success; however, we believe there is still value in examining this cohort and that the trends discussed in this piece can improve our ability to invest in this category in the future.
Overall, we’re excited to watch as voice AI continues to evolve and be deployed across industries. Sectors we’re especially excited about, and where we believe voice AI will heavily impact business productivity, include home services, financial services, government, and healthcare. If you are building a startup that leverages AI voice agents, we’d love to chat! Please feel free to reach out to our team at ce@interplay.vc.