OpenAI’s New Voice Agent Attracts Devs with More Natural Sound, MCP, and Price Cut

Cresta News Desk
Published September 7, 2025

Key Points

  • OpenAI released its most advanced speech-to-speech model, gpt-realtime, making its Realtime API generally available for developers.

  • The Realtime API now supports SIP, image processing, and MCP, enhancing enterprise use cases for voice-enabled applications.

  • OpenAI's unified model reduces latency, improving the naturalness of AI voice interactions.

  • OpenAI cuts the price of its new voice model by 20% to attract more developers.

  • The competitive voice AI market raises questions about consumer preferences for human-like AI interactions.

OpenAI is pushing to make AI voice agents feel less robotic by releasing its most advanced speech-to-speech model, gpt-realtime, and moving its Realtime API into general availability. The move equips developers with a more powerful and production-ready toolkit to build voice-enabled applications.

  • More tools in the box: The Realtime API now has a new set of features designed for enterprise use cases. It supports SIP to connect agents to phone networks, can process image inputs to see what a user is seeing, and adds MCP support to simplify how agents connect to external tools and data.

  • Killing the lag: The core of the update is a unified model that handles a conversation from start to finish, rather than chaining separate systems for transcription and speech synthesis. Removing those handoffs cuts the latency that often makes talking with AI agents feel stilted and unnatural.

  • A friendlier conversation: To get the model ready for real-world tasks, OpenAI worked with customers in fields like education and customer support while training it. Zillow’s Head of AI, Josh Weisberg, who got early access, said the model shows “stronger reasoning and more natural speech,” adding that the technology “could make searching for a home... feel as natural as a conversation with a friend.” To get more developers on board, OpenAI is also cutting the price for the new voice model by 20%.
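
For readers who want to see why a unified model can cut lag, here is a minimal sketch of the reasoning. In a chained pipeline the stages run one after another, so their delays add up; a single speech-to-speech model pays only one stage's delay. The stage names and every millisecond figure below are hypothetical placeholders chosen for illustration, not measurements of any OpenAI system.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_ms: float  # hypothetical placeholder, not a measured value


def total_latency_ms(stages):
    """Stages run sequentially, so their delays are summed."""
    return sum(stage.latency_ms for stage in stages)


# Hypothetical chained pipeline: transcribe, generate a reply, synthesize speech.
chained = [
    Stage("speech_to_text", 300.0),
    Stage("text_model", 500.0),
    Stage("text_to_speech", 200.0),
]

# Hypothetical unified model: one stage takes audio in and produces audio out.
unified = [Stage("speech_to_speech", 500.0)]

chained_ms = total_latency_ms(chained)
unified_ms = total_latency_ms(unified)

print(f"chained pipeline: {chained_ms:.0f} ms")
print(f"unified model:    {unified_ms:.0f} ms")
```

The sketch shows only the structural point: every handoff between systems adds its own delay, so removing handoffs removes that overhead regardless of the actual numbers.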

By making its most capable voice AI cheaper and more accessible, OpenAI is escalating the race to build voice agents that can finally move beyond simple commands and into complex, real-world conversations. The push comes as the voice AI market becomes increasingly competitive, with specialized players also vying for enterprise adoption. The drive for more natural-sounding agents raises the question of whether consumers actually want AI to sound human in the first place. Meanwhile, businesses are exploring AI voice to solve expensive operational problems, like the high cost of human agent turnover in contact centers, where attrition can run as high as 45% annually.