OpenAI unveils voice intelligence capabilities in its API

OpenAI announced on Thursday that its API now includes several new voice intelligence features, designed to help developers build apps capable of speaking, transcribing, and translating conversations.
The company's new GPT‑Realtime‑2 is a voice model built to produce realistic speech and hold a conversation with users. Unlike its predecessor, GPT‑Realtime‑1.5, it incorporates GPT‑5‑class reasoning, which OpenAI says is designed to handle more complex user requests.
The company is also releasing GPT‑Realtime‑Translate, which, as the name suggests, offers real‑time translation that keeps pace with the speaker during a conversation. It supports more than 70 input languages (those it can understand) and 13 output languages (those it can speak back).
Finally, the company has introduced GPT‑Realtime‑Whisper, a live speech‑to‑text model that transcribes words as a conversation happens.
“Together, the models we are launching move real‑time audio from simple call‑and‑response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” the company said.
Who will benefit from these updates? Companies looking to expand their customer service capabilities are an obvious audience. OpenAI says the new features also apply to a wide range of other areas, including education, media, events, and creator platforms.
As useful as these tools may be from an enterprise perspective, there is also potential for misuse. The company says it has built guardrails to keep its new features from being used for spam, fraud, or other forms of online abuse. Triggers are built into the system so that “conversations can be halted if they are detected as violating our harmful content guidelines,” according to OpenAI.
All the new voice models are available through OpenAI’s Realtime API. GPT‑Realtime‑Translate and GPT‑Realtime‑Whisper are billed per minute, while GPT‑Realtime‑2 is billed based on token consumption.







