OpenAI said Thursday that its API will now include several new voice intelligence features designed to help developers create apps that can talk, transcribe, and translate conversations with users.
The company’s new GPT-Realtime-2 is another voice model, built to create a realistic vocal simulation that can converse with users. However, unlike its predecessor (GPT-Realtime-1.5), this one is built with GPT-5-class reasoning that OpenAI says was created to deal with more complicated requests from users.
The company is also launching GPT-Realtime-Translate, which, just as it sounds, is designed to provide real-time translation services that “keep pace” with the user, conversationally. The feature supports more than 70 input languages (that is, the languages it can comprehend) and 13 output languages (the languages it relays to the speaker).
Finally, the company has also released a new transcription capability, GPT-Realtime-Whisper, which gives users live speech-to-text output captured as interactions occur.
“Together, the models we’re launching move real-time audio from simple call-and-response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” the company said.
Who will these updates be good for? Companies that want to expand customer service capabilities are an obvious target. However, OpenAI also notes that its new features can help in a wide array of areas, including education, media, events, and creator platforms, among others.
As useful as these tools seem from an enterprise perspective, it also seems plausible that they could be misused. The company said it has built guardrails to stop its new features from being abused to create spam, fraud, or other forms of online abuse. Certain triggers have been embedded in the system so that “conversations can be halted if they are detected as violating our harmful content guidelines,” OpenAI said.
All of the new voice models are included in OpenAI’s Realtime API. Translate and Whisper are billed by the minute, while GPT-Realtime-2 is billed by token consumption.
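For developers budgeting around these two billing schemes, the difference can be sketched roughly as follows. This is a minimal illustration only: the rates are placeholder values and not OpenAI’s published prices, the function names are hypothetical, and it assumes per-minute usage is prorated rather than rounded up, which the announcement does not specify.

```python
# Placeholder rates for illustration only; these are NOT OpenAI's actual prices.
HYPOTHETICAL_RATE_PER_MINUTE_USD = 0.01      # assumed per-minute rate (Translate, Whisper)
HYPOTHETICAL_RATE_PER_1K_TOKENS_USD = 0.02   # assumed per-1,000-token rate (GPT-Realtime-2)


def per_minute_cost(audio_seconds: float, rate_per_minute: float) -> float:
    """Cost of a per-minute billed model, assuming prorated (not rounded-up) minutes."""
    return (audio_seconds / 60.0) * rate_per_minute


def per_token_cost(tokens_used: int, rate_per_1k_tokens: float) -> float:
    """Cost of a token-billed model, given total tokens consumed in a session."""
    return (tokens_used / 1000.0) * rate_per_1k_tokens


if __name__ == "__main__":
    # A 90-second translated call versus a session that consumed 2,500 tokens.
    minute_billed = per_minute_cost(90, HYPOTHETICAL_RATE_PER_MINUTE_USD)
    token_billed = per_token_cost(2500, HYPOTHETICAL_RATE_PER_1K_TOKENS_USD)
    print(f"per-minute billed: ${minute_billed:.4f}")
    print(f"per-token billed:  ${token_billed:.4f}")
```

The practical difference is that per-minute costs scale with call duration alone, while token-based costs depend on how much audio and text the model actually processes and generates in a session.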
