'Her', closer: OpenAI launches an AI that interacts with voice like a human

We are getting closer to an AI that behaves with the ease of the one in the film Her, which is available on Filmin and Prime Video.

Oliver Thansan
Tuesday, 14 May 2024, 04:38

Last night OpenAI presented a new language model, GPT-4o (the “o” stands for “omni”), that takes image and voice input and responds almost instantly, at the speed of a human, in a way that seems hard to distinguish from talking to a person.

OpenAI describes the new model as “a step forward towards a much more natural interaction between human beings and computers.” GPT-4o accepts any combination of text, audio and image as input, and generates any combination of text, audio and image in response. The most surprising thing is its speed. It can respond to spoken input in as little as 232 milliseconds, about as fast as a person, so conversations flow naturally.

Through the ChatGPT app, the AI can see an image from the phone's camera or a screenshot, and hear the user's voice through the microphone. The same works with a computer screen: you can show it, for example, a fragment of programming code, and it will tell you out loud where the errors are. GPT-4o can act as a simultaneous interpreter in 50 languages, and it is capable of using different tones of voice and even singing. OpenAI CTO Mira Murati and two of the company's engineers showed several examples of its capabilities.

One of them showed his face to the camera and asked ChatGPT to try to tell him what emotions he was feeling. “You seem to be feeling quite happy and cheerful, with a big smile and maybe even a touch of excitement. Whatever you're going through, you seem to be in a very good mood. Share the source of those good vibes,” it responded. “The reason I'm in a very good mood,” said the engineer, “is because we are giving a presentation showing how useful you are.” “Oh, stop. You're making me blush,” the AI replied.

OpenAI began rolling out the text and image functions of GPT-4o in ChatGPT yesterday. The model will be available on the free tier, although paid users will have a message limit up to five times higher. The company announced that in the coming weeks it will launch a new preview version of voice mode with GPT-4o in ChatGPT Plus.