ChatGPT can now “see, hear and speak”

OpenAI has released a significant update to ChatGPT that allows the chatbot to hold voice conversations with users and interact with them through images.

Oliver Thansan
Tuesday, 26 September 2023, 11:25

OpenAI has released a significant update to ChatGPT that allows the chatbot to hold voice conversations with users and interact with them through images. In other words, with this new version the artificial intelligence (AI) application "can now see, hear and speak", according to the company's announcement on its blog. The update brings ChatGPT functions already offered by other well-known assistants on the market, such as Apple's Siri and Amazon's Alexa.

This update comes amid growing competition in the generative AI market. Google is preparing to launch its own answer to ChatGPT, called Gemini, in the coming weeks, while Amazon announced an investment of up to $4 billion in the AI startup Anthropic a few days ago. In this context, OpenAI wants to keep the initiative in a market it has led since last year's launch of ChatGPT, which millions of people around the world already use for a wide variety of tasks, from summarizing documents to writing source code.

The new voice feature will allow ChatGPT to narrate stories, settle debates, and read users' text input aloud. "We collaborate with professional voice actors to create each voice. We also use Whisper, our open source speech recognition system, to transcribe your spoken words into text," the company notes on its blog. Spotify will use the same technology to clone podcasters' voices and translate podcast content into different languages.

The addition of image support also allows users to take photos of their surroundings and ask ChatGPT to resolve issues or provide feedback about what they see. This feature will compete with services like Alphabet's Google Lens.

Artificial intelligence experts believe that combining multiple modalities of information is essential to creating more advanced, human-like AI. However, the collection of voice and image data raises concerns about privacy and about how technology companies use that data. OpenAI says that user data will be deleted after 30 days if not saved, but there is still debate over how this data will be used in the future.

The new ChatGPT features will roll out to Plus and Enterprise subscribers over the next two weeks. "We are excited to roll out these capabilities to other user groups, including developers, shortly thereafter," says OpenAI.