Artificial Intelligence (AI) Model That Reasons Across Audio, Vision, and Text in Real Time

OpenAI’s latest model for ChatGPT, GPT-4o (“o” for “omni”), is a step toward much more natural human–computer interaction.

GPT-4o accepts as input any combination of text, audio, image, and video (uploaded screenshots, photos, documents, or charts, for example) and generates any combination of text, audio, and image outputs. It supports real-time spoken conversations in more than 20 languages, can translate in real time, and has a memory capability that lets it learn from previous conversations with users.

OpenAI will launch a ChatGPT desktop app with the GPT-4o capabilities, giving users another platform for interacting with the company’s technology. Additionally, GPT-4o will be available to developers looking to build custom chatbots through OpenAI’s GPT Store, a feature that will also be available to nonpaying users.
