OpenAI has announced GPT-4o, its newest flagship model, which reasons across text, audio, and vision in real time, much the way people do.
GPT-4o is a step toward much more natural human-computer interaction. It accepts any combination of text, audio, and image input and can generate any combination of text, audio, and image output. Its average audio response time is about 320 milliseconds, comparable to human response time in a conversation.
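As a rough illustration of mixed-modality input, here is a minimal sketch using the OpenAI Python SDK; the prompt, image URL, and response handling are illustrative assumptions, not taken from the announcement.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask GPT-4o to reason over text and an image in a single request.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is unusual about this picture?"},
                {
                    "type": "image_url",
                    # Hypothetical image location, used here only for illustration.
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```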
It matches GPT-4 Turbo performance on English text and code, with significant improvement on text in non-English languages. It is also faster and 50% cheaper in the API, and it is especially strong at vision and audio understanding.
Before GPT-4o, you could already talk to ChatGPT through Voice Mode, but with average latencies of around 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4). That Voice Mode was a pipeline of three separate models: one transcribes audio to text, a second (GPT-3.5 or GPT-4) takes text in and puts text out, and a third converts that text back to audio. Along the way, important information was lost: the main model could not directly perceive tone, multiple speakers, or background sounds, and it could not output laughter, singing, or emotion.
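A minimal sketch of that legacy three-model pipeline, assuming the OpenAI Python SDK with Whisper for transcription and a separate text-to-speech model as stand-ins (the model names and file paths are illustrative, not the exact models ChatGPT used internally):

```python
from openai import OpenAI

client = OpenAI()

# 1) Speech-to-text: transcribe the user's audio question.
with open("question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2) Text-to-text: the language model only ever sees plain text,
#    so tone, multiple speakers, and background noise are lost here.
chat = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
)
answer = chat.choices[0].message.content

# 3) Text-to-speech: synthesize the reply, again from text alone,
#    so the output cannot carry laughter, singing, or emotion.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=answer,
)
with open("answer.mp3", "wb") as f:
    f.write(speech.read())
```

GPT-4o collapses these three stages into one model, which is what removes both the latency and the information loss described above.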
With GPT-4o, by contrast, a single new model was trained end to end across text, vision, and audio, so every input and output is processed by the same neural network. Because GPT-4o is the first model to combine all of these modalities, we are still only scratching the surface of what it can do.
On traditional benchmarks, GPT-4o matches GPT-4 Turbo-level performance on text, reasoning, and coding intelligence, while setting new highs on multilingual, audio, and vision capabilities.
Twenty languages were chosen for testing, representative of the new tokenizer's compression across different language families.
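To see the tokenizer difference yourself, here is a small sketch assuming the tiktoken library and that GPT-4o uses the newer o200k_base encoding; the sample sentence is arbitrary and the exact token counts will vary by language and text.

```python
import tiktoken

# Older encoding used by GPT-4 / GPT-4 Turbo vs. the newer one used by GPT-4o.
old_enc = tiktoken.get_encoding("cl100k_base")
new_enc = tiktoken.get_encoding("o200k_base")

# A non-English sample sentence (Hindi): "Hello, how are you?"
text = "नमस्ते, आप कैसे हैं?"

print("cl100k_base tokens:", len(old_enc.encode(text)))
print("o200k_base tokens: ", len(new_enc.encode(text)))
```

Fewer tokens for the same sentence means cheaper and faster processing for that language, which is the point of the compression comparison.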
On the safety side, GPT-4o has safety built in by design across modalities, through techniques such as filtering training data and refining the model's behavior via post-training. New safety systems were also created to provide guardrails on voice outputs.
In short, GPT-4o pairs deeper multimodal understanding with real-time responsiveness, making it far more practical to use.