Thursday, Jun 20, 2024

  • Siri: Let’s giggle like petty schoolgirls.
  • Siri: Give me a clever comeback for when someone says I look tired.
  • Siri: What’s the tea today?

Admit it. Sometimes you wish bots like Siri or Alexa gave you more of a give-and-take. Not in the way that they do when they analyze your buying habits or hijack your location settings. But rather, you wish they talked to you with real human depth and emotion, in the same way you wish customer service reps would when your cable is out and you’re just screaming for someone to empathize with your plight.

Well, GPT-4o is here to change all that.

OpenAI’s latest model—unveiled by employees in a live demonstration last month—can read your emotions, change its tone to suit your mood and even sing you a lullaby. Ask it for a bedtime story, and it will whisper soothingly. Need some sassy advice? It can switch to a playful, sarcastic tone. And yes, it can even sing on command.

Created by AI when asked to create an illustration of "Omni"

GPT-4o’s voice feature is now available for free to ChatGPT users. During the live demo, employees showcased its talents, having it read stories in different tones, sing “Happy Birthday” and translate languages in real time. But the real showstopper was its dynamic voice-changing ability, switching seamlessly from a robotic voice to that of a charming conversationalist. Thanks to "native multimodal support," it processes audio prompts directly, making conversations fluid and almost instantaneous.

The O stands for Omni—Latin for "every"—indicating this new model can understand audio, images and text simultaneously and generate responses in all these formats (previously, the ChatGPT interface used separate models for different content types). This leap in technology marks the end of an era of detached, impersonal AI helpers. With its advanced capabilities in natural language processing, real-time translation and emotional detection, GPT-4o is poised to transform industries like manufacturing by improving communication, efficiency and customer interactions.

Sounds like it’s already more humanlike than some humans we know at the other end of our cable bill.

