May 13, 2024 · Under Attack

GPT-4o makes multimodal assistance mainstream

Threat

Single-purpose OCR, translation, visual Q&A, and voice-assistant apps lose ground as multimodal help becomes free and mainstream.

What Changed

Text, image, and voice interaction converge inside one widely used assistant.

GPT-4o changed accessibility and distribution: OpenAI offered it to free ChatGPT users, with usage limits. That moved multimodal intelligence closer to default user behavior and put narrow visual and voice utilities under pressure.

Categories Hit

OCR and image Q&A apps: Users can ask a general assistant to read, translate, and explain images without opening a specialized utility.
Voice translation helpers: Real-time voice and visual reasoning compress workflows that used to require separate voice or translation apps.

Brands / Services Hit

Google Lens (lens.google)
Duolingo Max (duolingo.com)
Otter.ai (otter.ai)

Sources

OpenAI GPT-4o announcement