• AI Odyssey
  • Posts
  • Text to video is here, Hollywood is dead

Text to video is here, Hollywood is dead

PLUS: Google's massive Gemini breakthrough

Hey AI Enthusiasts,

This week was wild. There is always some healthy competition in the AI space but after Google’s Gemini launch, OpenAI is all over the news with their exciting new development that will leave you saying "wow!"

Let’s get into it.

Whip up stunning, one-minute videos based on simple text prompts.

OpenAI introduced Sora, a text-to-video model.

Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions.      

Just take a look at this:

OpenAI is pushing the boundaries of what’s real.

Sora's capabilities hold immense promise for various practical applications:

  • Effortless B-roll and stock footage: Imagine having access to a vast library of customizable, high-quality video clips generated on demand, eliminating the need for expensive shoots or stock footage purchases.

  • AI-powered video ads: Create engaging and personalized video ads tailored to specific audiences, with Sora streamlining the production process.

  • Seamless video editing: Imagine manipulating video styles, and environments, and even merging clips seamlessly, all within an AI-powered editing suite.

Beyond text-to-video: 

Sora's versatility doesn't stop at generating videos from scratch. It can also:

  • Animate still images: Breathe life into static photos, transforming them into dynamic video sequences.

  • Extend existing videos: Seamlessly add missing frames or lengthen existing videos without compromising quality.

  • Edit video styles and environments: Modify the visual aesthetics of videos, changing their mood and setting with ease.

  • Merge videos: Combine two separate video clips into a cohesive whole, opening doors for creative storytelling and editing possibilities.

The rapid advancements in AI, Sora emerging just over a year after ChatGPT's debut, leave us wondering: what groundbreaking innovations await us in 2024?

Here’s the reaction of other AI founders:

  • Runway CEO Cristóbal Valenzuela posted two words on X this afternoon in response to today’s OpenAI surprise demo drop of its new Sora video AI model: “game on.”

  • Stability AI CEO Emad Mostaque responded positively to the Sora news, calling OpenAI CEO Sam Altman a “wizard.”

Here’s everything technical you can read about Sora.

Google just revealed an upgraded Gemini 1.5 model 

What's new?

  • Enhanced performance: Gemini 1.5 tackles tasks faster and more efficiently thanks to a groundbreaking Mixture-of-Experts (MoE) architecture.

  • Context: The model can now process up to 1 million tokens of information, allowing it to understand complex topics and reason across vast amounts of data. This means it can analyze things like:

    • 1 hour of video

    • 11 hours of audio

    • Codebases with over 30,000 lines of code

    • Over 700,000 words of text

  • Multimodal understanding: Gemini 1.5 can not only understand text but also reason across different formats like video and code, making it incredibly versatile.

  • Improved problem-solving: The model can now tackle complex problems in areas like code analysis, offering helpful solutions and explanations.

Currently, the full 1M token version is only available to approved developers and enterprise customers.

AI roundup:

AI art to sip your coffee with…

Prompt: mid-century modern shiny chrome toaster with a cute face screaming in terror on the surface of Mars

-Midjourney

AI deep dives: Food for thought

Still hungry for more? Here are some curated AI deep dives for you:

Stay tuned for more. We’ll see you this Thursday..👋