Movie Gen – the future of AI video generation


Meta, the parent company of Facebook and Instagram, has introduced a groundbreaking artificial intelligence model called Movie Gen, designed to significantly advance video creation. The new AI-powered video generator can produce high-definition videos complete with sound from nothing more than text prompts. The announcement marks Meta’s latest foray into generative AI and places the company in direct competition with industry giants like OpenAI and Google.

At its core, Movie Gen lets users create entirely new video clips from simple text prompts such as “A sloth with pink sunglasses lays on a donut float in a pool”. The model represents a significant leap forward in video generation, pushing the boundaries of creativity for filmmakers, content creators, and enthusiasts alike. Videos can be produced in various aspect ratios and can last up to 16 seconds, making them suitable for a wide range of uses, from social media posts to short film clips. The technology builds on Meta’s previous work in video synthesis, such as the Make-A-Video generator and the Emu image-synthesis model.

In addition to creating new videos from scratch, Movie Gen offers advanced editing capabilities. Users can upload existing videos or images and modify them with simple text commands. For example, a still image of a person can be transformed into a moving video in which the person performs actions described in the prompt. The ability to customize existing footage doesn’t stop there: users can change specific details such as the background, objects, and even costumes. These changes, all executed via text prompts, showcase the precision and versatility of Movie Gen’s editing functions.

But what truly sets Movie Gen apart from its competitors is the integration of high-quality audio generation. The AI can create soundtracks, sound effects, and ambient noises that synchronize with the visuals of the generated video. Users can provide text prompts for specific audio cues, like “rustling leaves” or “footsteps on gravel,” and Movie Gen will incorporate those sounds into the scene. The model can generate up to 45 seconds of audio, ensuring that even short films or detailed clips are accompanied by dynamic soundscapes. Meta AI also mentioned that the model includes an audio extension technique, allowing seamless looping of audio for longer videos.

The unveiling of Movie Gen comes at a time when other major players in the AI industry are developing similar tools. OpenAI announced its text-to-video model Sora earlier this year, but the model has yet to be publicly released. Runway, meanwhile, recently introduced its latest generative AI platform, Gen-3 Alpha.

However, Movie Gen stands out due to its ability to perform multiple tasks: generating new video content, editing existing clips, and incorporating personalized elements, all while maintaining the original video’s integrity. According to Meta AI, in blind tests, Movie Gen has outperformed competing models in both video and audio generation.

Despite the excitement surrounding Movie Gen, Meta has stated that the tool is not yet ready for public release. According to the company, the technology is still too expensive to operate efficiently, and the generation time is longer than desired. These technical limitations mean that Movie Gen will remain in development for the time being, with no set timeline for when it will be made available to developers or the general public.