AI Video Breakthrough? Google’s Lumiere Promises Consistency

Google may be running late in the AI game, but don’t count them out just yet. Some of the products they’re working on could give the competition reason to worry.

The company has just announced a new AI video generation model, dubbed Lumiere. Google claims the model produces highly realistic video while keeping motion temporally consistent across the whole clip.

This is achieved through “a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, through a single pass in the model.”

Other video models “synthesize distant keyframes followed by temporal super-resolution — an approach that inherently makes global temporal consistency difficult to achieve.”
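To make the contrast in those two quotes concrete, here is a toy sketch in Python. It is only an analogy, not Lumiere's code or any real video model: the 1-D "motion" signal, the function names, and the use of linear interpolation as a stand-in for learned temporal super-resolution are all assumptions made for illustration. It shows how sparse keyframes can miss motion that happens between them, while a method that produces every frame together has nothing to fill in.

```python
import math

def true_motion(t):
    # Hypothetical 1-D "object position" at frame t: one full swing every 8 frames.
    return math.sin(2 * math.pi * t / 8)

def keyframes_then_interpolate(num_frames, stride):
    """Cascaded style (toy): keep every `stride`-th frame as a keyframe,
    then fill the gaps by linear interpolation between neighbouring keyframes."""
    frames = []
    for i in range(num_frames):
        lo = (i // stride) * stride           # keyframe at or before frame i
        hi = min(lo + stride, num_frames - 1) # next keyframe (clamped to the last frame)
        if hi == lo:
            frames.append(true_motion(lo))
            continue
        w = (i - lo) / (hi - lo)
        frames.append((1 - w) * true_motion(lo) + w * true_motion(hi))
    return frames

def all_frames_at_once(num_frames):
    """Single-pass style (toy): every frame of the clip is produced together."""
    return [true_motion(i) for i in range(num_frames)]

def max_error(a, b):
    # Largest per-frame difference between two clips.
    return max(abs(x - y) for x, y in zip(a, b))

if __name__ == "__main__":
    full = all_frames_at_once(17)
    sparse = keyframes_then_interpolate(17, stride=8)
    print("max error with keyframes 8 frames apart:", max_error(sparse, full))
```

With keyframes 8 frames apart, every keyframe lands where the signal is near zero, so the interpolated clip stays almost flat and misses the swing entirely. Real cascaded models use learned super-resolution rather than straight lines, so the gap is far less extreme in practice, but the underlying difficulty is the same kind.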

Google Lumiere Capabilities

Google’s Lumiere is a diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion.

It supports text-to-video, image-to-video, video inpainting and stylized generation (using a reference image to generate videos in the same style).

From the examples provided, Lumiere also appears to handle cinemagraphs well, i.e. animating only a specific region of an otherwise still image.