Adobe Launches Firefly Video Model, Officially Enters the Generative AI Video Space

2024-10-15

Adobe officially introduced its Firefly video model at the annual MAX conference, marking its entry into generative AI video creation. First previewed earlier this year, the model is now publicly available through a set of new tools, including features integrated directly into Premiere Pro that let creators extend video clips or generate videos from still images and text prompts.

Within Premiere Pro, Adobe launched a beta tool named "Generative Extend." It lets users lengthen the beginning or end of clips that run slightly short, or smooth over problems mid-shot, such as a shifted eye-line or unwanted movement. Video extensions are capped at two seconds, which makes the tool best suited to minor fixes that would otherwise force a reshoot over a small issue. Generated extensions are output at 720p or 1080p at 24 frames per second. The tool can also be applied to audio to smooth edits, though only for sound effects and ambient room tone, which can be extended by up to ten seconds; it is not suitable for dialogue or music.
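To make the announced limits concrete, the sketch below encodes them as a simple validation helper. This is purely illustrative: the function and constant names are hypothetical, not part of any Adobe API, and only the numbers (two-second video cap, ten-second audio cap, 720p or 1080p at 24 fps) come from the announcement.

```python
# Hypothetical helper encoding the Generative Extend limits Adobe announced.
# Not Adobe's API; the names and structure here are illustrative only.

MAX_VIDEO_EXTENSION_S = 2.0      # video extensions capped at two seconds
MAX_AUDIO_EXTENSION_S = 10.0     # sound effects / room tone capped at ten seconds
SUPPORTED_RESOLUTIONS = {"720p", "1080p"}
FRAME_RATE_FPS = 24

def validate_extend_request(kind: str, seconds: float, resolution: str = "1080p") -> None:
    """Raise ValueError if a request exceeds the announced beta limits."""
    if kind == "video":
        if seconds > MAX_VIDEO_EXTENSION_S:
            raise ValueError(f"video extensions are limited to {MAX_VIDEO_EXTENSION_S}s")
        if resolution not in SUPPORTED_RESOLUTIONS:
            raise ValueError("output is 720p or 1080p only")
    elif kind == "audio":
        # Per the announcement, audio extension covers sound effects and
        # ambient room tone only, not dialogue or music.
        if seconds > MAX_AUDIO_EXTENSION_S:
            raise ValueError(f"audio extensions are limited to {MAX_AUDIO_EXTENSION_S}s")
    else:
        raise ValueError("kind must be 'video' or 'audio'")

validate_extend_request("video", 1.5, "1080p")  # fine: within the two-second cap
```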

Adobe also launched two additional video generation tools on its web platform: Text-to-Video and Image-to-Video. These tools, initially announced in September, are now available as a limited public beta through the Firefly web application.

The Text-to-Video feature works like other video generators: users type a description and the model produces a clip. It can emulate a range of styles, including realistic film, 3D animation, and stop motion, and generated clips can be further refined with a "Camera Control" option that simulates camera angles, movements, and shooting distances.
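For a sense of what such a request involves, here is a minimal sketch of a text-to-video call against a generic REST endpoint. The URL, field names, header, and response shape are hypothetical placeholders, not Adobe's published Firefly interface; only the style options, camera-control concepts, and limits are taken from the announcement.

```python
import requests  # third-party HTTP client: pip install requests

# Hypothetical endpoint and payload; Adobe has not published this interface.
API_URL = "https://api.example.com/v1/text-to-video"

payload = {
    "prompt": "a stop-motion papercraft fox walking through a snowy forest",
    "style": "stop_motion",          # e.g. realistic_film, 3d_animation, stop_motion
    "camera_control": {              # mirrors the announced Camera Control option
        "angle": "low",
        "movement": "slow_pan_right",
        "shot_distance": "wide",
    },
    "duration_seconds": 5,           # current cap per the announcement
    "resolution": "720p",
    "fps": 24,
}

response = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": "Bearer <token>"},  # placeholder credential
    timeout=120,  # generation reportedly takes about 90 seconds
)
response.raise_for_status()
print(response.json()["video_url"])  # hypothetical response field
```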

The Image-to-Video feature offers finer control by letting users supply a reference image alongside a text prompt. Adobe recommends using it to create B-roll footage from existing images, or to upload a still frame from an existing video to help visualize a reshoot. The examples shown suggest it cannot fully replace reshoots, however, as outputs may contain errors such as wobbling cables and shifting backgrounds.

Currently, the Text-to-Video and Image-to-Video tools generate clips up to five seconds long, at a maximum resolution of 720p and 24 frames per second. By comparison, OpenAI's Sora can reportedly produce videos up to a minute long while maintaining visual quality and prompt adherence, though Sora has not been released to the public despite being announced before Adobe's tools.

All three tools (Text-to-Video, Image-to-Video, and Generative Extend) take roughly 90 seconds to generate output, though Adobe is developing a "Turbo Mode" to shorten that. Despite these limitations, Adobe asserts that the tools powered by its AI video model are "commercially safe" because the model is trained only on licensed content. That may appeal to some users, especially since models from other providers, such as Runway, have faced scrutiny over allegations that they were trained on thousands of YouTube videos.

Additionally, videos created or edited with Adobe's Firefly video model can carry embedded content credentials, which disclose AI use and ownership when the work is published online. It is currently unclear when these tools will leave beta, but they are already publicly accessible, unlike rivals such as OpenAI's Sora, Meta's Movie Gen, and Google's Veo, none of which has been released to the public.
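Content credentials follow the C2PA standard, so the embedded disclosure can be inspected with open tooling. Below is a minimal sketch wrapping the open-source c2patool CLI from the Content Authenticity Initiative; it assumes c2patool is installed and on your PATH, and whether a given Firefly beta export ships with a manifest readable this way is an assumption here, not something Adobe's announcement specifies.

```python
import json
import subprocess

# Sketch: inspect a file's C2PA content credentials using the open-source
# c2patool CLI (https://github.com/contentauth/c2patool). Assumes the tool
# is installed and on PATH, and that the file carries a C2PA manifest.
# A Firefly-generated clip would disclose AI use in this metadata.
result = subprocess.run(
    ["c2patool", "video.mp4"],  # prints the manifest store as JSON
    capture_output=True,
    text=True,
    check=True,
)
manifest = json.loads(result.stdout)
print(json.dumps(manifest, indent=2))
```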

For enterprise users, Adobe also announced AI-driven voiceover and lip-syncing features, as well as batch editing tools designed for managing large volumes of content.

The Firefly video model is currently in a limited public beta and is free to use. Adobe has yet to announce pricing details.