ElevenLabs Launches AI Tool for Converting Video to Audio Effects


2024-06-19

AI voice startup ElevenLabs recently released Sound Effects, its text-to-sound-effects product. Shortly afterward, the company launched an open-source tool to showcase what the technology can do. The application generates sound effect samples for video creators in "about 15 seconds": it analyzes an imported video clip and offers multiple options to choose from.

The application's source code is available to developers on GitHub, and ElevenLabs has also set up a website where the public can easily try out the Sound Effects API.

When you upload a video, this "video-to-sound effect" application extracts four frames at one-second intervals on the client side. These frames, along with a prompt, are sent to OpenAI's GPT-4o model, which writes a customized text-to-sound-effects prompt. That prompt is then passed to ElevenLabs' Sound Effects API, which generates the corresponding sound effect, up to 22 seconds long. Finally, the video and audio are merged into a single file on the client side, which users can download and use.
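The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's actual code: the function names, the choice to start sampling at t = 0, and the callables `grab_frame`, `describe_frames`, and `generate_sfx` are all hypothetical stand-ins for the frame extraction, the language-model call, and the Sound Effects API call. The sketch also assumes the 22-second limit applies to the generated audio.

```python
from typing import Any, Callable, List, Sequence

# Upper bound quoted in the article; assumed here to cap the generated audio.
MAX_SFX_SECONDS = 22.0


def keyframe_timestamps(video_seconds: float, count: int = 4,
                        interval: float = 1.0) -> List[float]:
    """Return up to `count` timestamps spaced `interval` seconds apart.

    Assumption: sampling starts at t = 0, and timestamps that fall past
    the end of a short clip are dropped.
    """
    if video_seconds <= 0:
        return []
    stamps = [i * interval for i in range(count)]
    return [t for t in stamps if t < video_seconds]


def video_to_sfx(video_seconds: float,
                 grab_frame: Callable[[float], Any],
                 describe_frames: Callable[[Sequence[Any]], str],
                 generate_sfx: Callable[[str, float], Any]) -> Any:
    """Chain the stages: frames -> text prompt -> sound effect.

    All three callables are hypothetical placeholders supplied by the caller.
    """
    frames = [grab_frame(t) for t in keyframe_timestamps(video_seconds)]
    prompt = describe_frames(frames)                 # language-model step
    duration = min(video_seconds, MAX_SFX_SECONDS)   # respect the assumed cap
    return generate_sfx(prompt, duration)            # sound-effect step
```

Passing the stages in as callables keeps the sketch runnable without network access: each one can be replaced with a stub while testing, and with a real client later.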

Ammaar Reshi, the design director of ElevenLabs, said in an interview, "We believe this is a strong validation of our SFX API capabilities. AI video creators often look for the perfect sound effects, and we believe that by understanding the frames in their videos and proposing the best sound output based on them, we can intelligently accelerate their workflow." He also expressed excitement about the new experiences this API could enable, specifically mentioning immersive video games in which sound is generated in real time based on player interaction.

This API allows developers to build fully customized AI sound effects from short text descriptions. ElevenLabs bills usage in characters: a generation costs a flat 100 characters when the duration is chosen automatically, or 25 characters per second of generated audio when the duration is set explicitly.
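The pricing can be worked through with a small calculation. The sketch below assumes the flat 100-character fee applies when the duration is left automatic and the 25-characters-per-second rate applies when a duration is set; the function name and the rounding up of fractional results are illustrative assumptions, not documented billing behavior.

```python
import math
from typing import Optional

FLAT_CHARACTERS = 100        # per generation, automatic duration
CHARACTERS_PER_SECOND = 25   # per second of audio, explicit duration


def sfx_character_cost(duration_seconds: Optional[float] = None) -> int:
    """Estimate the character cost of one sound-effect generation.

    Hypothetical helper: `None` means the duration is left automatic.
    Rounding up fractional characters is an assumption.
    """
    if duration_seconds is None:
        return FLAT_CHARACTERS
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return math.ceil(CHARACTERS_PER_SECOND * duration_seconds)
```

Under these assumptions the two modes cost the same at 4 seconds (25 × 4 = 100). A shorter explicit duration is cheaper than the flat fee, and a 22-second effect would cost 550 characters.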

In a brief test, the video-to-sound effect application proved convenient to use. After a silent clip of a car driving through an off-road environment was imported, ElevenLabs' AI generated four options, each sounding like a car driving on a gravel road. While applying sound effects to clips is interesting, the true potential may lie in integrating this capability into larger systems, where it could have a greater impact.

As the popularity of AI video generation continues to rise, ElevenLabs may continue to explore new audio solutions to meet the growing needs of developers, filmmakers, and creators, maintaining its leading position in the industry.
