Stunning new AI adds realistic sound effects to any video

[Following the coverage of OpenAI’s Sora software, which lets users generate realistic video from text prompts, the AI voice company ElevenLabs announced new software that adds realistic sound effects to accompany the video, surely enhancing the presence experience of viewers/listeners. This story from New Atlas includes the details and a one-minute video demonstration (also available via YouTube). –Matthew]

[Image: “What if you could describe a sound and generate it with AI?” teases ElevenLabs as AI-generated sound effects and dialog are added to video footage generated by Sora from OpenAI. Credit: ElevenLabs.]

Stunning new AI adds realistic sound effects to any video

By Paul Ridden
February 19, 2024

Last week, OpenAI released a new AI model called Sora that could generate high-resolution video clips from text prompts. But they’re all essentially clever silent films. Now ElevenLabs has added background sounds to Sora-created footage.

AI voice cloning startup ElevenLabs was co-founded by former Google machine-learning engineer Piotr Dabkowski and ex Palantir deployment strategist Mati Staniszewski in 2022, and has since launched AI-powered text-to-speech software and an AI dubbing tool designed to auto translate speech from a video into more than 20 languages that “maintains the original voice tone and style.”

Now the company is working on something new, which can reportedly generate sounds to accompany otherwise silent video footage based on descriptions of a scene given by a user.

It’s a sound effects/foley team in a box, and to demonstrate its prowess, ElevenLabs has unleashed it on some Sora-generated content.

“We used text prompts like ‘waves crashing,’ ‘metal clanging,’ ‘birds chirping,’ and ‘racing car engine’ to generate audio that we overlaid onto some of our favorite clips from the OpenAI Sora announcement,” explained the company in a blog post.

The nitty gritty of Sound Effects by ElevenLabs has yet to be revealed, but the demo shows a bunch of Sora-generated video clips accompanied by fairly realistic background sounds – from footsteps on a busy street along with the hum of the city to the beeps and mechanical drone of a bipedal robot of the future to a film-like narrative in a Hollywood-style promo voice. All this apparently from text-to-audio prompts.

As with Sora, there will doubtless be some kinks that need working out, as well as protections against fraud and safety protocols to cook in, but with the pace of AI development so rapid, can we perhaps expect the Oscars for best everything to be awarded to an AI in the near future? Interesting (and possibly scary) times ahead.

There’s been no word as yet on when we can expect the Sound Effects tech to land, but folks interested in learning more are being invited to register their interest.


Source: ElevenLabs (X/Twitter)

This entry was posted in Presence in the News. Bookmark the permalink. Trackbacks are closed, but you can post a comment.

Post a Comment

Your email is never published nor shared. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

*
*

  • Find Researchers

    Use the links below to find researchers listed alphabetically by the first letter of their last name.

    A | B | C | D | E | F| G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z