Stability AI’s new audio model creates even longer songs – here’s how to try it for free

Sabrina Ortiz/ZDNET

Generative AI audio models have drawn widespread industry attention lately, with OpenAI this week sharing the latest updates on its own model, Voice Engine. Stability AI has now joined the trend, revealing its most advanced audio model to date.

On Wednesday, Stability AI, the open-source AI company best known for its Stable Diffusion model, unveiled Stable Audio 2.0. The new model offers significant upgrades over its predecessor, Stable Audio 1.0, that go beyond text-to-audio generation.

Introducing Stable Audio 2.0 – a new model capable of producing high-quality, full tracks with coherent musical structure up to three minutes long at 44.1 kHz stereo from a single prompt.
Explore the model and start creating for free at: https://t.co/E9ZIGagmPf

— Stability AI (@StabilityAI) April 3, 2024

Stable Audio 2.0 has audio-to-audio capabilities, which let users upload audio samples and turn them into a wide array of sounds using natural language prompts. With style transfer, you can modify generated or uploaded audio to match a specific style and tone.

To protect creative integrity and artists’ rights, uploads must be free of copyrighted material. The company uses content recognition technology from Audible Magic to prevent copyright infringement and ensure users are compliant.

To further protect artists, Stable Audio 1.0 and Stable Audio 2.0 were trained on a dataset from AudioSparx that consists of more than 800,000 audio files, and AudioSparx artists were given the option to opt out of Stable Audio model training.

The new model can also produce tracks up to three minutes long at 44.1 kHz stereo, a significant upgrade over Stable Audio 1.0, which could only produce tracks up to 45 seconds long. A three-minute generation can include the elements of a full song, such as melodies, backing tracks, and sound effects.

The model is already publicly available for free use on the Stable Audio website. Getting started is easy: Visit the site, log in with your Stable AI or Google account, and start tinkering. 
