
New AI-Based System Helps Convert Text to Audio Clips

Generative Artificial Intelligence (AI) systems will spark an explosion of creativity in the music sector and beyond, according to scientists from the University of Surrey, who are inviting the public to test their new text-to-audio model.


Image Credit: Getty

AudioLDM is a new AI-based system from Surrey that lets users submit a text prompt, which the system then uses to generate a corresponding audio clip. It can process prompts and deliver clips using less computational power than current AI systems, without compromising sound quality or users’ ability to manipulate the clips.

Members of the public can try out AudioLDM by visiting its Hugging Face space. The code is also open-sourced on GitHub, where it has earned more than 1,000 stars.

Such a system could be used by sound designers in applications such as film-making, digital art, game design, the metaverse, virtual reality, and digital assistants for the visually impaired.

Generative AI has the potential to transform every sector, including music and sound creation. With AudioLDM, we show that anyone can create high-quality and unique samples in seconds with very little computing power.

Haohe Liu, Study Project Lead, University of Surrey

Liu stated, “While there are some legitimate concerns about the technology, there is no doubt that AI will open doors for many within these creative industries and inspire an explosion of new ideas.”

Surrey’s open-sourced model is built in a semi-supervised fashion using a method known as Contrastive Language-Audio Pretraining (CLAP). The CLAP method allows AudioLDM to be trained on enormous amounts of audio data without text labels, considerably increasing model capacity.
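The core idea behind CLAP-style pretraining is to train an audio encoder and a text encoder so that matching audio–text pairs land close together in a shared embedding space, using a symmetric contrastive loss. The sketch below illustrates that loss with toy linear encoders and random data; the encoder weights, dimensions, and temperature value are illustrative stand-ins, not the actual AudioLDM/CLAP implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

B, D_in, D = 4, 16, 8  # batch size, raw feature size, embedding size (toy values)
W_audio = rng.normal(size=(D_in, D))  # stand-in for a trained audio encoder
W_text = rng.normal(size=(D_in, D))   # stand-in for a trained text encoder


def encode(batch, weights):
    """Project raw features and L2-normalize, as contrastive encoders do."""
    emb = batch @ weights
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)


def clap_contrastive_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss: each audio clip should match its own caption,
    and each caption should match its own audio clip."""
    # Entry (i, j) is the scaled cosine similarity of audio i and text j.
    logits = (audio_emb @ text_emb.T) / temperature
    # Row-wise log-softmax: audio-to-text classification over the batch.
    log_p_rows = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Column-wise log-softmax: text-to-audio classification over the batch.
    log_p_cols = logits - np.log(np.exp(logits).sum(axis=0, keepdims=True))
    diag = np.arange(len(logits))  # matched pairs sit on the diagonal
    loss_a2t = -log_p_rows[diag, diag].mean()
    loss_t2a = -log_p_cols[diag, diag].mean()
    return (loss_a2t + loss_t2a) / 2


audio_features = rng.normal(size=(B, D_in))  # toy "audio" batch
text_features = rng.normal(size=(B, D_in))   # toy "caption" batch

loss = clap_contrastive_loss(
    encode(audio_features, W_audio), encode(text_features, W_text)
)
print(f"contrastive loss on random data: {loss:.4f}")
```

Minimizing this loss pulls each audio clip toward its own caption and pushes it away from the other captions in the batch, which is what lets a model trained this way relate free-form text prompts to sound without per-clip labels.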

What makes AudioLDM special is not just that it can create sound clips from text prompts, but that it can create new sounds based on the same text without requiring retraining.

Wenwu Wang, Professor in Signal Processing and Machine Learning, University of Surrey

Wang added, “This saves time and resources since it doesn't require additional training. As generative AI becomes part and parcel of our daily lives, it's important that we start thinking about the energy required to power up the computers that run these technologies. AudioLDM is a step in the right direction."
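The no-retraining property Wang describes comes from how latent diffusion models sample: the text prompt is encoded once into a fixed conditioning vector, and each new generation simply starts from a fresh draw of random latent noise. The toy sketch below illustrates that separation of conditioning and noise; the mixing step is a deliberately simplified stand-in for the real iterative denoising process.

```python
import numpy as np


def toy_generate(prompt_embedding, seed):
    """Toy stand-in for diffusion sampling: the prompt embedding is fixed,
    and only the random starting latent changes between samples."""
    rng = np.random.default_rng(seed)
    latent_noise = rng.normal(size=prompt_embedding.shape)
    # Real models run many denoising steps; this single blend just shows
    # that the output depends on both the prompt and the noise seed.
    return 0.8 * prompt_embedding + 0.2 * latent_noise


# One fixed conditioning vector, as if encoded from a single text prompt.
prompt_embedding = np.ones(8)

clip_a = toy_generate(prompt_embedding, seed=1)
clip_b = toy_generate(prompt_embedding, seed=2)

# Same prompt, different seeds: two distinct samples, no retraining involved.
print("samples differ:", not np.allclose(clip_a, clip_b))
```

Because only the cheap noise draw changes between samples, generating many variations of one prompt costs a fraction of what retraining or fine-tuning a model would.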

The user community has already created a range of music clips in various genres with AudioLDM.

AudioLDM is a research demonstration project and relies on the current UK copyright exception for data mining in non-commercial research.

