Unlocking Multilingual Content: AI dubbing takes centre stage

The tool transcribes the original dialogue into text using a speech-to-text model


ElevenLabs, a start-up founded by former Google and Palantir employees, has unveiled a new feature called AI Dubbing. The product can translate spoken content into more than 20 languages and is set to revolutionise the world of audio and video content dubbing.

This release comes as a breath of fresh air for content creators, especially those who have struggled with the manual translation process for years. AI Dubbing offers a game-changing solution, breaking down language barriers and enabling even smaller content creators to reach global audiences without the need for expensive manual translation services.

What’s particularly impressive about this tool is its ability to provide high-quality translated audio rapidly, all while preserving the original speaker’s voice, including their emotional nuances and intonation. This means that content creators can connect with audiences around the world in a way that feels authentic to their original delivery.

How AI Dubbing Operates

The process of AI-driven translation involves several intricate stages, from eliminating background noise to translating the spoken content. However, for end-users, the experience is remarkably straightforward. Here’s how it works:

Accessing the AI Dubbing Tool: Users begin by selecting the AI Dubbing tool on the company’s platform. They create a new project, choose the source and target languages, and upload the content file.

Automated Processing: Once the content is uploaded, the tool takes over, automatically detecting the number of speakers. Users can track the progress with a status bar that appears on the screen. This operation is akin to using any other online conversion tool. Upon completion, users can download the processed file.
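The upload-track-download workflow above can be sketched as a simple polling loop. The function names and return values below are hypothetical stand-ins for the platform's real API, chosen only to illustrate the flow; they are not ElevenLabs' actual SDK calls.

```python
import time

# Hypothetical client calls standing in for the platform's real API.
def create_dubbing_project(source_lang, target_lang, file_path):
    # A real integration would upload the file and return a project record.
    return {"id": "proj_001", "status": "processing", "progress": 0}

def poll_status(project):
    # Simulates the on-screen status bar advancing to completion.
    project["progress"] = min(100, project["progress"] + 50)
    if project["progress"] == 100:
        project["status"] = "done"
    return project

def download_result(project):
    # Would fetch the dubbed file once processing finishes.
    return f"dubbed_{project['id']}.mp4"

project = create_dubbing_project("en", "es", "talk.mp4")
while project["status"] != "done":
    project = poll_status(project)
    time.sleep(0)  # real code would wait between polls

output_file = download_result(project)
print(output_file)
```

As with any online conversion tool, the user's only real responsibilities are choosing the languages, uploading the file, and collecting the result once the status reaches completion.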

Behind the Scenes: The AI Dubbing tool leverages the organisation’s proprietary technology to perform several critical functions like eliminating background noise and distinguishing between music, noise, and the actual spoken dialogue of the speakers. It recognises which speakers are talking, preserving their distinct voices. The tool transcribes the original dialogue into text using a speech-to-text model.

This text is then translated and adjusted to match the length and context of the content in the target language, while the speaker's original voice characteristics are retained. Finally, the translated speech is recombined with the music and background noise that were separated earlier, producing a dubbed output ready for immediate use.
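The stages described above can be summarised as a linear pipeline: separate the audio, identify speakers, transcribe, translate, re-voice, and remix. The sketch below uses stub functions with placeholder outputs; the real system relies on ElevenLabs' proprietary models at each stage, so every name and return value here is an illustrative assumption.

```python
# A simplified, hypothetical sketch of the dubbing pipeline. Each stage
# is a stub; the real system uses proprietary models throughout.

def separate_audio(mix):
    # Split the mix into speech vs. music/background noise.
    return {"speech": f"speech({mix})", "background": f"bg({mix})"}

def diarise(speech):
    # Detect who speaks when, preserving distinct voices.
    return [{"speaker": "A", "audio": speech}]

def transcribe(segment):
    # Speech-to-text on the original dialogue.
    return {"speaker": segment["speaker"], "text": "hello world"}

def translate(transcript, target_lang):
    # Translate and adjust length/context for the target language.
    return {**transcript, "text": f"[{target_lang}] hello world"}

def synthesise(translated):
    # Re-voice the translated text with the original speaker's traits.
    return f"voice_{translated['speaker']}:{translated['text']}"

def remix(dubbed_speech, background):
    # Recombine dubbed speech with the separated music/noise.
    return {"track": dubbed_speech, "background": background}

def dub(mix, target_lang):
    parts = separate_audio(mix)
    segments = diarise(parts["speech"])
    dubbed = [synthesise(translate(transcribe(s), target_lang))
              for s in segments]
    return remix(dubbed, parts["background"])

result = dub("input.wav", "es")
print(result["track"][0])
```

The key design point the article describes is that translation operates on text in the middle of the pipeline, while voice identity and the background track are carried around it and reattached at the end.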

The platform considers this process the culmination of its extensive research in voice cloning, text and audio processing, and multilingual speech synthesis. The company supports more than 20 languages, including Hindi, Portuguese, Spanish, Japanese, Ukrainian, Polish, and Arabic. This means that content can be globalised with ease, opening up new possibilities for creators worldwide.

This advancement marks a significant step forward in AI-driven voice and speech synthesis. While the company is a leader in this field, it’s important to note that other companies, such as MURF.AI, Play.ht, and WellSaid Labs, are also making strides in this area. Moreover, Meta has recently launched SeamlessM4T, an open-source model that can understand and generate translations in real-time for nearly 100 languages from speech or text.

The global market for AI tools like this is on the rise: research firm Market US estimates it will be worth nearly $5 billion by 2032, growing at a compound annual growth rate of just over 15.4%. This underlines the growing importance of AI in language and content accessibility.
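For readers unfamiliar with how a CAGR figure translates into a market projection, the arithmetic is simple compounding. The article gives only the 2032 estimate (~$5 billion) and the growth rate (~15.4%); the base year and base value below are hypothetical, chosen purely to illustrate how compounding reaches a figure of that order.

```python
# Illustrative compound-growth arithmetic. Base year and value are
# assumptions for the sketch, not figures from the article.
base_value_bn = 1.4   # hypothetical 2023 market size, in $bn
cagr = 0.154          # ~15.4% compound annual growth rate
years = 2032 - 2023

projected = base_value_bn * (1 + cagr) ** years
print(round(projected, 2))  # lands near the ~$5bn estimate under these assumptions
```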

Shalini is an Executive Editor with Apeejay Newsroom. With a PG Diploma in Business Management and Industrial Administration and an MA in Mass Communication, she was formerly an Associate Editor with News9live. She has worked on varied topics, from news-based to feature articles.