MyShell Open-Sources OpenVoice: An Instant Voice Cloning AI Library that Takes a Short …
MyShell has open-sourced OpenVoice, a voice-cloning AI library that rapidly generates human-sounding voices from only a short audio clip. Introduced in December 2023, OpenVoice is designed to let non-technical users customize voices for videos, audio, games, and other creative projects. It can also be integrated into other software using Python code.
OpenVoice’s creators believe it has the potential to revolutionize how people interact with digital content. “Text-to-speech was a profound breakthrough for making text accessible to people who cannot read,” the company said in a February 14 blog post. “If the past was about reading, the future will be about listening. OpenVoice will enable creators to breathe life into stories and digital characters like never before.”
The AI library features several voice models that users can select from, and it supports 13 languages and dialects, including English (multiple accents), Spanish, Chinese, French, German, Italian, Japanese, Korean, Hindi, Russian, and Portuguese (two dialects).
OpenVoice can generate three different types of voice audio:
* **TTS (text-to-speech):** Converts written text into a spoken voice.
* **SS (speech synthesis):** Synthesizes a voice speaking a foreign language.
* **VC (voice cloning):** Clones a voice from a user-provided short audio clip.
OpenVoice’s voice cloning works by converting raw audio waveforms into mel-spectrograms, which are then encoded into embeddings that represent the characteristics of the speaker's voice.
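To make the waveform-to-embedding pipeline more concrete, here is a minimal NumPy sketch of the general technique. It is not OpenVoice's code: the function names, the parameter values (16 kHz sample rate, 512-point FFT, 40 mel bands), and especially `toy_embedding`, which mean-pools the spectrogram in place of a trained neural encoder, are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style formula for the mel scale.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sample_rate, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, center, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mel_spectrogram(waveform, sample_rate=16000, n_fft=512, hop=128, n_mels=40):
    """Log-mel spectrogram of shape (n_frames, n_mels)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(waveform) - n_fft) // hop
    frames = np.stack(
        [waveform[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft // 2 + 1)
    mel = power @ mel_filterbank(sample_rate, n_fft, n_mels).T
    return np.log(mel + 1e-10)  # small offset avoids log(0)

def toy_embedding(log_mel):
    """Hypothetical stand-in for a learned encoder.

    Mean-pools over time and L2-normalizes; a real system would use a
    trained neural network here.
    """
    pooled = log_mel.mean(axis=0)
    return pooled / np.linalg.norm(pooled)

# One second of a 220 Hz sine wave stands in for a recorded voice clip.
sr = 16000
t = np.arange(sr) / sr
waveform = 0.5 * np.sin(2 * np.pi * 220 * t)

log_mel = mel_spectrogram(waveform, sr)
embedding = toy_embedding(log_mel)
print(log_mel.shape, embedding.shape)
```

The sketch only shows the shape of the data at each stage: a one-dimensional waveform becomes a time-by-frequency matrix, which is then reduced to a single fixed-length vector. In a real voice-cloning system, the quality of the clone depends on the learned encoder that replaces the pooling step.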
To use OpenVoice, either install the library via pip or clone the repository from GitHub. The library also includes a Python API with functions for cloning and synthesizing speech, changing the speaking rate and pitch, and adding background noise using the built-in sound effects library.
OpenVoice is still under development, and MyShell encourages developers to submit feature requests or report issues on GitHub and to join its Discord server for discussions. The company says it is committed to maintaining the library and to adding more voice models, languages, and export formats.
MyShell is a software company specializing in audio and speech artificial intelligence. It provides services for audio restoration and enhancement, voice cloning, text-to-speech, and more.