    Meet OmniVoice Studio: A Local, Open-Source Alternative to ElevenLabs

    May 26, 2026 · 4 Mins Read


    OmniVoice Studio — How to Use It

    What Is OmniVoice Studio?

    OmniVoice Studio is an open-source desktop application for voice cloning, video dubbing, real-time dictation, and speaker diarization. Everything runs locally on your machine. No API keys, no cloud account, no subscription required.

    • 646 languages supported for TTS via the default OmniVoice engine
    • 99 languages for transcription via WhisperX
    • Available on macOS, Windows, and Linux
    • GPU is optional — full pipeline runs on CPU
    • Free for personal, educational, and research use (FSL-1.1-ALv2)


    System Requirements

    A GPU is optional. Without one, TTS runs approximately 3× slower on CPU. With ≤8 GB VRAM, TTS automatically offloads to CPU during transcription — no config needed.

    Component | Minimum                            | Recommended
    OS        | Win 10 / macOS 12+ / Ubuntu 20.04+ | Any modern 64-bit OS
    RAM       | 8 GB                               | 16 GB+
    VRAM      | 4 GB (auto-offloads)               | 8 GB+ (RTX 3060+)
    Disk      | 10 GB free                         | 20 GB+ SSD
    Python    | 3.10+                              | 3.11–3.12
    GPU       | Optional                           | CUDA / MPS / ROCm
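
The Python and disk minimums from the table can be checked before installing. The following is a minimal sketch using only the standard library; the constant and function names are illustrative and not part of OmniVoice Studio. RAM and VRAM are left out because the standard library has no portable way to query them.

```python
import shutil
import sys

# Minimums taken from the requirements table; names here are illustrative.
MIN_PYTHON = (3, 10)
MIN_FREE_DISK_GB = 10


def python_ok(version=None):
    """Return True if the interpreter meets the minimum Python version."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def disk_ok(path=".", free_bytes=None):
    """Return True if at least MIN_FREE_DISK_GB is free at `path`."""
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    return free_bytes >= MIN_FREE_DISK_GB * 1024**3


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("10 GB free disk:", disk_ok())
```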


    Installation

    The project recommends running from source. Install three prerequisites first: ffmpeg, Bun (JS runtime), and uv (Python package manager).
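
A quick way to confirm the three prerequisites are installed is to look for them on PATH. This sketch assumes the executables are named ffmpeg, bun, and uv, which are the usual command names for these tools; the helper itself is hypothetical.

```python
import shutil

# Assumed executable names for the three prerequisites.
PREREQUISITES = ("ffmpeg", "bun", "uv")


def missing_tools(tools=PREREQUISITES, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Install before continuing:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

The `which` parameter exists only so the lookup can be replaced in tests.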

    git clone https://github.com/debpalash/OmniVoice-Studio.git
    cd OmniVoice-Studio
    uv sync
    bun install
    bun dev

    Frontend loads at http://localhost:5173  |  API runs on port 8000. Model weights download automatically on first generation.
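
If the page does not load, it can help to check whether anything is listening on the two ports mentioned above. This is a small sketch that only tests whether a TCP connection succeeds; it does not verify that the listener is actually OmniVoice Studio.

```python
import socket


def port_open(port, host="localhost", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports from the text above: 5173 for the frontend, 8000 for the API.
    for name, port in (("frontend", 5173), ("API", 8000)):
        state = "up" if port_open(port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```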

    Pre-built installers are also available as a macOS DMG, a Windows MSI, and a Linux AppImage and .deb; see the Releases page on GitHub.


    Voice Cloning

    Voice cloning uses zero-shot learning — it clones a voice from a clip as short as 3 seconds, without prior training on that voice. The default OmniVoice engine conditions a diffusion-based TTS model on the reference audio.

    • Go to the Voice Clone tab in the UI
    • Upload or record a reference clip of the target voice (3 seconds is enough)
    • Enter your text and select a target language (646 available)
    • Click Generate — output is saved to your project library

    Voice Gallery: Search YouTube, browse categories, and download reference clips directly inside the app to build your voice library.
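
Since cloning needs a reference clip of at least about 3 seconds, a clip's length can be checked before uploading. The sketch below handles uncompressed WAV files only, using the standard `wave` module; the threshold constant and function names are illustrative, and the app may accept other formats.

```python
import wave

MIN_CLIP_SECONDS = 3.0  # shortest reference clip mentioned in the text


def wav_duration_seconds(source):
    """Return the duration of a WAV file given a path string or binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def long_enough(source, minimum=MIN_CLIP_SECONDS):
    """Return True if the clip is at least `minimum` seconds long."""
    return wav_duration_seconds(source) >= minimum
```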


    Video Dubbing

    The full dubbing pipeline runs locally: transcribe → translate → synthesize → mux. Demucs isolates vocals so the original background audio is preserved in the final export.

    • Go to the Dub tab — paste a YouTube URL or upload a local file
    • WhisperX transcribes speech with word-level alignment
    • Select a target language; translation runs automatically
    • TTS engine re-voices the transcript; Demucs preserves background audio
    • Export the final MP4 with dubbed audio mixed in

    Batch Queue: Drop up to 50 videos and walk away. Each job has its own progress bar tracking through the full pipeline.
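
The stage order and the batch limit described above can be modeled in a few lines. This is a conceptual sketch, not the project's code: `DubJob`, `run_job`, and `run_batch` are hypothetical names, and the real stages (WhisperX, translation, TTS, Demucs, muxing) are replaced by caller-supplied functions.

```python
from dataclasses import dataclass, field

STAGES = ("transcribe", "translate", "synthesize", "mux")  # order from the text
MAX_BATCH = 50  # batch limit from the text


@dataclass
class DubJob:
    """One video moving through the pipeline (hypothetical structure)."""
    source: str
    completed: list = field(default_factory=list)

    @property
    def progress(self):
        """Fraction of stages finished, from 0.0 to 1.0."""
        return len(self.completed) / len(STAGES)


def run_job(job, handlers):
    """Run each stage in order, passing the previous stage's output forward."""
    data = job.source
    for stage in STAGES:
        data = handlers[stage](data)
        job.completed.append(stage)
    return data


def run_batch(sources, handlers):
    """Run up to MAX_BATCH jobs one after another; return (job, result) pairs."""
    if len(sources) > MAX_BATCH:
        raise ValueError(f"batch limit is {MAX_BATCH} videos")
    results = []
    for source in sources:
        job = DubJob(source)
        results.append((job, run_job(job, handlers)))
    return results
```

Each job's `progress` value is what a per-job progress bar would display as the stages complete.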


    Dictation & Speaker Diarization

    Dictation works system-wide from any application. Diarization identifies individual speakers in a multi-speaker audio file using Pyannote + WhisperX.

    • Press ⌘+⇧+Space (macOS) to open the floating dictation widget
    • Speech streams via WebSocket and auto-pastes into the active input field
    • Upload a multi-speaker file to the Diarization tab
    • Pyannote identifies who said what; each speaker gets an auto-extracted voice profile
    • Assign a TTS voice per speaker for per-speaker dubbing

    Hugging Face token required for Pyannote diarization. See docs/setup/huggingface-token.md in the repo.
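
To illustrate how "who said what" can be derived, the sketch below labels each timestamped word with the speaker turn that overlaps it most. It is a simplified stand-in for what Pyannote and WhisperX do together, not their actual implementation; the tuple layouts and function names are assumptions made for the example.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """Label each word with the speaker whose turn overlaps it the most.

    words: list of (text, start, end) tuples, e.g. from word-level alignment.
    turns: list of (speaker, start, end) tuples, e.g. from diarization.
    Words that overlap no turn are labeled None.
    """
    labeled = []
    for text, w_start, w_end in words:
        best_speaker, best_overlap = None, 0.0
        for speaker, t_start, t_end in turns:
            shared = overlap(w_start, w_end, t_start, t_end)
            if shared > best_overlap:
                best_speaker, best_overlap = speaker, shared
        labeled.append((best_speaker, text))
    return labeled
```

Grouping the labeled words by speaker then yields the per-speaker transcript that a voice can be assigned to.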


    TTS Engines

    Six TTS engines are built in. Switch via Settings → TTS Engine or the env var: OMNIVOICE_TTS_BACKEND=cosyvoice

    Engine              | Languages       | Clone  | Platform
    OmniVoice (default) | 600+            | ✓      | CUDA / MPS / CPU
    CosyVoice 3         | 9 + 18 dialects | ✓      | CUDA / MPS / CPU
    MLX-Audio           | Multi           | Varies | Apple Silicon only
    VoxCPM2             | 30              | ✓      | CUDA / MPS / CPU
    MOSS-TTS-Nano       | 20              | ✓      | CUDA / CPU
    KittenTTS           | English         | ✗      | CPU only
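
When choosing an engine, the table can be read as data. The sketch below copies its clone and platform columns into a list and filters for engines marked as clone-capable on a given platform. The engine names are display names from the table, not necessarily the values the app expects in settings.

```python
# Rows copied from the engine table: (engine, supports cloning, platforms).
# None means the table lists cloning support as "Varies".
ENGINES = [
    ("OmniVoice", True, {"CUDA", "MPS", "CPU"}),
    ("CosyVoice 3", True, {"CUDA", "MPS", "CPU"}),
    ("MLX-Audio", None, {"Apple Silicon"}),
    ("VoxCPM2", True, {"CUDA", "MPS", "CPU"}),
    ("MOSS-TTS-Nano", True, {"CUDA", "CPU"}),
    ("KittenTTS", False, {"CPU"}),
]


def cloning_engines(platform):
    """Return engines the table marks as clone-capable on the given platform."""
    return [
        name
        for name, clone, platforms in ENGINES
        if clone is True and platform in platforms
    ]
```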

    Custom engine: Subclass TTSBackend in backend/services/tts_backend.py and add it to _REGISTRY. ~50 lines of Python.
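
The general shape of a pluggable backend with a registry might look like the sketch below. Everything here is a stand-in: the real `TTSBackend` interface and `_REGISTRY` structure in backend/services/tts_backend.py are not shown in this article and may differ, and the `synthesize` method, `SilenceBackend`, and `get_backend` are hypothetical. Only the `OMNIVOICE_TTS_BACKEND` variable name comes from the text.

```python
import os
from abc import ABC, abstractmethod


class TTSBackend(ABC):
    """Stand-in base class; the real interface in tts_backend.py may differ."""

    @abstractmethod
    def synthesize(self, text: str, language: str) -> bytes:
        """Return synthesized audio for `text` (hypothetical method)."""


class SilenceBackend(TTSBackend):
    """Toy engine that returns one second of 16-bit mono silence at 16 kHz."""

    def synthesize(self, text: str, language: str) -> bytes:
        return b"\x00\x00" * 16000


# Hypothetical registry mapping backend names to classes.
_REGISTRY = {"silence": SilenceBackend}


def get_backend(default="silence", environ=os.environ):
    """Pick a backend by name from OMNIVOICE_TTS_BACKEND, else use `default`."""
    name = environ.get("OMNIVOICE_TTS_BACKEND", default)
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown TTS backend: {name!r}") from None
```

Keeping selection behind a registry lookup means a new engine only needs a subclass and one dictionary entry, which fits the "~50 lines" estimate above.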


    MCP Server & Resources

    OmniVoice Studio ships a built-in MCP Server, exposing voice and dubbing capabilities to any MCP-compatible client — Claude, Cursor, or your own tooling — without opening the desktop UI.

    • MCP Server starts alongside the FastAPI backend on bun dev
    • Point your MCP client at the local server to access all endpoints
    • AudioSeal (Meta) embeds an invisible neural watermark in all generated audio for AI provenance
    • GitHub: github.com/debpalash/OmniVoice-Studio
    • Install docs: docs/install/ (macos / windows / linux / docker)
    • Troubleshooting: docs/install/troubleshooting.md
    • Discord: discord.gg/bzQavDfVV9
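
For readers writing their own MCP tooling, the first message a client sends is a JSON-RPC 2.0 `initialize` request. The sketch below only builds that message. The server's address and transport are not stated in this article, so nothing is sent, and the protocol version string is one published MCP revision chosen as an assumption.

```python
import json


def initialize_request(client_name, client_version, request_id=1):
    """Build the JSON-RPC 2.0 `initialize` request an MCP client sends first."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # Assumed revision; use whichever version your client supports.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }


if __name__ == "__main__":
    print(json.dumps(initialize_request("my-tool", "0.1.0"), indent=2))
```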


