Audio Transcription Live

✨ AI-Powered
👥 57 users
📦 v3.1.0
💾 83.2KiB
📅 2026-05-18

Overview

Audio Transcription is an extension that turns your browser into a real-time interpreter. It either captures the audio playing in a tab and transcribes it with Whisper AI, or reads a video's existing subtitles. It then translates the text live and reads the result back to you via Text-to-Speech (TTS).

🌟 Designed with privacy and efficiency in mind, it is optimized to run smoothly on low-resource computers and operates as independently from cloud services as possible. Compatible with Linux, Windows, and macOS, this extension acts as a true Live Interpreter for any media stream.

✨ Key Features:

• 🎬 Subtitle TTS Mode: Read aloud and translate existing subtitles from YouTube, Twitch, or any HTML5 video without needing a local server.
• 🌐 Source Language Control: Rely on smart auto-detection, or manually select the subtitle language for maximum accuracy.
• 🗣️ Real-Time Speech-to-Speech: Listen to live translations with a natural, fluid voice that buffers complete sentences for a seamless experience.
• 📝 Live Audio Transcription: Fast, accurate transcription from scratch, run on your own machine with OpenAI's Whisper AI (a local WhisperLive server is required).
• 🤖 Instant Translation: Translate live text using Google Translate (free) or the latest Google Gemini (Flash-Lite) & Gemma 4 AI models.
• 🖼️ Flexible UI Modes: View transcripts in a floating overlay or a dedicated Standalone popup window.
• 🛡️ Total Privacy: Audio is processed locally, and the code is transparent and open source.

⚙️ SERVER INSTRUCTIONS & SOURCE CODE:

The Subtitle TTS mode works completely out-of-the-box. However, to use the advanced "Live Audio Transcription" feature, you must run the local WhisperLive server on your computer.

Get the server scripts and detailed setup instructions at:
https://github.com/antor44/Audio-Transcription

⚖️ LICENSE:

This is a free and open-source project distributed under the GNU General Public License v3.0 (GPL-3.0). For more details, visit the GitHub repository.

---
🆕 WHAT'S NEW IN VERSION 3.1.0:
• 🎤 Smart TTS: The voice engine now intelligently waits for sentence boundaries (periods), creating a much more natural and less choppy listening experience.
• ⭐ Language Selector: Added an optional 'Source Language' menu in Subtitle TTS mode to fix auto-detection edge cases.
• 🤖 AI Update: Cleaned up deprecated models and added support for the new Gemma 4 generation.
• 🐞 Bug Fixes: Fixed initial auto-detect hangs, stopped short phrases from being skipped, and fixed the "Stop" button state when tabs are closed.
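
The sentence-boundary behavior described under Smart TTS can be illustrated with a minimal sketch. This is not the extension's actual code: the `SentenceBuffer` class and its `push` and `flush` methods are hypothetical names chosen for illustration, and the boundary rule (a run of `.`, `!`, or `?` followed by whitespace or the end of the buffered text) is an assumption. It does not handle abbreviations or decimals split across fragments.

```typescript
// Hypothetical sketch: collect streaming text fragments and release
// only complete sentences, so a TTS engine never speaks half a phrase.
class SentenceBuffer {
  private pending = "";

  // Append a fragment and return every complete sentence now available.
  push(fragment: string): string[] {
    this.pending += fragment;
    const sentences: string[] = [];
    // Assumed rule: a sentence ends at '.', '!' or '?' followed by
    // whitespace or by the end of the buffered text.
    const boundary = /[.!?]+(?=\s|$)/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      const sentence = this.pending.slice(start, end).trim();
      if (sentence.length > 0) {
        sentences.push(sentence);
      }
      start = end;
    }
    // Keep the unfinished tail for the next fragment.
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Return whatever is left (for example, when the stream stops).
  flush(): string | null {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? rest : null;
  }
}
```

In a browser, each returned sentence could then be handed to a speech engine, for example the Web Speech API's `speechSynthesis.speak(new SpeechSynthesisUtterance(sentence))`. Calling `flush()` when playback stops is one way to keep a short trailing phrase from being dropped.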

Tags

Productivity/communication, video

Privacy Practices

The developer declares that user data is:
• Not being sold to third parties, outside of the approved use cases
• Not being used or transferred for purposes that are unrelated to the item's core functionality
• Not being used or transferred to determine creditworthiness or for lending purposes
