Voice-to-text for Linux using Whisper AI. Hold a keyboard shortcut, speak, and release to paste the transcription.
- Two modes: CLI (terminal) or GUI (system tray with visual indicator)
- Whisper AI transcription: Offline, private speech recognition
- Wayland native: Built for modern Linux desktop (KDE Plasma, GNOME, etc.)
- Configurable: Choose AI models, languages, keyboard shortcuts
- GPU accelerated: Automatic CUDA detection with CPU fallback
Required packages:
# Fedora/RHEL
sudo dnf install wl-clipboard ydotool ffmpeg-free-devel
sudo systemctl enable --now ydotoold
# Debian/Ubuntu
sudo apt install wl-clipboard ydotool libavformat-dev libavcodec-dev libavutil-dev
sudo systemctl enable --now ydotoold
# Arch
sudo pacman -S wl-clipboard ffmpeg
yay -S ydotool-git # AUR
sudo systemctl enable --now ydotoold
User permissions:
# Required for keyboard monitoring
sudo usermod -a -G input $USER
Log out and log back in for the group membership to take effect.
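To confirm that the group change is visible to the current session, a small check like the following may help. It is a minimal sketch, not part of talk-it-out; the function name `user_in_group` is made up for this example, and it only inspects the groups of the process that runs it.

```python
import grp
import os


def user_in_group(group_name: str = "input") -> bool:
    """Return True if the current process belongs to the named group."""
    try:
        target_gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this system at all.
        return False
    # os.getgroups() lists supplementary groups; include the effective gid too.
    return target_gid in set(os.getgroups()) | {os.getegid()}


if __name__ == "__main__":
    if user_in_group("input"):
        print("OK: this session is in the 'input' group")
    else:
        print("Not in 'input' yet: run usermod, then log out and back in")
```

Because group membership is fixed when a session starts, this reports the old state until you log in again.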
# Install with uv (recommended)
uv pip install talk-it-out
# Or with pip
pip install talk-it-out
talk-it-out gui
Shows a visual indicator and system tray icon:
- Red indicator: Recording (brightness shows voice volume)
- Blue pulsing indicator: Transcribing
- Hidden: Idle
- System tray: Right-click to quit
- Notifications: Errors shown as desktop notifications
Default shortcut: Meta+Alt (hold to record, release to transcribe and paste)
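The hold-to-record behaviour can be pictured as a small state machine over key events. The sketch below is illustrative only and is not taken from the talk-it-out source; the class `HoldToRecord` and the helper `replay` are hypothetical names, and the key names assume the evdev convention used in the configuration.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Default combo from this README; names follow the evdev convention.
DEFAULT_COMBO: FrozenSet[str] = frozenset({"KEY_LEFTMETA", "KEY_LEFTALT"})


class HoldToRecord:
    """Tracks pressed keys and reports when recording should start or stop."""

    def __init__(self, combo: Iterable[str] = DEFAULT_COMBO) -> None:
        self.combo = frozenset(combo)
        self.pressed: Set[str] = set()
        self.recording = False

    def on_key(self, key: str, is_down: bool) -> str:
        """Feed one key event; returns "start", "stop", or "" for no change."""
        if is_down:
            self.pressed.add(key)
        else:
            self.pressed.discard(key)

        combo_held = self.combo <= self.pressed
        if combo_held and not self.recording:
            self.recording = True
            return "start"
        if not combo_held and self.recording:
            self.recording = False
            return "stop"  # a real tool would transcribe and paste here
        return ""


def replay(events: List[Tuple[str, bool]]) -> List[str]:
    """Run (key, is_down) events through a fresh machine; keep non-empty actions."""
    machine = HoldToRecord()
    return [a for a in (machine.on_key(k, d) for k, d in events) if a]
```

Recording starts once every key in the combo is down and stops as soon as any one of them is released; other keys pressed in between do not interrupt it.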
talk-it-out run
Runs in the terminal with the same keyboard shortcuts.
On first startup, Whisper downloads the AI model (~809MB for the default "turbo" model). This happens once and requires an internet connection. Subsequent runs use the cached model offline.
Config file location: ~/.config/talk-it-out/config.toml
Edit configuration:
talk-it-out config-edit
Minimal configuration example:
[keys.combos]
record_for_paste = [["KEY_LEFTMETA", "KEY_LEFTALT"]]
[whisper]
model = "turbo" # tiny, base, small, turbo, medium, large
language = "en" # Language code or "" for auto-detect
[output]
strategy = "wl-clip-simplepaste"
Define shortcuts using evdev key names:
[keys.combos]
record_for_paste = [
["KEY_LEFTMETA", "KEY_LEFTALT"], # Left Meta + Alt
["KEY_RIGHTMETA", "KEY_RIGHTALT"], # Right Meta + Alt
]
Common key names: KEY_LEFTMETA, KEY_LEFTALT, KEY_LEFTCTRL, KEY_LEFTSHIFT
Choose model size in config (tradeoff between speed and accuracy):
- tiny: Fastest, lowest accuracy (~75MB)
- base: Fast, reasonable accuracy (~140MB)
- small: Good balance (~466MB)
- turbo: Best balance, recommended (~809MB)
- medium: Higher accuracy, slower (~1.5GB)
- large: Best accuracy, slowest (~3GB)
[whisper]
model = "turbo"
language = "en" # Language code or "" for auto-detect
device = "auto" # "auto", "cuda", or "cpu"
compute_type = "auto" # "auto", "int8", "float16", "float32"
beam_size = 5 # Search quality (1-10, higher = better but slower)
vad_filter = true # Skip silence during transcription
[output.wl-clip]
targets = ["clipboard", "primary"] # Which clipboards to populate
ydotool_socket = "" # Custom socket path (usually auto-detected)
[audio]
sample_rate = 16000
channels = 1
device = "" # Empty = default microphone
[logging]
level = "INFO" # DEBUG, INFO, WARNING, ERROR
- Verify input group membership:
groups | grep input
- Reboot after adding to input group
- Check config syntax:
talk-it-out config-edit
- Ensure you're in the input group
- Reboot required after adding to group
- Check /dev/input/event* permissions
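The permission check on the input devices can also be scripted. This is a minimal sketch rather than anything shipped with talk-it-out; `unreadable_input_devices` is a made-up helper that lists the device nodes the current user cannot open for reading.

```python
import glob
import os


def unreadable_input_devices(pattern: str = "/dev/input/event*") -> list[str]:
    """Return device nodes matching the pattern that the current user cannot read."""
    return sorted(p for p in glob.glob(pattern) if not os.access(p, os.R_OK))


if __name__ == "__main__":
    blocked = unreadable_input_devices()
    if blocked:
        print("No read access to:", ", ".join(blocked))
    else:
        print("All matching input devices are readable (or none were found)")
```

If every device is listed as blocked, the session most likely has not picked up the input group yet.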
Verify dependencies:
which wl-copy wl-paste ydotool
systemctl status ydotoold # Should show "active (running)"
Test clipboard manually:
echo "test" | wl-copy
wl-paste # Should output "test"
Test ydotool:
ydotool key 28:1 28:0 # Should send Enter key
If paste still doesn't work:
Some applications require different clipboard targets. Try:
[output.wl-clip]
targets = ["clipboard"] # or ["primary"]
GNOME Terminal and some other terminals prefer the primary selection.
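The manual dependency and clipboard checks can be combined into one script if you run them often. The sketch below assumes `wl-copy` reads from standard input and that `wl-paste --no-newline` prints the clipboard without a trailing newline; the helper names are hypothetical, and the round trip only works inside a Wayland session.

```python
import shutil
import subprocess

REQUIRED_TOOLS = ["wl-copy", "wl-paste", "ydotool"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that are not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]


def clipboard_round_trip(text: str = "test") -> bool:
    """Copy text with wl-copy and confirm wl-paste returns the same text."""
    try:
        subprocess.run(["wl-copy"], input=text, text=True, check=True, timeout=5)
        result = subprocess.run(
            ["wl-paste", "--no-newline"],
            capture_output=True, text=True, check=True, timeout=5,
        )
    except (OSError, subprocess.CalledProcessError, subprocess.TimeoutExpired):
        # Tool missing, no Wayland session, or the compositor did not answer.
        return False
    return result.stdout == text


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Clipboard round trip OK:", clipboard_round_trip())
```

Note that the round trip overwrites whatever is currently on the clipboard.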
The first run requires an internet connection to download the Whisper model. If the download fails:
- Check internet connection
- Verify firewall allows HTTPS to huggingface.co
- For corporate networks, check proxy settings
Python 3.13 users on Fedora 42: If you see SSL errors during model download, this is a known issue with Python 3.13 + OpenSSL 3.5. The application includes a workaround that should resolve this automatically.
Check socket exists and is accessible:
ls -l /run/ydotool/socket
# or
ls -l $YDOTOOL_SOCKET
If the socket is in a different location, configure it:
[output.wl-clip]
ydotool_socket = "/path/to/socket"
Primary support:
- KDE Plasma Wayland
- GNOME Wayland
- Other Wayland compositors
Requirements:
- Wayland compositor
- D-Bus session bus (for notifications in GUI mode)
X11 support:
- Works with the QT_QPA_PLATFORM=xcb environment variable
This project is licensed under the GNU General Public License v3.0 or later (GPLv3+).
See the LICENSE file for details.
Copyright (C) 2025 Ed Ropple
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.