'Whispering' enables free local voice-to-text input, real-time transcription, and transcription from audio files using AI models

Whispering is open-source software that records and transcribes speech at the press of a keyboard shortcut. It runs locally, stores data locally, and copies the transcribed text to the clipboard so you can paste it immediately.
epicenter/apps/whispering at main · epicenter-so/epicenter · GitHub
Whispering has official installation instructions for Windows, Mac, and Linux. For Windows, click the link to download Whispering_7.3.0_x64_en-US.msi and run the file.

When the setup wizard starts, click 'Next'.

Check the installation location and click 'Next'.

Click 'Install'.

Once the installation is complete, click 'Finish' to launch Whispering.
When it starts up, several pop-ups appear in the bottom right corner, so work through them one at a time. First, click 'Install FFmpeg'.

Whispering strongly recommends installing the audio conversion software 'FFmpeg' and displays installation instructions. Follow them to install FFmpeg.

First, open the Start menu and run the command prompt as administrator.

Paste the command 'winget install ffmpeg' from the instructions and press Enter to proceed with the installation.

Restart Whispering. If it says 'FFmpeg is Installed', the installation is complete.
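If Whispering still does not report FFmpeg as installed, it can help to check whether the 'ffmpeg' command is actually visible on your PATH. The following is a minimal sketch in Python, not something Whispering itself provides; the function names 'find_ffmpeg' and 'ffmpeg_version_line' are made up for this illustration.

```python
import shutil
import subprocess
from typing import Callable, Optional


def find_ffmpeg(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the full path to the ffmpeg executable, or None if it is not on PATH.

    The `which` parameter exists only so the lookup can be swapped out in tests.
    """
    return which("ffmpeg")


def ffmpeg_version_line(path: str) -> str:
    """Run `ffmpeg -version` and return the first line of its output."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        # A terminal opened before the install may not see the updated PATH yet.
        print("ffmpeg not found on PATH; reopen the terminal after 'winget install ffmpeg'")
    else:
        print(ffmpeg_version_line(path))
```

If the script reports that ffmpeg is missing right after installation, opening a new command prompt usually picks up the updated PATH.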

Next, click 'View Update' to update.

An update dialog appears, so click 'Download & Install' to complete the update.

Finally, click 'Configure'.

Download the model you want to use. There are four options: 'Tiny,' 'Small,' 'Medium,' and 'Large v3 Turbo.' If you just want to check that the app works, 'Small' is a good choice.

Once the download is complete, 'Activated' will be displayed.

Also scroll down on the model selection screen and change 'Output Language.' The default is Auto, but Auto does not handle Japanese well, so when recording in Japanese it is better to select 'Japanese' explicitly.

Return to the main menu and press the keyboard shortcut 'Space' (to toggle recording on/off) or 'P' (push-to-talk), then speak. The app automatically records, transcribes, and copies the transcribed text. When I read 'I Am a Cat' aloud, the accuracy of the Small model was moderate. I was unable to download the Medium or larger models, so I gave up on using anything other than the Small model.
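The difference between the two shortcut styles can be sketched as a tiny state model. This is only an illustration of toggle versus push-to-talk behavior in general; the class and method names are hypothetical and are not taken from Whispering's code.

```python
from dataclasses import dataclass


@dataclass
class Recorder:
    """Tracks whether recording is active under two shortcut styles."""

    recording: bool = False

    def toggle_pressed(self) -> None:
        # Toggle style: every press flips the state, releasing the key does nothing.
        self.recording = not self.recording

    def push_to_talk_down(self) -> None:
        # Push-to-talk style: recording runs only while the key is held down.
        self.recording = True

    def push_to_talk_up(self) -> None:
        # Releasing the key stops the recording.
        self.recording = False


if __name__ == "__main__":
    r = Recorder()
    r.toggle_pressed()      # first press starts recording
    print(r.recording)
    r.toggle_pressed()      # second press stops it
    print(r.recording)
```

With the toggle style you press once to start and once more to stop; with push-to-talk the recording ends as soon as you let go of the key.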

Whispering also accepts audio file uploads of up to 25MB.

The transcription results for both recordings and audio files are copied to the clipboard.
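Before uploading, you can check whether a file fits within the limit. The sketch below assumes that '25MB' means 25 × 1024 × 1024 bytes; the app may count in units of 1,000,000 bytes instead, so treat the constant as an assumption. The function names are hypothetical.

```python
import os

# Assumption: "25MB" is read as 25 MiB. Use 25_000_000 for a stricter check.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def fits_upload_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if a non-empty file of this size is within the limit."""
    return 0 < size_bytes <= limit


def file_fits_upload_limit(path: str) -> bool:
    """Look up the file size on disk and compare it against the limit."""
    return fits_upload_limit(os.path.getsize(path))
```

Files over the limit would need to be split or re-encoded at a lower bitrate before uploading.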

Click the gear icon to change various settings.

You can choose whether or not to copy the results to the clipboard.

There are also recording, microphone, and bitrate settings.

There are settings for API keys and similar options as well. Whispering supports API keys from OpenAI, Anthropic, Groq, Google, and ElevenLabs, and by entering an API key you can process transcription via that service instead of locally.

Other shortcuts are as follows:

A demo version is also available that you can try online without downloading anything. It is free to use, but you will need to obtain and enter a Groq API key.
Whispering
https://whispering.epicenter.so/