From a file: Click Select File, choose any audio or video file, then click Transcribe it.
From a recording: Click Record to capture from your microphone. When done, click Stop to preview, or Process to transcribe immediately.
Supported formats: MP3, MP4, M4A, WAV, WebM, AAC, OGG, FLAC, OPUS, AMR, AIFF, MOV, AVI, MKV.
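As a rough illustration of the format list above, the sketch below checks a filename's extension before upload. It is an assumption-laden helper, not the app's own logic: the names `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical, and the app may well identify files by content rather than by extension.

```python
from pathlib import Path

# Extensions copied from the supported-formats list above.
SUPPORTED_EXTENSIONS = {
    "mp3", "mp4", "m4a", "wav", "webm", "aac", "ogg",
    "flac", "opus", "amr", "aiff", "mov", "avi", "mkv",
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension appears in the supported list.

    Hypothetical pre-upload check; it only looks at the final extension
    and ignores the actual file contents.
    """
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS
```

For example, `is_supported("interview.MP3")` is true because the comparison is case-insensitive, while `is_supported("notes.txt")` is false.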
Click the Language button to open the language picker:
de, fr, ja) and click Apply. The button label updates to show the active language, so you always know what's set.
Tag Audio Events: Detects and labels non-speech sounds inline, e.g. [laughter], [applause], [music].
Diarize (Speaker ID): Identifies who is speaking and labels each segment (Speaker 1, Speaker 2, …). Useful for interviews or meetings.
Output Format:
Raw: Returns the full JSON response from the transcription engine, including word-level timestamps. Gemini post-processing is disabled in this mode.
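If you save the Raw output, the word-level timestamps can be pulled out with a few lines of Python. This is only a sketch: the field names `words`, `text`, `start` and `end` are assumptions about the JSON shape, and the sample payload is invented for illustration, so check them against an actual Raw response before relying on this.

```python
import json

# Hypothetical sample payload; the real Raw output may use different field names.
raw = """{"text": "Hello world", "words": [
  {"text": "Hello", "start": 0.0, "end": 0.42},
  {"text": "world", "start": 0.5, "end": 0.91}
]}"""


def word_timings(raw_json: str) -> list[tuple[str, float, float]]:
    """Return (word, start_seconds, end_seconds) for each word entry.

    Assumes a top-level "words" list; returns an empty list if it is absent.
    """
    data = json.loads(raw_json)
    return [(w["text"], w["start"], w["end"]) for w in data.get("words", [])]


for text, start, end in word_timings(raw):
    print(f"{start:6.2f}s to {end:6.2f}s  {text}")
```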
The purple Gemini it button (and the Gemini button during recording) transcribes your audio and immediately sends the result through your Gemini AI prompt for cleanup, formatting, or summarisation, all in one step.
The Gemini system prompt and per-language post-processing prompts can be customised in Settings → Post-Processing Prompts.
You need a Gemini API key configured in Settings for this to work.
While recording is active, three buttons replace the normal controls:
Below those, Pause / Resume lets you pause mid-recording, and Cancel discards the recording entirely.
After a transcription completes, click AI Post-Processing. This will:
Configure which AI app opens and customise the prompts (per language: EN / SK) in Settings → Post-Processing Prompts.
API Keys: Add one or more ElevenLabs API keys. The app automatically picks the key with the most remaining quota for each job. You can view per-key quota usage here.
Gemini API Key: Required for the Gemini it feature and Gemini post-processing.
Auth Codes: Create named access codes that other users can use to log in via ?auth_code=… without knowing the main password. Each code shows its last-used time (displayed in your local timezone).
Post-Processing Prompts: Customise the prompt sent with the transcript for EN, SK, and Gemini AI modes.
Debug Logging: Prints detailed server-side logs, useful for troubleshooting failed transcriptions.
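The API Keys entry above says the app picks the key with the most remaining quota for each job. One way such a selection could work is sketched below. The `ApiKey` class, its fields, and `pick_key` are hypothetical names chosen for illustration; the app's real data model and tie-breaking rules are not documented here.

```python
from dataclasses import dataclass


@dataclass
class ApiKey:
    name: str   # hypothetical label for the key
    limit: int  # hypothetical total quota for the billing period
    used: int   # hypothetical quota consumed so far

    @property
    def remaining(self) -> int:
        """Quota left on this key, never negative."""
        return max(self.limit - self.used, 0)


def pick_key(keys: list[ApiKey]) -> ApiKey:
    """Return the key with the most remaining quota.

    On a tie, max() keeps the first such key in the list.
    """
    if not keys:
        raise ValueError("no API keys configured")
    return max(keys, key=lambda k: k.remaining)


keys = [
    ApiKey("primary", limit=10_000, used=9_500),
    ApiKey("backup", limit=10_000, used=2_000),
]
print(pick_key(keys).name)
```

With these sample numbers the backup key has 8,000 units left against 500 for the primary, so it is the one selected.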
Auto-record on load: Append ?rec_now to the URL, and the app starts recording the moment the page opens.
Auto-login: Append ?auth_code=YOUR_CODE to log in automatically using an auth code.
Combine both: /?auth_code=YOUR_CODE&rec_now logs in and starts recording instantly. Perfect for a one-tap recording shortcut on mobile.
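If you generate these shortcut links for several users, a small helper can assemble them. The sketch below assumes only what is described above: an `auth_code` query parameter and a bare `rec_now` flag with no value. The function name `shortcut_url` and the example host are hypothetical.

```python
from typing import Optional
from urllib.parse import quote


def shortcut_url(base_url: str, auth_code: Optional[str] = None,
                 rec_now: bool = False) -> str:
    """Build a shortcut link with optional auto-login and auto-record."""
    params = []
    if auth_code is not None:
        # Percent-encode the code so characters like '&' or spaces are safe.
        params.append("auth_code=" + quote(auth_code, safe=""))
    if rec_now:
        # rec_now is a bare flag: it carries no '=value' part.
        params.append("rec_now")
    root = base_url.rstrip("/") + "/"
    return root + ("?" + "&".join(params) if params else "")


# example.com is a placeholder host, not the app's real address.
print(shortcut_url("https://example.com", "YOUR_CODE", rec_now=True))
```

The flag is appended by hand rather than through `urllib.parse.urlencode`, because `urlencode` would emit `rec_now=` with a trailing equals sign, which differs from the form shown above.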
Best accuracy: Clear audio with minimal background noise gives the best results. Use a headset or get close to the microphone when recording.
Forcing the language: If you know the language in advance, selecting it instead of Auto-detect is faster and often more accurate.
Dark / Light mode: Toggle with the moon/sun icon in the navigation bar. Your preference is saved across sessions.
Mobile home screen: Add this page to your Home Screen (iOS: Share → Add to Home Screen) for a full-screen, app-like experience with no browser chrome.