Transcribe audio to text (Whisper)

Whisper is a speech recognition model that converts audio to text.

  • You can upload your audio in a few ways:

    • From the Internet: Provide a direct link (URL) to the audio file.

    • From Your Computer: Upload the audio file directly from your device.

  • File String: Upload the audio in string format if needed.

  • Priority Queue: Set the order in which your request is processed.

  • File URL: Enter the direct URL of the audio file.

  • Group Segments: Decide if you want to group transcription segments together.

  • Transcript Output Format:

    • Words: Get a plain text transcript containing only words.

    • Segments: Get a transcript with timestamps for creating subtitles.

  • Num Speakers: Specify the number of speakers, or let Whisper automatically detect them.

  • Language: Select the language of the audio file, or let Whisper automatically detect it for you.

  • Prompt: Add any specific words or phrases used in your audio to make the transcription more accurate.

  • Offset Seconds: Set a starting point if you need the transcription to begin at a specific point in the audio.
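If you choose the Segments output format, you may want to turn the result into a subtitle file. The sketch below shows one way to do that in Python. It assumes each segment is a dictionary with `start` and `end` times in seconds and a `text` field. That shape is an assumption for illustration, not something this page specifies, so check the actual output of your transcription first. The `format_timestamp` and `segments_to_srt` helpers are hypothetical names, and the `offset_seconds` argument is only useful if your timestamps turn out to be relative to the Offset Seconds starting point.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments, offset_seconds: float = 0.0) -> str:
    """Build SRT subtitle text from a list of segments.

    Assumed segment shape (hypothetical): {"start": float, "end": float, "text": str}.
    offset_seconds is added to every timestamp, for the case where the
    timestamps are relative to the point where transcription started.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"] + offset_seconds)
        end = format_timestamp(seg["end"] + offset_seconds)
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    example = [
        {"start": 0.0, "end": 2.5, "text": " Hello and welcome. "},
        {"start": 2.5, "end": 5.0, "text": "Let's get started."},
    ]
    print(segments_to_srt(example))
```

Save the returned text to a file with the `.srt` extension to load it in most video players.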
