Using Whisper API to Generate .SRT Transcripts?

Quick Fix: In the Python code, specify response_format="srt" as a named parameter to the openai.Audio.transcribe() function to generate a .SRT transcript.

The Problem:

Given an audio file, how can I use the Whisper API to generate an SRT transcript, so that I get speech-to-text without having to run the model locally?

The Solutions:

Solution 1: Generating SRT Transcripts Using OpenAI’s Whisper API

OpenAI’s Whisper API can return a transcript directly in SRT (SubRip Text) format, the plain-text format commonly used for subtitles. To get it, set the `response_format` parameter to `"srt"` when creating the transcription. The example uses the pre-1.0 `openai` Python library, where the call is `openai.Audio.transcribe()`:

import openai

openai.api_key = "YOUR_API_KEY"

# Path to the audio file to transcribe
audio_file_path = "path/to/audio_file.wav"

# Open the file in binary mode: the API expects a file object, not a path string
with open(audio_file_path, "rb") as audio_file:
    # With response_format="srt", the call returns the transcript as a plain string
    transcript_text = openai.Audio.transcribe(
        model="whisper-1",
        file=audio_file,
        response_format="srt",
    )

# Save the transcript as a subtitle file
with open("transcript.srt", "w", encoding="utf-8") as srt_file:
    srt_file.write(transcript_text)

# Do something else with the transcript text, e.g., display it
print(transcript_text)

The audio file is opened in binary mode and passed to the API as a file object rather than as a path string. The model name for the hosted API is `whisper-1`. With `response_format="srt"`, the call returns the transcript as a plain string that is already formatted as SRT, so it can be written straight to a `.srt` file; no temporary file, separate audio format argument, or output file argument is involved. Finally, the transcript is printed as an example of how to use the result.
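Because the returned SRT is plain text, it can also be split into individual cues for further processing. The following is a minimal sketch, not part of the OpenAI library: the `Cue` class and `parse_srt` function are hypothetical helpers, and the sample string is made-up data in the standard SRT layout (index line, timing line, then text, with a blank line between cues).

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """One subtitle entry (hypothetical helper type)."""
    index: int
    start: str  # "HH:MM:SS,mmm"
    end: str    # "HH:MM:SS,mmm"
    text: str


# Matches an SRT timing line such as "00:00:00,000 --> 00:00:02,500"
_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SRT text into cues, skipping blocks that are not well formed."""
    cues = []
    # Cues are separated by one or more blank lines
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or not lines[0].strip().isdigit():
            continue
        match = _TIMING.match(lines[1].strip())
        if not match:
            continue
        cues.append(
            Cue(
                index=int(lines[0].strip()),
                start=match.group(1),
                end=match.group(2),
                text="\n".join(lines[2:]),
            )
        )
    return cues


# Made-up sample in SRT layout, standing in for transcript_text
sample = (
    "1\n00:00:00,000 --> 00:00:02,500\nHello and welcome.\n\n"
    "2\n00:00:02,500 --> 00:00:05,000\nThis is a test.\n"
)
cues = parse_srt(sample)
for cue in cues:
    print(cue.index, cue.start, cue.end, cue.text)
```

In practice you would call `parse_srt(transcript_text)` on the string returned by the API instead of on the sample.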

Q&A

Can the Whisper API generate .SRT transcripts without running the model locally?

Yes, the Whisper API supports SRT output. Set `response_format` to `"srt"`.

How do I pass the response_format parameter in an API call using Python?

Pass `response_format` as a keyword argument in the `openai.Audio.transcribe()` function call, for example `response_format="srt"`.
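As a sketch of how the keyword argument can be wired into a reusable function, the hypothetical helper below takes the transcription function as a parameter and writes the result to disk. The name `save_srt` and its signature are assumptions made for illustration; only the keyword arguments `model`, `file`, and `response_format` come from the call discussed above.

```python
from pathlib import Path
from typing import Callable


def save_srt(audio_path: str, srt_path: str, transcribe: Callable[..., str]) -> str:
    """Transcribe audio_path with the given callable and write the SRT text to srt_path.

    `transcribe` is expected to accept model, file, and response_format as
    keyword arguments and to return the SRT transcript as a string.
    """
    # The audio must be passed as a binary file object, not as a path string
    with open(audio_path, "rb") as audio_file:
        srt_text = transcribe(
            model="whisper-1",
            file=audio_file,
            response_format="srt",  # passed by name, as described above
        )
    Path(srt_path).write_text(srt_text, encoding="utf-8")
    return srt_text
```

With the pre-1.0 `openai` library, this would be called as `save_srt("path/to/audio_file.wav", "transcript.srt", openai.Audio.transcribe)`. Passing the function in, rather than calling it directly inside the helper, also lets you substitute a stub when testing without network access.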

Video Explanation:

The following video, titled "I Made an App that Accurately Produces Subtitles Using Whisper ...", provides additional insights and in-depth exploration related to the topics discussed in this post.