Prompt Modes

This package exposes three prompt modes for Granite Speech Plus: `asr`, `saa`, and `ts`. Select one with the `prompt_mode` argument of `transcribe`.

`asr`

Standard speech transcription.

from granite_speech_plus_mlx import GraniteSpeechPlusPipeline

pipe = GraniteSpeechPlusPipeline.from_pretrained()
text = pipe.transcribe("audio.wav", prompt_mode="asr")

`saa`

Speaker-attributed ASR. The prompt asks the model to add speaker turn labels such as `[Speaker 1]:` and `[Speaker 2]:`.

text = pipe.transcribe("meeting.wav", prompt_mode="saa")
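
The returned string can be split into per-speaker turns afterwards. The sketch below is a minimal post-processing helper, not part of the package: `split_speaker_turns` is a hypothetical name, and the regular expression assumes the labels look exactly like the `[Speaker N]:` examples above. The model's actual output may vary, so treat the pattern as a starting point.

```python
import re

# Assumed label format, taken from the example above: "[Speaker N]:"
SPEAKER_TAG = re.compile(r"\[Speaker (\d+)\]:")

def split_speaker_turns(text):
    """Return (speaker_number, utterance) pairs from an saa-style transcript."""
    matches = list(SPEAKER_TAG.finditer(text))
    turns = []
    for i, match in enumerate(matches):
        # A turn runs from the end of its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        turns.append((int(match.group(1)), text[match.end():end].strip()))
    return turns

sample = "[Speaker 1]: Hello there. [Speaker 2]: Hi, how are you?"
for speaker, utterance in split_speaker_turns(sample):
    print(speaker, utterance)
```

Any text that appears before the first label is ignored by this helper.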

`ts`

Word-level timestamps. The prompt asks the model to append a centisecond tag (hundredths of a second) after each word, for example `hello [T:45] world [T:82]`.

text = pipe.transcribe("clip.wav", prompt_mode="ts")
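
To work with the timestamps as numbers, the tags can be parsed out of the returned string. This is a minimal sketch that assumes the output follows the `word [T:NN]` pattern shown above; `parse_word_timestamps` is a hypothetical helper, not part of the package, and words without a tag are skipped.

```python
import re

# Assumed tag format, taken from the example above: a word followed by "[T:<centiseconds>]"
WORD_TAG = re.compile(r"(\S+)\s*\[T:(\d+)\]")

def parse_word_timestamps(text):
    """Return (word, seconds) pairs; centiseconds are converted to seconds."""
    return [(word, int(cs) / 100) for word, cs in WORD_TAG.findall(text)]

sample = "hello [T:45] world [T:82]"
print(parse_word_timestamps(sample))
```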

For long audio, the pipeline chunks the waveform and feeds a short previous transcript prefix into later chunks for continuity. The prefix is context only; the model is instructed not to repeat it.
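
The idea can be illustrated with a small sketch. This is not the package's implementation: the fixed-size chunking, the `prefix_words` limit, and the `transcribe_chunk` callback are all assumptions made for illustration. The callback stands in for whatever model call receives the audio chunk plus the context prefix.

```python
def chunk_with_prefix(samples, chunk_size, transcribe_chunk, prefix_words=8):
    """Transcribe fixed-size chunks, passing the tail of the transcript so far as context.

    transcribe_chunk(chunk, prefix) is a hypothetical callback that returns
    the text for `chunk` only, without repeating `prefix`.
    """
    pieces = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        # Last few words transcribed so far; empty string for the first chunk.
        prefix = " ".join(" ".join(pieces).split()[-prefix_words:])
        pieces.append(transcribe_chunk(chunk, prefix))
    return " ".join(pieces)

# Stub transcriber that records the prefix it was given for each chunk.
seen_prefixes = []

def fake_transcribe(chunk, prefix):
    seen_prefixes.append(prefix)
    return f"w{len(seen_prefixes)}"

result = chunk_with_prefix(list(range(10)), 4, fake_transcribe, prefix_words=2)
print(result)
print(seen_prefixes)
```

Keeping the prefix short limits how much context each chunk carries while still giving the model the preceding words.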
