API

Client
Response Handlers
Transcription Model

Note: All interfaces not documented here are considered to be private.

Client

class amazon_transcribe.client.TranscribeStreamingClient(*, region, endpoint_resolver=None, credential_resolver=None)

High-level client for orchestrating the setup and transmission of audio streams to the Amazon Transcribe Streaming service.

Parameters

region (str) – An AWS region to use for Amazon Transcribe (e.g. us-east-2)
endpoint_resolver (Optional[BaseEndpointResolver]) – Optional resolver for client endpoints.
credential_resolver (Optional[CredentialResolver]) – Optional credential resolver for client.

async start_stream_transcription(*, language_code, media_sample_rate_hz, media_encoding, vocabulary_name=None, session_id=None, vocab_filter_method=None)

Coordinate transcription settings and start the stream.

Pay careful attention to the language_code and media_sample_rate_hz settings. Incorrect values may cause the stream to hang indefinitely. For the constraints on these settings, see https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html

Parameters

language_code (str) – Indicates the source language used in the input audio stream.
media_sample_rate_hz (int) – The sample rate, in Hertz, of the input audio. We suggest that you use 8000 Hz for low quality audio and 16000 Hz for high quality audio.
media_encoding (str) – The encoding used for the input audio.
vocabulary_name (Optional[str]) – The name of the vocabulary to use when processing the transcription job.
session_id (Optional[str]) – An identifier for the transcription session. Use this parameter when you want to retry a session. If you don’t provide a session ID, Amazon Transcribe will generate one for you and return it in the response.
vocab_filter_method (Optional[str]) – The manner in which you use your vocabulary filter to filter words in your transcript. See Transcribe Streaming API docs for more info.

Return type

StartStreamTranscriptionEventStream
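
A minimal sketch of how the client and start_stream_transcription might be used together, assuming the SDK is installed and AWS credentials are available. The names chunk_audio and stream_audio are hypothetical helpers, the values en-US, pcm, and the 16 KiB chunk size are illustrative, and the end_stream() call follows the project's README rather than this reference.

```python
from typing import Iterator


def chunk_audio(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield successive chunk_size-byte slices of data (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


async def stream_audio(data: bytes, region: str = "us-east-2"):
    """Start a transcription stream and send data to it in chunks."""
    # Imported here so chunk_audio stays usable without the SDK installed.
    from amazon_transcribe.client import TranscribeStreamingClient

    client = TranscribeStreamingClient(region=region)
    stream = await client.start_stream_transcription(
        language_code="en-US",       # illustrative value
        media_sample_rate_hz=16000,  # must match the actual audio
        media_encoding="pcm",        # illustrative value
    )
    for chunk in chunk_audio(data, 16 * 1024):
        await stream.input_stream.send_audio_event(audio_chunk=chunk)
    # Assumption: end_stream() comes from the project's README,
    # it is not documented in this reference.
    await stream.input_stream.end_stream()
    return stream
```

The coroutine could be driven with asyncio.run(stream_audio(data)). In practice the output stream would probably be consumed concurrently with the sending loop, for example with asyncio.gather, so that results are read while audio is still being sent.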

Response Handlers

class amazon_transcribe.handlers.TranscriptResultStreamHandler(transcript_result_stream)

Parameters: transcript_result_stream (TranscriptResultStream) –

async handle_events()

Process generic incoming events from Amazon Transcribe and delegate to appropriate sub-handlers.

async handle_transcript_event(transcript_event)

Specific handling for TranscriptEvent responses from Amazon Transcribe.

This should be implemented by the end user with desired data handling.

Parameters: transcript_event (TranscriptEvent) –
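
Since handle_transcript_event is left to the end user, one possible implementation is sketched below. CollectingHandler is a hypothetical subclass, the fallback base class is a stand-in used only when the SDK is not installed, and fake_event is a hand-built object shaped like a TranscriptEvent rather than real service output.

```python
import asyncio
from types import SimpleNamespace

try:
    from amazon_transcribe.handlers import TranscriptResultStreamHandler
except ImportError:
    # Stand-in (not part of the SDK) so the sketch runs without it installed.
    class TranscriptResultStreamHandler:
        def __init__(self, transcript_result_stream):
            self._transcript_result_stream = transcript_result_stream


class CollectingHandler(TranscriptResultStreamHandler):
    """Hypothetical subclass that stores every alternative transcript."""

    def __init__(self, transcript_result_stream):
        super().__init__(transcript_result_stream)
        self.transcripts = []

    async def handle_transcript_event(self, transcript_event):
        for result in transcript_event.transcript.results:
            # alternatives is documented as optional, so guard against None.
            for alternative in result.alternatives or []:
                self.transcripts.append(alternative.transcript)


# Exercise the handler with a hand-built event shaped like TranscriptEvent.
fake_event = SimpleNamespace(
    transcript=SimpleNamespace(
        results=[
            SimpleNamespace(
                alternatives=[SimpleNamespace(transcript="hello world")]
            )
        ]
    )
)
handler = CollectingHandler(None)
asyncio.run(handler.handle_transcript_event(fake_event))
print(handler.transcripts)  # ['hello world']
```

With a live stream, the handler would instead be constructed from the output stream of the object returned by start_stream_transcription, and handle_events() would be awaited to dispatch incoming events to this method.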

Transcription Model

class amazon_transcribe.model.Alternative(transcript, items)

Bases: object

A list of possible transcriptions for the audio.

Parameters

transcript – The text that was transcribed from the audio.
items – One or more alternative interpretations of the input audio.

class amazon_transcribe.model.AudioEvent(audio_chunk)

Bases: amazon_transcribe.eventstream.BaseEvent

Provides a wrapper for the audio chunks that you are sending.

Parameters: audio_chunk (Optional[bytes]) – A blob of audio from your application. Your audio stream consists of one or more audio events.

property audio_chunk

class amazon_transcribe.model.AudioStream(input_stream=None, event_serializer=None, eventstream_serializer=None, event_signer=None, initial_signature=None, credential_resolver=None)

Bases: amazon_transcribe.eventstream.BaseStream

Input audio stream for transcription stream request.

This should never be instantiated by the end user. It will be returned from the client within a relevant wrapper object.

async send_audio_event(audio_chunk)

Enqueue audio bytes to be sent for transcription.

Parameters: audio_chunk (Optional[bytes]) – byte-string chunk of audio input.
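
How much audio a chunk carries depends on the encoding. As a worked example, assuming uncompressed 16-bit mono PCM (an assumption, since this reference does not fix the sample width or channel count), the byte size of a chunk covering a given duration follows from the sample rate. The name pcm_chunk_bytes is a hypothetical helper.

```python
def pcm_chunk_bytes(sample_rate_hz: int, duration_ms: int,
                    bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Bytes needed to hold duration_ms of uncompressed PCM audio."""
    # Assumed defaults: 2 bytes per sample (16-bit) and a single channel.
    samples = sample_rate_hz * duration_ms // 1000
    return samples * bytes_per_sample * channels


print(pcm_chunk_bytes(16000, 100))  # 3200
print(pcm_chunk_bytes(8000, 100))   # 1600
```

Under these assumptions, 100 ms of 16000 Hz audio is 1600 samples, or 3200 bytes per call to send_audio_event.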

class amazon_transcribe.model.Item(start_time=None, end_time=None, item_type=None, content=None, vocabulary_filter_match=None)

Bases: object

A word or phrase transcribed from the input audio.

Parameters

start_time (Optional[float]) – The offset from the beginning of the audio stream to the beginning of the audio that resulted in the item.
end_time (Optional[float]) – The offset from the beginning of the audio stream to the end of the audio that resulted in the item.
item_type (Optional[str]) – The type of the item.
content (Optional[str]) – The word or punctuation that was recognized in the input audio.
vocabulary_filter_match (Optional[bool]) – Indicates whether a word in the item matches a word in the vocabulary filter you’ve chosen for your real-time stream. If True then a word in the item matches your vocabulary filter.
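
As an illustration of the Item fields, the sketch below formats items as timestamped lines. format_items is a hypothetical helper, the sample objects are hand-built stand-ins rather than SDK instances, and the offsets are assumed to be in seconds, as stated for Result.

```python
from types import SimpleNamespace


def format_items(items) -> list:
    """Render each item as '[start-end] content', flagging filter matches."""
    lines = []
    for item in items:
        if item.start_time is None or item.end_time is None:
            line = f"{item.content}"
        else:
            line = f"[{item.start_time:.2f}-{item.end_time:.2f}] {item.content}"
        if item.vocabulary_filter_match:
            line += " (filter match)"
        lines.append(line)
    return lines


# Hand-built stand-ins carrying the documented Item attributes.
items = [
    SimpleNamespace(start_time=0.0, end_time=0.42, content="hello",
                    vocabulary_filter_match=False),
    SimpleNamespace(start_time=0.5, end_time=0.91, content="world",
                    vocabulary_filter_match=True),
    SimpleNamespace(start_time=None, end_time=None, content=".",
                    vocabulary_filter_match=None),
]
for line in format_items(items):
    print(line)
# [0.00-0.42] hello
# [0.50-0.91] world (filter match)
# .
```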

class amazon_transcribe.model.Result(result_id=None, start_time=None, end_time=None, is_partial=None, alternatives=None)

Bases: object

The result of transcribing a portion of the input audio stream.

Parameters

result_id (Optional[str]) – A unique identifier for the result.
start_time (Optional[float]) – The offset in seconds from the beginning of the audio stream to the beginning of the result.
end_time (Optional[float]) – The offset in seconds from the beginning of the audio stream to the end of the result.
is_partial (Optional[bool]) – Amazon Transcribe divides the incoming audio stream into segments at natural points in the audio. Transcription results are returned based on these segments. True indicates that Amazon Transcribe has additional transcription data to send; False indicates that this is the last transcription result for the segment.
alternatives (Optional[List[Alternative]]) – A list of possible transcriptions for the audio. Each alternative typically contains one Item that contains the result of the transcription.
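
Because partial results are superseded by later ones for the same segment, a consumer may want to keep only the final text. The sketch below shows one way to do that. final_transcripts is a hypothetical helper, and the sample results are hand-built stand-ins rather than SDK objects.

```python
from types import SimpleNamespace


def final_transcripts(results) -> list:
    """Return the first alternative's transcript for each final result."""
    texts = []
    for result in results:
        # Only an explicit False marks the last result for a segment;
        # True and None (unknown) are both skipped here.
        if result.is_partial is not False:
            continue
        alternatives = result.alternatives or []
        if alternatives:
            texts.append(alternatives[0].transcript)
    return texts


results = [
    SimpleNamespace(result_id="r1", is_partial=True,
                    alternatives=[SimpleNamespace(transcript="hello")]),
    SimpleNamespace(result_id="r1", is_partial=False,
                    alternatives=[SimpleNamespace(transcript="hello world")]),
    SimpleNamespace(result_id="r2", is_partial=False, alternatives=None),
]
print(final_transcripts(results))  # ['hello world']
```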

class amazon_transcribe.model.StartStreamTranscriptionEventStream(audio_stream, response)

Bases: object

Event stream wrapper containing both input and output interfaces to Amazon Transcribe. This should only be created by the client.

Parameters

audio_stream (AudioStream) – Audio input stream generated by client for new transcription requests.
response – Response object from Amazon Transcribe.

property input_stream

Audio stream to Amazon Transcribe that takes input audio.

Return type: AudioStream

property output_stream

Response stream containing transcribed event output.

Return type: TranscriptResultStream

property response

Response object from Amazon Transcribe containing metadata and response output stream.

Return type: StartStreamTranscriptionResponse

class amazon_transcribe.model.StartStreamTranscriptionRequest(language_code=None, media_sample_rate_hz=None, media_encoding=None, vocabulary_name=None, session_id=None, vocab_filter_method=None)

Bases: object

Transcription Request

Parameters

language_code – Indicates the source language used in the input audio stream.
media_sample_rate_hz – The sample rate, in Hertz, of the input audio. We suggest that you use 8000 Hz for low quality audio and 16000 Hz for high quality audio.
media_encoding – The encoding used for the input audio.
vocabulary_name – The name of the vocabulary to use when processing the transcription job.
session_id – An identifier for the transcription session. Use this parameter when you want to retry a session. If you don’t provide a session ID, Amazon Transcribe will generate one for you and return it in the response.
vocab_filter_method – The manner in which you use your vocabulary filter to filter words in your transcript.

class amazon_transcribe.model.StartStreamTranscriptionResponse(transcript_result_stream, request_id=None, language_code=None, media_sample_rate_hz=None, media_encoding=None, vocabulary_name=None, session_id=None, vocab_filter_name=None, vocab_filter_method=None)

Bases: object

Transcription Response

Parameters

transcript_result_stream – Represents the stream of transcription events from Amazon Transcribe to your application.
request_id – An identifier for the streaming transcription.
language_code – Indicates the source language used in the input audio stream.
media_sample_rate_hz – The sample rate, in Hertz, of the input audio. We suggest that you use 8000 Hz for low quality audio and 16000 Hz for high quality audio.
media_encoding – The encoding used for the input audio.
vocabulary_name – The name of the vocabulary used when processing the transcription job.
session_id – An identifier for the transcription session. Use this parameter when you want to retry a session. If you don’t provide a session ID, Amazon Transcribe will generate one for you and return it in the response.
vocab_filter_name – The name of the vocabulary filter used in your real-time stream.
vocab_filter_method – The manner in which you use your vocabulary filter to filter words in your transcript.
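
Since the response carries the session ID, a retry could reuse it along with the original settings. The sketch below assumes that the constructor parameters listed above are exposed as attributes of the same names, which this reference does not state explicitly. retry_settings is a hypothetical helper, and the sample response is a hand-built stand-in with an illustrative session ID.

```python
from types import SimpleNamespace


def retry_settings(response) -> dict:
    """Build keyword arguments for restarting a stream with the same session."""
    # Assumption: the response exposes these fields as plain attributes.
    return {
        "language_code": response.language_code,
        "media_sample_rate_hz": response.media_sample_rate_hz,
        "media_encoding": response.media_encoding,
        "session_id": response.session_id,
    }


# Hand-built stand-in, the session ID is an illustrative value.
response = SimpleNamespace(
    language_code="en-US",
    media_sample_rate_hz=16000,
    media_encoding="pcm",
    session_id="example-session-id",
)
print(retry_settings(response)["session_id"])  # example-session-id
```

The resulting dictionary could then be unpacked into a new call, as in start_stream_transcription(**retry_settings(stream.response)).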

class amazon_transcribe.model.Transcript(results)

Bases: object

The transcription in a TranscriptEvent.

Parameters: results (List[Result]) – Result objects that contain the results of transcribing a portion of the input audio stream. The array can be empty.

class amazon_transcribe.model.TranscriptEvent(transcript)

Bases: amazon_transcribe.eventstream.BaseEvent

Represents a set of transcription results from the server to the client. It contains one or more segments of the transcription.

Parameters: transcript (Transcript) – The transcription of the audio stream. The transcription is composed of all of the items in the results list.

class amazon_transcribe.model.TranscriptResultStream(raw_stream, parser)

Bases: amazon_transcribe.eventstream.EventStream

Transcription result stream containing returned TranscriptEvent output.

Results are surfaced through the async iterator interface (that is, with async for).

Raises

BadRequestException – A client error occurred when the stream was created. Check the parameters of the request and try your request again.
LimitExceededException – Your client has exceeded one of the Amazon Transcribe limits, typically the limit on audio length. Break your audio stream into smaller chunks and try your request again.
InternalFailureException – A problem occurred while processing the audio. Amazon Transcribe terminated processing.
ConflictException – A new stream started with the same session ID. The current stream has been terminated.
ServiceUnavailableException – Service is currently unavailable. Try your request later.
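
The async iterator interface and the way errors surface from it can be sketched as follows. collect_events is a hypothetical helper, fake_stream and SimulatedFailure are stand-ins for a live stream and a service error, and the broad except clause is used only because this reference does not give the import path of the exception classes listed above.

```python
import asyncio
from types import SimpleNamespace


class SimulatedFailure(Exception):
    """Stand-in for one of the service exceptions listed above."""


async def collect_events(output_stream):
    """Drain an async-iterable stream and return (events, error_name)."""
    events = []
    try:
        async for event in output_stream:
            events.append(event)
    except Exception as exc:
        # Errors raised by the stream propagate out of the async for loop.
        return events, type(exc).__name__
    return events, None


async def fake_stream():
    """Yield one hand-built event, then fail the way a live stream might."""
    yield SimpleNamespace(transcript=SimpleNamespace(results=[]))
    raise SimulatedFailure("simulated service failure")


events, error = asyncio.run(collect_events(fake_stream()))
print(len(events), error)  # 1 SimulatedFailure
```

In real code, catching the specific exception classes would be preferable to catching Exception, so that each failure mode can be handled according to its description.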