Skip to content

API Reference

Details on API request and response protocols for the ASR, MT, TTS and Video Interpolation services have been provided here.

ASR (Speech-to-Text)

The endpoint for ASR service is https://asr.iitm.ac.in/internal/asr/decode

Request keys

Key Description
file the media file to be transcribed
language the language of the source audio/video in all lowercase (eg: hindi, english)
vtt (optional) whether a webVTT caption file has to be generated. This is an optional value. It accepts two string values either true or false. By default, this is false. This can be used for captioning purposes.

Response keys

Upon successful service of the request, the API returns a JSON response with the following keys:

Key Description
status success
time_taken time taken to transcribe the given audio/video in seconds
transcript the transcription of the given speech input, as infered by the deployed model
vtt WebVTT caption if it was requested. i.e, if the vtt key was set to true in the request, a WebVTT caption would be returned.

In case of a service failure, the API returns a JSON response with the following keys:

Key Description
status failure
reason a reason for the failure in serving the request

Supported Languages

  • Bengali
  • English
  • Gujarati
  • Hindi
  • Kannada
  • Malayalam
  • Marathi
  • Odia
  • Punjabi
  • Sanskrit
  • Tamil
  • Telugu
  • Urdu

Usage

import requests

files = {
'file': open('speech_sample1.mp3', 'rb'),
'language': (None, 'hindi'),
'vtt': (None, 'true'),
}

response = requests.post('https://asr.iitm.ac.in/internal/asr/decode', files=files)
print(response.json())
curl -v -X POST -F 'file=@speech_sample2.wav' -F 'language=english' \
-F 'vtt=true' https://asr.iitm.ac.in/internal/asr/decode

Sample audio files to test the API: english speech, tamil speech.

The ASR API accepts media files from most of the common formats such as .mp3, .mp4, .wav, .ogg etc.

Web Demo interface available at https://asr.iitm.ac.in/demo/asr


Last update: August 2024
Created: March 24, 2023