Convert Audio to Text with Google Cloud Speech API

The Online Dictation app uses the HTML5 Speech Recognition API to transcribe your voice into digital text. If you have a pre-recorded audio file, you can turn on speech recognition inside Dictation, play the audio file and get the speech as text.

Google offers a Cloud Speech API for developers to convert audio to text. You can upload the audio file in FLAC format to Google Cloud Storage and the Speech API will transcribe the audio to text. If you have audio in MP3 format, use the FFmpeg tool to convert it to FLAC first.
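As a rough sketch of the conversion step, the helper below builds the argument list for an FFmpeg call that turns an MP3 into mono FLAC at 16 kHz. The function name `buildFfmpegArgs` and the file names are hypothetical. The mono and 16000 Hz choices are assumptions made to match the sample rate used in the requests in this post, so adjust them for your own audio.

```javascript
// Hypothetical helper: returns the arguments for an ffmpeg conversion.
// It only builds the list; it does not run ffmpeg itself.
function buildFfmpegArgs(inputPath, outputPath, sampleRateHertz) {
  return [
    '-i', inputPath,                          // source audio (MP3, WAV, ...)
    '-ac', '1',                               // downmix to a single channel
    '-ar', String(sampleRateHertz || 16000),  // resample (defaults to 16 kHz)
    outputPath                                // a .flac extension selects the FLAC encoder
  ];
}

// Print the equivalent command line.
console.log(['ffmpeg'].concat(buildFfmpegArgs('audio.mp3', 'audio.flac')).join(' '));
// prints "ffmpeg -i audio.mp3 -ac 1 -ar 16000 audio.flac"
```

You can paste the printed command into a terminal, or, if you are working in Node.js rather than Apps Script, pass the array to `child_process.execFile('ffmpeg', args, callback)`.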

Also see: Cloud Speech API with Google Service Account

In this example, we upload the .flac audio file to Google Drive (for those who don’t have Google Cloud Storage) and call the Cloud Speech API via the UrlFetchApp service. You need to enable billing in your Google Cloud console, enable the Speech API and also set up an API key or a service account.

/*

Written by Amit Agarwal
email: amit@labnol.org
web: https://digitalinspiration.com
twitter: @labnol

*/

function convertAudioToText(flacFile, languageCode) {
  var file = DriveApp.getFilesByName(flacFile).next();
  var bytes = file.getBlob().getBytes();

  var payload = {
    config: {
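      // The file in Drive is FLAC, so the encoding must say so
      encoding: 'FLAC',
      // The v1 endpoint expects sampleRateHertz; it should match the audio file
      sampleRateHertz: 16000,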
      languageCode: languageCode || 'en-US'
    },
    audio: {
      // You may also upload the audio file to Google
      // Cloud Storage and pass the object URL here
      content: Utilities.base64Encode(bytes)
    }
  };

  // Replace XYZ with your Cloud Speech API key
  var response = UrlFetchApp.fetch('https://speech.googleapis.com/v1/speech:recognize?key=XYZ', {
    method: 'POST',
    contentType: 'application/json',
    payload: JSON.stringify(payload),
    muteHttpExceptions: true
  });

  Logger.log(response.getContentText());
}
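The function above only logs the raw JSON. As a minimal sketch of the next step, the hypothetical helper `extractTranscript` below pulls the text out of a response, assuming the usual v1 shape where each entry in `results` has an `alternatives` array with the most likely transcript first. The sample response is made up for illustration.

```javascript
// Hypothetical helper: joins the top transcript of every result into one string.
function extractTranscript(responseText) {
  var data = JSON.parse(responseText);

  // Google APIs report failures as { error: { code, message, status } }
  if (data.error) {
    throw new Error('Speech API error: ' + data.error.message);
  }

  return (data.results || [])
    .map(function (result) {
      var best = (result.alternatives || [])[0]; // first alternative is the top guess
      return best && best.transcript ? best.transcript.trim() : '';
    })
    .filter(function (text) {
      return text.length > 0;
    })
    .join(' ');
}

// Made-up response used only to demonstrate the parsing
var sample = JSON.stringify({
  results: [
    { alternatives: [{ transcript: 'hello world', confidence: 0.98 }] },
    { alternatives: [{ transcript: ' how are you', confidence: 0.95 }] }
  ]
});
console.log(extractTranscript(sample)); // prints "hello world how are you"
```

Inside Apps Script you could call it as `extractTranscript(response.getContentText())`. Because the request sets `muteHttpExceptions: true`, API errors arrive as a JSON body rather than an exception, which is why the helper checks for the `error` field itself.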

Here’s another example that uses the curl command-line tool to send speech recognition requests from the terminal.

curl --silent --header "Content-Type: application/json" \
  "https://speech.googleapis.com/v1/speech:recognize?key=XYZ" \
  --data @payload.json

// Content of payload.json
  {
    "config": {
        "encoding": "FLAC",
        "sampleRateHertz": 16000,
        "languageCode": "en-US"
    },
    "audio": {
        "uri":"gs://ctrlq.org/audio.flac"
    }
  }
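The two examples differ mainly in how the audio is supplied: inline base64 `content` for the Drive file, or a `gs://` `uri` for a file in Cloud Storage. The sketch below shows one way to build either payload from a single function. The name `buildRecognizePayload` is hypothetical, and it assumes FLAC audio and the v1 field name `sampleRateHertz`.

```javascript
// Hypothetical helper: builds a speech:recognize request body.
// audioSource is either a gs:// object URL or base64-encoded audio bytes.
function buildRecognizePayload(audioSource, languageCode, sampleRateHertz) {
  var isCloudStorageUri = audioSource.indexOf('gs://') === 0;

  return {
    config: {
      encoding: 'FLAC',                          // assumption: the audio is FLAC
      sampleRateHertz: sampleRateHertz || 16000, // should match the audio file
      languageCode: languageCode || 'en-US'
    },
    // Use "uri" for Cloud Storage objects, "content" for inline base64 audio
    audio: isCloudStorageUri ? { uri: audioSource } : { content: audioSource }
  };
}

// Same request body as payload.json above
console.log(JSON.stringify(buildRecognizePayload('gs://ctrlq.org/audio.flac'), null, 2));
```

In Apps Script, the inline variant would be `buildRecognizePayload(Utilities.base64Encode(bytes), 'en-US')`, with the result passed through `JSON.stringify` as the `payload` of the `UrlFetchApp.fetch` call.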

Amit Agarwal is a web geek and solo entrepreneur who loves making things on the Internet. Google recently awarded him the Google Developer Expert and Google Cloud Champion titles for his work on Google Workspace and Google Apps Script.
