How to Add Speech Recognition to your Website

Open the Google website on your desktop computer and you’ll find a little microphone icon embedded inside the search box. Click the icon, say something and your voice is quickly transcribed into words. Unlike earlier speech recognition products, you no longer have to train the browser to understand your speech and, for those who don’t know touch typing, speech is often a faster mode of input than the keyboard.

Sounds like magic, right? Did you know that you can add similar speech recognition capabilities to your own website with a few lines of code? Visitors can search your website, or even fill in forms, using just their voice. Google Chrome supports the speech recognition API through the prefixed webkitSpeechRecognition interface used in the code on this page; Firefox does not enable the API by default.

Before we dive into the actual implementation, let’s play with a working demo. If you are viewing this page inside Google Chrome (desktop or mobile), click the voice icon inside the search box and say a search query. You may have to allow the browser to access your microphone. When you are done speaking, the search results page will open automatically.

<style>
  .speech {
    border: 1px solid #ddd;
    width: 300px;
    padding: 0;
    margin: 0;
  }

  .speech input {
    border: 0;
    width: 240px;
    display: inline-block;
    height: 30px;
    font-size: 14px;
  }

  .speech img {
    float: right;
    width: 40px;
  }
</style>

<form id="labnol" method="get" action="http://www.labnol.org">
  <div class="speech">
    <input type="text" name="s" id="transcript" placeholder="Say Something" />
    <img onclick="startDictation()" src="https://i.imgur.com/cHidSVu.gif" />
  </div>
</form>

<script>
  function startDictation() {
    if (window.hasOwnProperty('webkitSpeechRecognition')) {
      var recognition = new webkitSpeechRecognition();

      recognition.continuous = false;
      recognition.interimResults = false;
      recognition.lang = 'en-US';
      recognition.start();

      recognition.onresult = function (e) {
        document.getElementById('transcript').value = e.results[0][0].transcript;
        recognition.stop();
        document.getElementById('labnol').submit();
      };
      recognition.onerror = function (e) {
        recognition.stop();
      };
    }
  }
</script>

Add Voice Recognition to your Website

The HTML5 Web Speech API has been around for a few years, but it now takes slightly more work to include it in your website.

Earlier, you could add the x-webkit-speech attribute to any form input field and it would become voice capable. That attribute has since been deprecated, so you now need to use the JavaScript API to add speech recognition. Here’s the updated code:

<!-- CSS Styles -->
<style>
  .speech {
    border: 1px solid #ddd;
    width: 300px;
    padding: 0;
    margin: 0;
  }
  .speech input {
    border: 0;
    width: 240px;
    display: inline-block;
    height: 30px;
  }
  .speech img {
    float: right;
    width: 40px;
  }
</style>

<!-- Search Form -->
<form id="labnol" method="get" action="https://www.google.com/search">
  <div class="speech">
    <input type="text" name="q" id="transcript" placeholder="Speak" />
    <img onclick="startDictation()" src="https://i.imgur.com/cHidSVu.gif" alt="Start voice search" />
  </div>
</form>

<!-- HTML5 Speech Recognition API -->
<script>
  function startDictation() {
    // Proceed only if the browser exposes the prefixed constructor
    if (window.hasOwnProperty('webkitSpeechRecognition')) {
      var recognition = new webkitSpeechRecognition();

      // Stop after one phrase and report only the final transcript
      recognition.continuous = false;
      recognition.interimResults = false;
      recognition.lang = 'en-US';

      // Attach the handlers before listening starts
      recognition.onresult = function (e) {
        // The first alternative of the first result is the best guess
        document.getElementById('transcript').value = e.results[0][0].transcript;
        recognition.stop();
        document.getElementById('labnol').submit();
      };

      recognition.onerror = function (e) {
        recognition.stop();
      };

      recognition.start();
    }
  }
</script>
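
The onerror handler above only stops recognition, so the visitor never learns what went wrong. As a rough sketch, the error event carries an error code that you could translate into a hint. The function name describeSpeechError and the message wording are assumptions made for illustration and are not part of the original code; the error codes themselves come from the Web Speech API.

```javascript
// Maps the `error` code of a speech recognition error event to a
// readable message. Function name and wording are illustrative only.
function describeSpeechError(code) {
  var messages = {
    'no-speech': 'No speech was detected. Please try again.',
    'audio-capture': 'No microphone was found.',
    'not-allowed': 'Microphone access was blocked.',
    'network': 'The speech service could not be reached.'
  };
  // Fall back to a generic message for codes not listed above
  return Object.prototype.hasOwnProperty.call(messages, code)
    ? messages[code]
    : 'Speech recognition failed (' + code + ').';
}

// Possible use inside startDictation():
// recognition.onerror = function (e) {
//   recognition.stop();
//   document.getElementById('transcript').placeholder = describeSpeechError(e.error);
// };
```

Showing the message in the placeholder keeps the markup unchanged; a separate status element would work just as well.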

We have the CSS to place the microphone image inside the input box, the form code containing the input field, and the JavaScript that does all the heavy lifting.

When the user clicks the mic image inside the search box, the JavaScript checks whether the user’s browser supports speech recognition. If so, it starts listening, waits for the transcribed text to arrive from Google’s servers and then submits the form.
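
The support check can be pulled into a small helper so that the rest of the page knows when to fall back to plain typing. This is only a sketch: the names getSpeechRecognition and createRecognizer are hypothetical, and the check for an unprefixed SpeechRecognition constructor is an assumption added for browsers that may expose one.

```javascript
// Returns the speech recognition constructor exposed by the given
// window-like object, or null when none is available.
function getSpeechRecognition(win) {
  return win.SpeechRecognition || win.webkitSpeechRecognition || null;
}

// Creates a recognizer configured like the one in the article, or
// returns null so the caller can leave the form as a normal text box.
function createRecognizer(win, lang) {
  var Recognition = getSpeechRecognition(win);
  if (!Recognition) {
    return null;
  }
  var recognition = new Recognition();
  recognition.continuous = false;
  recognition.interimResults = false;
  recognition.lang = lang || 'en-US';
  return recognition;
}

// In a page:
// var recognition = createRecognizer(window, 'en-US');
// if (!recognition) { /* hide the mic icon, for example */ }
```

Passing the window object in as a parameter is a design choice that makes the helper easy to try outside a browser.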

The Dictation App also uses the speech recognition API, though it writes the transcribed text to a textarea field instead of an input box.
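
For longer dictation you would probably switch on continuous and interimResults and then rebuild the text from the whole results list each time it changes. The sketch below shows one way to do that. It is not the Dictation App’s actual code, and the function name splitTranscript and the textarea id notes are hypothetical.

```javascript
// Splits a results list (as found on the onresult event) into the text
// that is final and the text that may still change.
function splitTranscript(results) {
  var finalText = '';
  var interimText = '';
  for (var i = 0; i < results.length; i++) {
    // Index 0 is the top-ranked alternative for each result
    var text = results[i][0].transcript;
    if (results[i].isFinal) {
      finalText += text;
    } else {
      interimText += text;
    }
  }
  return { finalText: finalText, interimText: interimText };
}

// Possible use, assuming <textarea id="notes"> and a recognizer with
// continuous = true and interimResults = true:
// recognition.onresult = function (e) {
//   var parts = splitTranscript(e.results);
//   document.getElementById('notes').value = parts.finalText + parts.interimText;
// };
```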

Some notes:

If the HTML form / search box is embedded inside an HTTPS website, the browser will not repeatedly ask for permission to use the microphone.
You can change the value of the recognition.lang property from ‘en-US’ to another language (like hi-IN for Hindi or fr-FR for Français). See the complete list of supported languages.
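
Instead of hard-coding the language, you could try to match it to the visitor’s browser language. Here is a minimal sketch, assuming you keep your own short list of language tags to offer; the function name pickRecognitionLang and the three-item list in the usage comment are made up for illustration and are not the complete supported list.

```javascript
// Picks a recognition language: an exact match first, then the first
// entry with the same base language (e.g. "fr-CA" -> "fr-FR"),
// otherwise the fallback.
function pickRecognitionLang(preferred, supported, fallback) {
  var wanted = String(preferred || '').toLowerCase();
  var base = wanted.split('-')[0];
  var sameBase = null;
  for (var i = 0; i < supported.length; i++) {
    var tag = supported[i].toLowerCase();
    if (tag === wanted) {
      return supported[i];
    }
    if (sameBase === null && base && tag.split('-')[0] === base) {
      sameBase = supported[i];
    }
  }
  return sameBase || fallback;
}

// Possible use in a page:
// recognition.lang = pickRecognitionLang(
//   navigator.language, ['en-US', 'hi-IN', 'fr-FR'], 'en-US');
```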