Before we start, it’s a good idea to check which browsers support the technologies we’re going to be working with. Here’s MDN’s compatibility table for the SpeechRecognition API: https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition#Browser_compatibility. As you can see, Chrome leads the way, though Firefox has some capability as well. The same holds true for the SpeechSynthesis API: https://developer.mozilla.org/en-US/docs/Web/API/SpeechSynthesis#Browser_compatibility. Do note that Microsoft’s Edge browser also enjoys some speech synthesis capabilities. David Walsh wrote a nice article on setting up the SpeechSynthesis API for cross-browser functionality, but for simplicity’s sake I’m going to assume the use of Chrome for the remainder of this tutorial.
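With Chrome assumed, a first speech-to-text page might look roughly like the sketch below. It is not the original example, just a minimal illustration: the helper names getRecognitionConstructor and transcriptFrom are made up for this sketch, and the en-US language setting is an assumption. Chrome exposes the recognition constructor under a webkit prefix, so the sketch checks for both names before using either.

```javascript
// Hypothetical helper: find the recognition constructor on a global object.
// Chrome exposes it as webkitSpeechRecognition. Returns null when missing.
function getRecognitionConstructor(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// Hypothetical helper: pull the recognized text out of a "result" event.
// event.results is a list of results; each result is a list of alternatives,
// and the first alternative is the one the browser considers most likely.
function transcriptFrom(event) {
  const latest = event.results[event.results.length - 1];
  return latest[0].transcript;
}

// Only touch the browser APIs when actually running in a browser.
if (typeof window !== 'undefined') {
  const Recognition = getRecognitionConstructor(window);
  if (Recognition) {
    const recognition = new Recognition();
    recognition.lang = 'en-US'; // assumption: English input
    recognition.onresult = (event) => {
      // Write what the computer heard to the body of the page.
      document.body.textContent = transcriptFrom(event);
    };
    recognition.start(); // the browser will ask for microphone access
  } else {
    document.body.textContent = 'Speech recognition is not supported in this browser.';
  }
}
```

Keeping the two helpers separate from the browser wiring means they can be exercised with plain objects, without a microphone.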
Try it out. Give the browser access to your computer’s mic and then try saying a few different words or phrases. If all goes according to plan, you should see what you say written to the body of the HTML page. If you’re having trouble getting it to work, try:
1) Making sure your computer has a mic and that your browser has access to it.
2) Making sure that only one application, tab, or window is using the microphone.
3) Making sure to use Chrome as your browser and loading up the JSFiddle example in a new tab/window.
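If you would rather have the page tell you what went wrong, the recognition object also reports errors. The sketch below maps a few of the error codes defined by the Web Speech API to the troubleshooting steps above. The helper names (hintForError, attachErrorHints) and the wording of the hints are my own assumptions, not part of the API.

```javascript
// A few error codes from the Web Speech API, paired with hypothetical hints.
const ERROR_HINTS = new Map([
  ['not-allowed', 'The browser was not given access to the microphone.'],
  ['audio-capture', 'No working microphone was found, or another app may be using it.'],
  ['no-speech', 'No speech was detected. Try speaking again.'],
]);

// Hypothetical helper: turn an error code into a human-readable hint.
function hintForError(code) {
  return ERROR_HINTS.get(code) || `Speech recognition error: ${code}`;
}

// Hypothetical helper: report hints whenever the recognition object errors.
// `recognition` is a SpeechRecognition instance; `log` is any function that
// accepts a string (console.warn by default).
function attachErrorHints(recognition, log = console.warn) {
  recognition.onerror = (event) => log(hintForError(event.error));
  return recognition;
}
```

In a page, you would call attachErrorHints(recognition) before calling recognition.start().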
Okay, so the above example simply writes what the computer heard to the body of the HTML page. That’s pretty interesting, but now let’s do something a bit more fancy: let’s tell the computer to do something for us! In the following example, try opening up the Fiddle and telling the browser to change the background color of the page. The way I programmed it, you’ll have to say this exact phrase:
“Change background color to…” and then say any of the many colors recognized by CSS (e.g., “red,” “blue,” “green,” “yellow,” etc.).
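One way such a command could be handled is sketched below. This is an illustration under my own assumptions rather than the code from the Fiddle: the helper names parseColorCommand and applyColorCommand are hypothetical, and the sketch squeezes out spaces because CSS color keywords are single words (“light blue” becomes lightblue).

```javascript
// Hypothetical helper: extract a CSS color keyword from the spoken phrase.
// Returns null when the transcript does not contain the expected command.
function parseColorCommand(transcript) {
  const match = /change background colou?r to (.+)/i.exec(transcript.trim());
  if (!match) return null;
  // Lowercase and drop everything that is not a letter (spaces, punctuation).
  const color = match[1].toLowerCase().replace(/[^a-z]/g, '');
  return color || null;
}

// Hypothetical helper: apply the command to a style object, such as
// document.body.style. Returns the color that was applied, or null.
function applyColorCommand(transcript, style) {
  const color = parseColorCommand(transcript);
  if (color) {
    // Browsers ignore values that are not valid CSS colors.
    style.backgroundColor = color;
  }
  return color;
}
```

Inside the onresult handler, you would pass the recognized text and document.body.style to applyColorCommand.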
Pretty cool, right? And really not all that difficult to pull off!
Now let’s look at giving the program a voice. I’m going to use Chrome’s default voice (yes, it sounds pretty robotic); but once you get the hang of it, feel free to read up on how to get and use different voices. Here we go; let’s see what it’s got to say:
Hear that? This time, the program audibly confirms that it’s changing the color of the background! Fantastic.
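The spoken confirmation could be produced along the lines of the sketch below. Again, this is an assumption-laden illustration: the helper names confirmationMessage and say are hypothetical, and the exact wording of the confirmation is mine. The only API pieces it relies on are the speechSynthesis object, its speak method, and the SpeechSynthesisUtterance constructor.

```javascript
// Hypothetical helper: build the sentence the program will say out loud.
function confirmationMessage(color) {
  return `Changing the background color to ${color}.`;
}

// Hypothetical helper: speak some text using the browser's default voice.
// `globalObj` is normally `window`; it is a parameter so the function can
// also be tried out with a stand-in object.
// Returns false when speech synthesis is unavailable.
function say(text, globalObj) {
  if (!globalObj.speechSynthesis || typeof globalObj.SpeechSynthesisUtterance !== 'function') {
    return false;
  }
  const utterance = new globalObj.SpeechSynthesisUtterance(text);
  globalObj.speechSynthesis.speak(utterance);
  return true;
}
```

In a page, you would call say(confirmationMessage(color), window) right after changing the background color.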
To recap all of this…
– The browser’s SpeechRecognition object has start and stop methods you can use to start and stop listening for audio input.
– The SpeechRecognition object can react to events such as onresult.
– To get a string/text of what the computer heard, you can assign a function to the onresult event and then read the recognized text from the event object that function receives.
– The browser’s speechSynthesis object has a speak method that you can use to utter new SpeechSynthesisUtterance objects.
– You can pass a string (or number) value to the SpeechSynthesisUtterance constructor to create words or phrases.
– Pass that whole thing to the speak method and you’ve got a talking computer!