Speech-to-speech translator with Azure and Python
Introduction
Today I wanted to do something with Azure Cognitive Services, and what better place to start than a translator? This one lets two people talk to each other, each in their own language, as long as they both have Python installed on their computers. Let me walk you through how it works!
Setup
Let's start with importing the required modules:
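As a rough sketch, the import list for a program like this could look as follows. It assumes the Speech SDK and Event Hubs client come from the azure-cognitiveservices-speech and azure-eventhub packages on PyPI, and that the Translator is called over plain REST with the standard library; the AZURE_SDK_AVAILABLE flag is only an illustration and not part of the original program.

```python
import argparse        # command-line options (-l/--language, -hs/--headset)
import json            # request and response bodies for the Translator REST call
import urllib.request  # sending the REST request without extra dependencies

try:
    # pip install azure-cognitiveservices-speech azure-eventhub
    import azure.cognitiveservices.speech as speechsdk
    from azure.eventhub import (
        EventData,
        EventHubConsumerClient,
        EventHubProducerClient,
    )
    AZURE_SDK_AVAILABLE = True
except ImportError:
    # Hypothetical flag: lets the rest of the script report a friendly error
    # instead of crashing on import when the SDKs are not installed.
    AZURE_SDK_AVAILABLE = False
```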
Enter your keys and locations
Here you have to initialize the variables with your own keys; I'm not showing mine here.
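One way to keep keys out of the source file is to read them from environment variables. The sketch below does that; every variable name in it (SPEECH_KEY, TRANSLATOR_KEY and so on) is a hypothetical choice for illustration, not something the Azure SDKs require.

```python
import os

# Hypothetical setting names; pick whatever matches your own setup.
SETTING_NAMES = (
    "SPEECH_KEY",
    "SPEECH_REGION",
    "TRANSLATOR_KEY",
    "TRANSLATOR_REGION",
    "EVENTHUB_CONNECTION_STRING",
    "EVENTHUB_NAME",
)


def load_settings(env=os.environ):
    """Return all settings as a dict, or fail early listing what is missing."""
    missing = [name for name in SETTING_NAMES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {name: env[name] for name in SETTING_NAMES}
```

Failing early with the list of missing names is friendlier than an authentication error from Azure halfway through a conversation.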
Take the input from the user
This program accepts two command-line arguments: -l/--language, which sets this user's language, and -hs/--headset, a boolean flag that says whether the user is wearing headphones. The language has to be given as a locale code from here, for example pl-PL if you want to translate to and from Polish.
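A minimal sketch of parsing these two options with argparse. The en-US default and the choice to make --headset a store_true flag are my assumptions for the example; the original script may handle them differently.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Speech-to-speech translator")
    parser.add_argument(
        "-l", "--language",
        default="en-US",  # assumed default, not taken from the original script
        help="locale code of the language you speak, e.g. pl-PL",
    )
    parser.add_argument(
        "-hs", "--headset",
        action="store_true",  # present on the command line means True
        help="set this if you are wearing headphones",
    )
    return parser.parse_args(argv)


# Passing the list explicitly is the same as running:
#   python translator.py -l pl-PL -hs
args = parse_args(["-l", "pl-PL", "-hs"])
print(args.language, args.headset)
```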
Configure the Services in the code
Using the arguments from the command line, we can configure a Speech Recognizer, a Speech Synthesizer, and a producer and a consumer for Azure Event Hubs.
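Here is a sketch of what that configuration might look like with the azure-cognitiveservices-speech and azure-eventhub packages. The function name build_services and its parameters are hypothetical, and using the "$Default" consumer group is an assumption; the recognizer and synthesizer fall back to the default microphone and speaker because no audio config is passed.

```python
def build_services(language, speech_key, speech_region,
                   eventhub_conn_str, eventhub_name):
    """Create the four Azure clients the translator needs."""
    # Imported here so this file can be loaded without the SDKs installed.
    import azure.cognitiveservices.speech as speechsdk
    from azure.eventhub import EventHubConsumerClient, EventHubProducerClient

    speech_config = speechsdk.SpeechConfig(
        subscription=speech_key, region=speech_region
    )
    # Listen and speak in the language given on the command line.
    speech_config.speech_recognition_language = language
    speech_config.speech_synthesis_language = language

    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

    producer = EventHubProducerClient.from_connection_string(
        eventhub_conn_str, eventhub_name=eventhub_name
    )
    consumer = EventHubConsumerClient.from_connection_string(
        eventhub_conn_str,
        consumer_group="$Default",  # assumption: the hub's default group
        eventhub_name=eventhub_name,
    )
    return recognizer, synthesizer, producer, consumer
```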
Sending text to the event hub
When a sentence is recognized, it has to be sent to the event hub.
Before we send it, there is some preparation to do: first we create an event with the spoken text as its data, then we set the sender id and the language as properties on that event. Finally, we send the event as a batch.
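The steps above can be sketched roughly like this. The property names "sender_id" and "language" and the event_cls parameter are my own choices for illustration; event_cls exists only so the function can be tried out without a real Event Hub.

```python
def send_sentence(producer, text, sender_id, language, event_cls=None):
    """Wrap one recognized sentence in an event and send it as a batch."""
    if event_cls is None:
        # Default to the real SDK class when none is injected.
        from azure.eventhub import EventData
        event_cls = EventData

    event = event_cls(text)  # the spoken text is the event body
    event.properties = {     # metadata the receiver needs
        "sender_id": sender_id,
        "language": language,
    }

    batch = producer.create_batch()
    batch.add(event)
    producer.send_batch(batch)
    return event


# With the Speech SDK, this could be hooked up to continuous recognition:
#   recognizer.recognized.connect(
#       lambda evt: send_sentence(producer, evt.result.text, my_id, my_lang))
#   recognizer.start_continuous_recognition()
```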
Receive and translate text
When you receive a message from the other user, it is still in their language, not yours. Before we present the content of the message to the user, we have to translate it:
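A sketch of a receive handler, under a few assumptions: the sender stored "sender_id" and "language" as event properties (hypothetical names), each instance ignores its own messages, and translate is whatever function does the actual translation. Property keys and values can come back from Event Hubs as bytes, so the helper decodes them.

```python
def _as_text(value):
    """Event Hubs may hand back property keys and values as bytes."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def read_message(event):
    """Pull the text, sender id and language out of a received event."""
    props = {
        _as_text(key): _as_text(val)
        for key, val in (event.properties or {}).items()
    }
    return {
        "text": event.body_as_str(),
        "sender_id": props.get("sender_id"),
        "language": props.get("language"),
    }


def make_on_event(my_id, my_language, translate):
    """Build the callback that the Event Hubs consumer will call."""
    def on_event(partition_context, event):
        message = read_message(event)
        if message["sender_id"] == my_id:
            return  # skip our own sentences
        translate(message["text"], message["language"], my_language)
    return on_event


# With a real consumer (blocks until closed; "@latest" skips old events):
#   consumer.receive(on_event=make_on_event(my_id, my_lang, translate),
#                    starting_position="@latest")
```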
The last line of this function calls another function, 'translate'. It consists of two parts: the first sends a REST request to the Azure Translator, and the second speaks the translated text aloud using the Azure Speech Synthesizer.
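For the first part, here is a sketch of the REST call against the Translator v3 endpoint, using only the standard library. The function names are hypothetical, and cutting the locale down to its first segment (pl-PL to pl) is a simplifying assumption: a few languages need a fuller code than that.

```python
import json
import urllib.parse
import urllib.request

TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def to_translator_code(locale):
    # Assumption: the Translator wants "pl" where the Speech service uses
    # "pl-PL". Some languages (Chinese variants, for example) need more.
    return locale.split("-")[0]


def build_translate_request(text, source_locale, target_locale, key, region):
    query = urllib.parse.urlencode({
        "api-version": "3.0",
        "from": to_translator_code(source_locale),
        "to": to_translator_code(target_locale),
    })
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    body = json.dumps([{"text": text}]).encode("utf-8")
    return urllib.request.Request(
        f"{TRANSLATOR_ENDPOINT}?{query}", data=body, headers=headers, method="POST"
    )


def parse_translation(payload):
    # The response is a list with one entry per input text.
    return payload[0]["translations"][0]["text"]


def translate_text(text, source_locale, target_locale, key, region):
    request = build_translate_request(text, source_locale, target_locale, key, region)
    with urllib.request.urlopen(request) as response:
        return parse_translation(json.load(response))
```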
Part 2:
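Speaking the translated text could look like the sketch below. It assumes synthesizer is a SpeechSynthesizer from the Speech SDK already configured for the listener's language; the function name speak is hypothetical.

```python
def speak(synthesizer, text):
    """Say the translated text out loud and wait until playback finishes."""
    if not text.strip():
        return None  # nothing to say
    # speak_text_async returns a future; get() blocks until synthesis is done.
    # The caller can inspect result.reason to check whether it succeeded.
    return synthesizer.speak_text_async(text).get()
```

Blocking on get() keeps sentences from being spoken on top of each other when several messages arrive in a row.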
Conclusion
This is how my translator works. You can run both instances on one device, but the synthesized responses will overlap and it will be hard to distinguish what is being said. For the full experience, run it on two separate devices! You can find the code here!