Translation and Text-to-Speech with Microsoft Translator
Text to speech is a popular technique used by many websites to provide their content in an interactive way. The generation of artificial human voice is known as Speech Synthesis. Even though it's highly popular, there are very few speech synthesis services, especially when looking for those free of charge. Microsoft Translator is one of the services we can use to get a speech service with limited features. In this tutorial, we are going to look at how we can use Microsoft Translator API to translate content and then make audio files using said content.
You can download the entire source code of this project on GitHub.
Creating Windows Azure Marketplace Application
First, we have to create an application in Windows Azure Data Marketplace to subscribe to Microsoft Translator API. Let's get started on the application creation process.
Prerequisites – Microsoft Email Account
Step 1 – Sign into your Azure Account
Use the email account and sign into Azure Marketplace.
Step 2 – Registering the application
Now we have to create an application to use the translation and speech service. This is similar to the applications we create for popular social networking sites such as Facebook, LinkedIn or Twitter. Click the Register Application link to create a new application. You will get a screen similar to the following.
Fill in all the details of the application form. You can define your own Client ID to be used for the application. The Client Secret will be automatically generated for you. Keep it unchanged and fill in the remaining details as necessary.
Click on the Create button and you will get a list of created applications as shown in the following screen.
Note down the Client ID and Client Secret to be used for the application.
Step 3 – Subscribe to Translation service
Next, we have to subscribe to Microsoft Translator to use the API. Navigate to Microsoft Translator and subscribe to one of the packages. The first 2,000,000 characters are free, so the free package is the obvious choice for testing purposes.
The following screenshot previews the subscription screen. Subscribe to the free package.
Now we have completed the prerequisites for using the Microsoft Translator API. Let's get started on the development of the text to speech service.
Initializing Translation API Settings
Let's get started by initializing the necessary settings for using the Translator API. Create a file called translation_api_initializer.php with the following code.
<?php
class TranslationApiInitializer {
private $clientID;
private $clientSecret;
private $authUrl;
private $grantType;
private $scopeUrl;
public function __construct() {
$this->clientID = "Client ID";
$this->clientSecret = "Client Secret";
//OAuth Url.
$this->authUrl = "https://datamarket.accesscontrol.windows.net/v2/OAuth2-13/";
//Application Scope Url
$this->scopeUrl = "http://api.microsofttranslator.com";
//Application grant type
$this->grantType = "client_credentials";
}
}
?>
Here, we have configured the necessary settings for using the translation service. You have to use the Client ID and Client Secret generated in the application registration process. The other parameters contain the authentication URL, the scope URL and the grant type. We can keep these as they are for every application, unless they are changed officially by Microsoft.
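As an optional variation, the credentials could be read from the environment instead of being written into the source file. The following is only a minimal sketch: the helper translatorSetting and the variable names TRANSLATOR_CLIENT_ID and TRANSLATOR_CLIENT_SECRET are assumptions made for illustration and are not part of the tutorial's class.

```php
<?php
// Hypothetical helper: read a setting from the environment, falling back to a default.
function translatorSetting($name, $default = '') {
    $value = getenv($name);
    // getenv() returns false when the variable is not set.
    return ($value === false || $value === '') ? $default : $value;
}

// The environment variable names below are assumptions, not an official convention.
$clientID = translatorSetting('TRANSLATOR_CLIENT_ID', 'Client ID');
$clientSecret = translatorSetting('TRANSLATOR_CLIENT_SECRET', 'Client Secret');
```

The constructor could then assign these values to $this->clientID and $this->clientSecret, which keeps the secret out of version control.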
Generating Application Tokens
The next task is to generate tokens to access the Translator service. Tokens have a limited lifetime, so we have to generate new ones regularly. Let's take a look at the implementation of the token generation function.
/*
* Get the access token.
*
* @param string $grantType Grant type.
* @param string $scopeUrl Application Scope URL.
* @param string $clientID Application client ID.
* @param string $clientSecret Application client secret.
* @param string $authUrl Oauth Url.
*
* @return string.
*/
function getTokens($grantType, $scopeUrl, $clientID, $clientSecret, $authUrl) {
try {
//Initialize the Curl Session.
$ch = curl_init();
//Create the request Array.
$paramArr = array(
'grant_type' => $grantType,
'scope' => $scopeUrl,
'client_id' => $clientID,
'client_secret' => $clientSecret
);
//Create an HTTP query string.
$paramArr = http_build_query($paramArr);
//Set the Curl URL.
curl_setopt($ch, CURLOPT_URL, $authUrl);
//Set HTTP POST Request.
curl_setopt($ch, CURLOPT_POST, TRUE);
//Set data to POST in HTTP "POST" Operation.
curl_setopt($ch, CURLOPT_POSTFIELDS, $paramArr);
//CURLOPT_RETURNTRANSFER- TRUE to return the transfer as a string of the return value of curl_exec().
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
//CURLOPT_SSL_VERIFYPEER- Set FALSE to stop cURL from verifying the peer's certificate.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
//Execute the cURL session.
$strResponse = curl_exec($ch);
//Get the Error Code returned by Curl.
$curlErrno = curl_errno($ch);
if ($curlErrno) {
$curlError = curl_error($ch);
throw new Exception($curlError);
}
//Close the Curl Session.
curl_close($ch);
//Decode the returned JSON string.
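$objResponse = json_decode($strResponse);
//json_decode returns null when the response is not valid JSON.
if (!is_object($objResponse)) {
throw new Exception("Unexpected response from the token service.");
}
//The error property only exists on failed requests, so check it with isset.
if (isset($objResponse->error)) {
throw new Exception($objResponse->error_description);
}
return $objResponse->access_token;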
} catch (Exception $e) {
echo "Exception-" . $e->getMessage();
}
}
Here, we are using a function called getTokens, which accepts all the settings as parameters. Inside the function, we make a curl request to the defined authentication URL by passing the remaining parameters. On successful execution, we can access the token using $objResponse->access_token.
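Because tokens expire, requesting a new one for every call is wasteful. The sketch below shows one way a token could be cached until shortly before it expires. It is not part of the tutorial's code: the TokenCache class is hypothetical, and it assumes the token response JSON carries an expires_in field (in seconds) next to access_token, as is usual for OAuth 2 responses. It also needs the raw JSON response rather than only the token string that getTokens returns.

```php
<?php
// Hypothetical helper: caches an access token until shortly before it expires.
// Assumes the response JSON contains "access_token" and "expires_in" (seconds).
class TokenCache {
    private $token = null;
    private $expiresAt = 0;

    // Store the token found in the raw JSON response; $now is a Unix timestamp.
    public function store($jsonResponse, $now) {
        $obj = json_decode($jsonResponse);
        if (!is_object($obj) || !isset($obj->access_token)) {
            throw new Exception('Token response did not contain access_token');
        }
        $lifetime = isset($obj->expires_in) ? (int) $obj->expires_in : 0;
        $this->token = $obj->access_token;
        // Treat the token as expired 30 seconds early to avoid mid-request expiry.
        $this->expiresAt = $now + max(0, $lifetime - 30);
    }

    // Return the cached token, or null when a new one must be requested.
    public function get($now) {
        return ($this->token !== null && $now < $this->expiresAt) ? $this->token : null;
    }
}
```

A caller would try get(time()) first and only make the curl request to the authentication URL when it returns null.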
Implementing Reusable Curl Request
Once an access token is retrieved, we can access the translation functions by authorizing the request with the access token. Generally, we use curl for making requests to APIs, so let's implement a reusable function for our curl request as shown in the following code.
/*
* Create and execute the HTTP CURL request.
*
* @param string $url HTTP Url.
* @param string $authHeader Authorization Header string.
* @param string $postData Data to post.
*
* @return string.
*
*/
function curlRequest($url, $authHeader, $postData=''){
//Initialize the Curl Session.
$ch = curl_init();
//Set the Curl url.
curl_setopt($ch, CURLOPT_URL, $url);
//Set the HTTP HEADER Fields.
curl_setopt($ch, CURLOPT_HTTPHEADER, array($authHeader, "Content-Type: text/xml"));
//CURLOPT_RETURNTRANSFER- TRUE to return the transfer as a string of the return value of curl_exec().
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
//CURLOPT_SSL_VERIFYPEER- Set FALSE to stop cURL from verifying the peer's certificate.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, False);
if ($postData) {
//Set HTTP POST Request.
curl_setopt($ch, CURLOPT_POST, TRUE);
//Set data to POST in HTTP "POST" Operation.
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
}
//Execute the cURL session.
$curlResponse = curl_exec($ch);
//Get the Error Code returned by Curl.
$curlErrno = curl_errno($ch);
if ($curlErrno) {
$curlError = curl_error($ch);
throw new Exception($curlError);
}
//Close a cURL session.
curl_close($ch);
return $curlResponse;
}
This function takes a request URL, an authorization header and optional post data as parameters. It returns the curl response, or throws an exception if curl reports an error. Now all the necessary functions for accessing the translator service are ready.
Translating Content
Microsoft Translator API provides a wide range of methods for translation related functionality. In this tutorial, we will be using Translate and Speak methods. You can look at the complete set of API methods at http://msdn.microsoft.com/en-us/library/ff512419.aspx.
Let's get started with the implementation of the translate function.
/*
* Get the translated text
*
* @param string $text Text content for translation
* @param string $from_lang Language of the text
* @param string $to_lang Translation language
* @return array Result set
*/
public function textTranslate($text, $from_lang, $to_lang) {
try {
//Get the Access token.
$accessToken = $this->getTokens($this->grantType, $this->scopeUrl, $this->clientID, $this->clientSecret, $this->authUrl);
//Create the authorization Header string.
$authHeader = "Authorization: Bearer " . $accessToken;
//Input String.
$inputStr = urlencode($text);
$from = $from_lang;
$to = $to_lang;
//HTTP Translate method URL. $inputStr is already URL-encoded above, so it must not be encoded again.
$detectMethodUrl = "http://api.microsofttranslator.com/V2/Http.svc/Translate?text=" . $inputStr .
"&from=" . $from . "&to=" . $to."&contentType=text/plain";
//Call the curlRequest.
$strResponse = $this->curlRequest($detectMethodUrl, $authHeader);
//Interprets a string of XML into an object.
$xmlObj = simplexml_load_string($strResponse);
foreach ((array) $xmlObj[0] as $val) {
$translated_str = $val;
}
return array("status" => "success", "msg" => $translated_str);
} catch (Exception $e) {
return array("status" => "error", "msg" => $e->getMessage());
}
}
Translation requires the source language, the destination language and the text to be translated, so we use them as the parameters of the textTranslate function. Then we use the previously created getTokens function to retrieve the token and assign it to the request header using Authorization: Bearer.
Then we set the source and destination languages using the $from_lang and $to_lang variables. We also have to encode the text content using PHP's urlencode function.
Now it's time to start the translation process. The Translator API provides a method called Translate, which we can access using the URL at http://api.microsofttranslator.com/V2/Http.svc/Translate. This method takes appId, text, to language and content type as mandatory parameters. Since we are using the authorization header, we don't have to specify the appId. So we assign all the necessary parameters into the Translate API URL using the $detectMethodUrl variable.
Finally, we initialize the curl request by passing the translation API URL and the authorization header. On successful execution, we will get the translated data in XML format. So we use the simplexml_load_string function to load the XML string and filter the translated text.
Now we can translate text between any of the supported languages.
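The URL building and XML parsing steps described above can also be written as two small standalone helpers, which makes them easy to try without calling the API. This is only a sketch: buildTranslateUrl and parseTranslateResponse are hypothetical names, and the sample response is an assumption about the format (a single string element) rather than output captured from the live service.

```php
<?php
// Build the Translate URL; http_build_query() URL-encodes each value exactly once.
function buildTranslateUrl($text, $from, $to) {
    $params = array(
        'text' => $text,
        'from' => $from,
        'to' => $to,
        'contentType' => 'text/plain',
    );
    return 'http://api.microsofttranslator.com/V2/Http.svc/Translate?'
        . http_build_query($params, '', '&');
}

// Extract the translated text, assuming the response is a single <string> element.
function parseTranslateResponse($xml) {
    $xmlObj = simplexml_load_string($xml);
    if ($xmlObj === false) {
        throw new Exception('Could not parse the Translate response as XML');
    }
    // Casting the root element to string yields its text content.
    return (string) $xmlObj;
}

// Sample response body (assumed format, for illustration only).
$sample = '<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">Bonjour le monde</string>';
echo parseTranslateResponse($sample), PHP_EOL;
```

Letting http_build_query handle the encoding avoids encoding the text by hand in more than one place.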
Generating Speech Files
The final task of this tutorial is to generate speech in an mp3 file by using the translated text. We will be using a similar technique to do so. Create a function called textToSpeech with the following code.
/**
* Streams an mp3 file speaking the passed-in text in the desired language.
* @param string $text Text to be spoken
* @param string $to_lang Language of the text
* @return void
*/
public function textToSpeech($text, $to_lang) {
try {
//Get the Access token.
$accessToken = $this->getTokens($this->grantType, $this->scopeUrl, $this->clientID, $this->clientSecret, $this->authUrl);
//Create the authorization Header string.
$authHeader = "Authorization: Bearer " . $accessToken;
//Set the params.
$inputStr = urlencode($text);
$language = $to_lang;
$params = "text=$inputStr&language=$language&format=audio/mp3";
//HTTP Speak method URL.
$url = "http://api.microsofttranslator.com/V2/Http.svc/Speak?$params";
//Set the Header Content Type.
header('Content-Type: audio/mp3');
header('Content-Disposition: attachment; filename=' . uniqid('SPC_') . '.mp3');
//Call the curlRequest.
$strResponse = $this->curlRequest($url, $authHeader);
echo $strResponse;
} catch (Exception $e) {
echo "Exception: " . $e->getMessage() . PHP_EOL;
}
}
This function is similar in nature to the translate function. Here we use the Speak method URL at http://api.microsofttranslator.com/V2/Http.svc/Speak instead of the translate URL. We have to use text, language and format as the necessary parameters. Since we are using the translated text for generating speech, the language parameter should be equal to the $to_lang used in the textTranslate function.
Then we have to use the necessary headers to automatically download the speech file. Here, we have used audio/mp3 as the content type and the uniqid function is used to generate a unique file name. Now we have both the translate and speech functions ready for use.
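If the audio should be kept on the server instead of being sent straight to the browser, the same pieces can be rearranged. The following sketch is an illustration only: buildSpeakUrl and saveSpeech are hypothetical helpers, and the target directory is an assumption. The bytes passed to saveSpeech would be the string returned by curlRequest for the Speak URL.

```php
<?php
// Build the Speak URL; each parameter is URL-encoded once by http_build_query().
function buildSpeakUrl($text, $language) {
    $params = array(
        'text' => $text,
        'language' => $language,
        'format' => 'audio/mp3',
    );
    return 'http://api.microsofttranslator.com/V2/Http.svc/Speak?'
        . http_build_query($params, '', '&');
}

// Hypothetical helper: write the audio bytes returned by the Speak call to a file.
function saveSpeech($audioBytes, $directory) {
    // uniqid() gives each file a unique name, as in textToSpeech.
    $path = rtrim($directory, '/') . '/' . uniqid('SPC_') . '.mp3';
    if (file_put_contents($path, $audioBytes) === false) {
        throw new Exception('Could not write ' . $path);
    }
    return $path;
}
```

Saving first also means that an exception from the API call can be reported as a normal page, because no audio headers have been sent yet.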
Implementing Frontend Interface
So far, we have implemented the translate and speech functions in the backend of the application. Now we have to build a simplified frontend interface to access the speech file generation service. Let's create a file called index.php with a basic HTML form.
<?php
$bing_language_codes = array('da' => 'Danish',
'de' =>'German','en'=> 'English','fi'=> 'Finnish',
'fr'=>'French','nl'=>'Dutch','ja'=> 'Japanese','pl'=> 'Polish', 'es'=> 'Spanish','ru'=> 'Russian',);
?>
<form action='' method='POST' >
<table>
<tr><td>Select From Language</td>
<td><select name='from_lang' >
<?php foreach($bing_language_codes as $code=>$lang){ ?>
<option value='<?php echo $code; ?>'><?php echo $lang; ?></option>
<?php } ?>
</select>
</td>
</tr>
<tr><td>Select To Language</td>
<td><select name='to_lang' >
<?php foreach($bing_language_codes as $code=>$lang){ ?>
<option value='<?php echo $code; ?>'><?php echo $lang; ?></option>
<?php } ?>
</select>
</td>
</tr>
<tr><td>Text</td>
<td><textarea cols='50' name='text' ></textarea>
</td>
</tr>
<tr><td></td>
<td><input type='submit' value='Submit' />
</td>
</tr>
</table>
</form>
This form allows users to select the preferred source and destination languages and type the content to be converted into speech. I have used a few of the supported languages in an array called $bing_language_codes. Now we can move to the form submission handling process to generate the speech file as shown below.
<?php
include_once "translation_api_initializer.php";
$bing_language_codes = array('da' => 'Danish',
'de' => 'German','en'=> 'English','fi'=> 'Finnish',
'fr'=> 'French', 'nl'=> 'Dutch','ja'=>'Japanese',
'pl'=> 'Polish','es'=> 'Spanish','ru'=> 'Russian');
if($_POST){
$bing = new TranslationApiInitializer();
$from_lang = isset($_POST['from_lang']) ? $_POST['from_lang'] : 'en';
$to_lang = isset($_POST['to_lang']) ? $_POST['to_lang'] : 'fr';
$text = isset($_POST['text']) ? $_POST['text'] : '';
$result = $bing->textTranslate($text, $from_lang, $to_lang);
$bing->textToSpeech($result['msg'],$to_lang);
}
?>
We include the TranslationApiInitializer file created earlier in the process and execute the translation and speech functions, in that order, to generate the audio file. Once the code is completed, you will be able to generate audio files using this service.
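The handler above passes the submitted language codes to the API unchanged. One possible safeguard is to accept only codes that exist in $bing_language_codes. The helper pickLanguage below is hypothetical and shown only as a sketch of that check, with a shortened language list for brevity.

```php
<?php
// Hypothetical helper: return $code if it is a key of $allowed, otherwise $default.
function pickLanguage($code, array $allowed, $default) {
    // is_string() guards against array values submitted as from_lang[]=...
    return (is_string($code) && isset($allowed[$code])) ? $code : $default;
}

$bing_language_codes = array('en' => 'English', 'fr' => 'French', 'de' => 'German');
$from_lang = pickLanguage('en', $bing_language_codes, 'en');
// An unknown code falls back to the default.
$to_lang = pickLanguage('xx', $bing_language_codes, 'fr');
```

In index.php the first argument would be the submitted value, for example isset($_POST['to_lang']) ? $_POST['to_lang'] : null.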
You can take a look at a live demo at http://www.innovativephp.com/demo/spc_text_to_speech
Wrap Up
Throughout this tutorial we implemented translation and speech generation using the Microsoft Translator API. Even though it's easy to use and free to some extent, there are also limitations. The speech service is only provided for a limited number of characters, roughly 2000, so it's not possible to use this service for a larger block of text. I recommend you use Acapela for large scale speech generation services.
I hope you enjoyed the tutorial, and I'm looking forward to your comments and suggestions.