URL: https://dzone.com/articles/using-google-cloud-text-to-speech-with-java

Using Google Cloud Text-to-Speech With Java

Google Cloud recently released a new text-to-speech service. Let's take it for a test run with a Spring Boot app to see how to work with it in Java.

By Ajitesh Kumar · Apr. 02, 18 · Tutorial

Google Cloud Text-to-Speech is a text-to-speech conversion service that Google Cloud launched a few days ago. It fills one of the most important gaps in Google Cloud's AI portfolio and complements the existing speech-to-text service, so Google Cloud now covers both directions. Over the next few weeks, you will learn about different uses of the text-to-speech service together with other Google Cloud services.

In this post, you will learn how to:

  • Set up an Eclipse IDE-based development environment
  • Create a Maven or Spring Boot (Spring Starter) Project

Set Up an Eclipse IDE-Based Development Environment

The following are some of the key aspects of setting up the development environment using Eclipse IDE:

Create a Maven or Spring Boot (Spring Starter) Project

The following two steps create a sample app that demonstrates the Google Cloud Text-to-Speech service:

  • Include Maven pom.xml artifacts for Text-to-Speech APIs
  • Create the demo app related to text-to-speech

Include Maven pom.xml Artifacts for Text-to-Speech APIs

The following artifacts need to be included to work with the Google Cloud Text-to-Speech APIs:

  • com.google.guava
  • org.threeten (threetenbp)
  • com.google.http-client (google-http-client)
  • com.google.cloud (google-cloud-texttospeech)
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
 <groupId>com.google.guava</groupId>
 <artifactId>guava</artifactId>
 <version>24.1-jre</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.threeten/threetenbp -->
<dependency>
 <groupId>org.threeten</groupId>
 <artifactId>threetenbp</artifactId>
 <version>1.3.6</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.http-client/google-http-client -->
<dependency>
 <groupId>com.google.http-client</groupId>
 <artifactId>google-http-client</artifactId>
 <version>1.22.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.cloud/google-cloud-texttospeech -->
<dependency>
 <groupId>com.google.cloud</groupId>
 <artifactId>google-cloud-texttospeech</artifactId>
 <version>0.42.0-beta</version>
</dependency>


Create the Demo App Related to Text-to-Speech Conversion

The demo app performs the following steps to convert text to speech:

  • Create an instance of TextToSpeechClient
  • Set the text input to be synthesized
  • Build the voice request. Set the voice gender (male or female) and the language code appropriately.
  • Select the type of audio file you want as output by setting the audio encoding value. The example below uses MP3. The page Introduction to Audio Encoding describes common encodings such as the ones listed here; not every one of them is necessarily available for synthesis, so check the AudioEncoding enum of the client library version you use (MP3, LINEAR16, and OGG_OPUS are among its values).
    • FLAC
    • Linear 16
    • MULAW
    • AMR_WB
    • OGG_OPUS
  • Process the text to speech conversion
  • Retrieve the audio output/content
  • Write the audio content to a file
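
The first step relies on credentials: when TextToSpeechClient.create() is called without arguments, the client library looks up Application Default Credentials, and on a developer machine a common way to supply them is the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing at a service-account key file. The following is a minimal preflight sketch that uses only the JDK. The class name CredentialsPreflight and its check method are hypothetical helpers introduced for illustration, not part of the Google client library, and other credential sources may still apply when the variable is unset.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper (not part of the Google client library).
public class CredentialsPreflight {

    /**
     * Returns a description of the problem, or Optional.empty() if the value
     * points at a readable regular file. The value would normally come from
     * System.getenv("GOOGLE_APPLICATION_CREDENTIALS").
     */
    static Optional<String> check(String credentialsPath) {
        if (credentialsPath == null || credentialsPath.trim().isEmpty()) {
            // Other Application Default Credentials sources may still work.
            return Optional.of("GOOGLE_APPLICATION_CREDENTIALS is not set");
        }
        Path keyFile = Paths.get(credentialsPath);
        if (!Files.isRegularFile(keyFile) || !Files.isReadable(keyFile)) {
            return Optional.of("Credentials file is not readable: " + keyFile);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> problem = check(System.getenv("GOOGLE_APPLICATION_CREDENTIALS"));
        System.out.println(problem.orElse("Credentials file found; the client can be created."));
    }
}
```

Running this check before creating the client can turn an authentication failure deep inside the request into an early, readable message.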

The following code implements the steps above:

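import java.io.FileOutputStream;
import java.io.OutputStream;

import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// With the 0.42.0-beta artifact these classes are in the v1beta1 package;
// later releases of the library also provide com.google.cloud.texttospeech.v1.
import com.google.cloud.texttospeech.v1beta1.AudioConfig;
import com.google.cloud.texttospeech.v1beta1.AudioEncoding;
import com.google.cloud.texttospeech.v1beta1.SsmlVoiceGender;
import com.google.cloud.texttospeech.v1beta1.SynthesisInput;
import com.google.cloud.texttospeech.v1beta1.SynthesizeSpeechResponse;
import com.google.cloud.texttospeech.v1beta1.TextToSpeechClient;
import com.google.cloud.texttospeech.v1beta1.VoiceSelectionParams;
import com.google.protobuf.ByteString;

@SpringBootApplication
public class GCloudText2SpeechApplication implements CommandLineRunner {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(GCloudText2SpeechApplication.class);
        app.run(args);
    }

    @Override
    public void run(String... arg0) throws Exception {

        String text = "Hello World! How are you doing today? This is Google Cloud Text-to-Speech Demo!";
        String outputAudioFilePath = "/home/support/Documents/output.mp3";

        // The client is closed automatically at the end of the try block
        try (TextToSpeechClient textToSpeechClient = TextToSpeechClient.create()) {
            // Set the text input to be synthesized
            SynthesisInput input = SynthesisInput.newBuilder().setText(text).build();

            // Build the voice request; languageCode = "en-US"
            VoiceSelectionParams voice = VoiceSelectionParams.newBuilder()
                    .setLanguageCode("en-US")
                    .setSsmlGender(SsmlVoiceGender.FEMALE)
                    .build();

            // Select the type of audio file you want returned (MP3 here)
            AudioConfig audioConfig = AudioConfig.newBuilder()
                    .setAudioEncoding(AudioEncoding.MP3)
                    .build();

            // Perform the text-to-speech request
            SynthesizeSpeechResponse response = textToSpeechClient.synthesizeSpeech(input, voice, audioConfig);

            // Get the audio contents from the response
            ByteString audioContents = response.getAudioContent();

            // Write the response to the output file
            try (OutputStream out = new FileOutputStream(outputAudioFilePath)) {
                out.write(audioContents.toByteArray());
                System.out.println("Audio content written to file " + outputAudioFilePath);
            }
        }
    }
}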
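
The output path in the code above is hard-coded and tied to MP3. As a hedged sketch of the final step (writing the audio content to a file), the helper below derives the file extension from the encoding name and creates the target directory if needed. The class AudioOutputFiles, its methods, and the extension mapping are assumptions made for illustration; it uses only the JDK and does not call the Text-to-Speech API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of the Google client library).
public class AudioOutputFiles {

    /** Maps an encoding name to a conventional file extension (assumed mapping). */
    static String extensionFor(String encodingName) {
        switch (encodingName) {
            case "MP3":
                return "mp3";
            case "LINEAR16":
                // Assumes the returned LINEAR16 audio carries a WAV header.
                return "wav";
            case "OGG_OPUS":
                return "ogg";
            default:
                throw new IllegalArgumentException("Unsupported encoding: " + encodingName);
        }
    }

    /** Writes the bytes to dir/baseName.ext, creating the directory if it is missing. */
    static Path write(byte[] audio, Path dir, String baseName, String encodingName)
            throws IOException {
        Files.createDirectories(dir);
        Path target = dir.resolve(baseName + "." + extensionFor(encodingName));
        return Files.write(target, audio);
    }

    public static void main(String[] args) throws IOException {
        byte[] placeholder = {0x49, 0x44, 0x33}; // stand-in bytes, not real audio
        Path dir = Files.createTempDirectory("tts-output");
        Path written = write(placeholder, dir, "output", "MP3");
        System.out.println("Audio content written to " + written.toAbsolutePath());
    }
}
```

In the demo app, the byte array would come from response.getAudioContent().toByteArray() and the encoding name from AudioEncoding.MP3.name(), so changing the encoding in one place also changes the file extension.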


Summary

In this post, you learned how to get started with Google Cloud's Text-to-Speech service using a Java/Spring Boot app.

Did you find this article useful? Do you have any questions or suggestions about it? Leave a comment and I will do my best to address your queries.


Published at DZone with permission of Ajitesh Kumar. See the original article here.
