Quickstart: Convert text to speech

In this quickstart, try out the text to speech model from Azure Speech in Foundry Tools, using Microsoft Foundry.

Prerequisites

An Azure subscription. Create one for free.
A Foundry project. If you need to create a project, see Create a Microsoft Foundry project.

Try text to speech

Try text to speech in the Foundry portal by following these steps:

Foundry (new) portal
Foundry (classic) portal

Select Build from the top right menu.
Select Models on the left pane.
The AI Services tab shows the Azure AI models that can be used out of the box in the Foundry portal. Select Azure Speech - Text to Speech to open the Text to Speech playground.
Choose a prebuilt voice from the dropdown menu, and optionally tune it with the provider parameter sliders.
Enter your sample text in the text box.
Select Play to hear the synthetic voice read your text.

Other Foundry (new) features

The following Speech features are available in the Foundry (new) portal:

Reference documentation | Package (NuGet) | Additional samples on GitHub With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create an AI Services resource for Speech in the Azure portal.

Get the Speech resource key and endpoint. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set up the environment

The Speech SDK is available as a NuGet package that implements .NET Standard 2.0. Install the Speech SDK later in this guide by using the console. For detailed installation instructions, see Install the Speech SDK.

Set environment variables

You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials. To set the environment variables for your Speech resource key and endpoint, open a console window, and follow the instructions for your operating system and development environment.

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with one of the endpoints for your resource.

Windows
Linux
macOS

setx SPEECH_KEY your-key
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

After you add the environment variables, you might need to restart any programs that need to read the environment variables, including the console window. For example, if you’re using Visual Studio as your editor, restart Visual Studio before you run the example.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

To set the environment variable for your Speech resource endpoint, follow the same steps. Set ENDPOINT to the endpoint of your resource. For example, https://YourServiceRegion.api.cognitive.microsoft.com.For more configuration options, see the Xcode documentation.

Create the application

Follow these steps to create a console application and install the Speech SDK.

Open a command prompt window in the folder where you want the new project. Run this command to create a console application with the .NET CLI.
```
dotnet new console
```
The command creates a Program.cs file in the project directory.
Install the Speech SDK in your new project with the .NET CLI.
```
dotnet add package Microsoft.CognitiveServices.Speech
```

Replace the contents of Program.cs with the following code.

using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
    
class Program 
{
    // This example requires environment variables named "SPEECH_KEY" and "ENDPOINT"
    static string speechKey = Environment.GetEnvironmentVariable("SPEECH_KEY");
    static string endpoint = Environment.GetEnvironmentVariable("ENDPOINT");

    static void OutputSpeechSynthesisResult(SpeechSynthesisResult speechSynthesisResult, string text)
    {
        switch (speechSynthesisResult.Reason)
        {
            case ResultReason.SynthesizingAudioCompleted:
                Console.WriteLine($"Speech synthesized for text: [{text}]");
                break;
            case ResultReason.Canceled:
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(speechSynthesisResult);
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                if (cancellation.Reason == CancellationReason.Error)
                {
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                    Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                    Console.WriteLine($"CANCELED: Did you set the speech resource key and endpoint values?");
                }
                break;
            default:
                break;
        }
    }

    async static Task Main(string[] args)
    {
        var speechConfig = SpeechConfig.FromEndpoint(new Uri(endpoint), speechKey); 

        // The neural multilingual voice can speak different languages based on the input text.
        speechConfig.SpeechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural"; 

        using (var speechSynthesizer = new SpeechSynthesizer(speechConfig))
        {
            // Get text from the console and synthesize to the default speaker.
            Console.WriteLine("Enter some text that you want to speak >");
            string text = Console.ReadLine();

            var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync(text);
            OutputSpeechSynthesisResult(speechSynthesisResult, text);
        }

        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}

To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural as the language, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
Run your new console application to start speech synthesis to the default speaker.
```
dotnet run
```

Make sure that you set the SPEECH_KEY and ENDPOINT environment variables. If you don’t set these variables, the sample fails with an error message.

Enter some text that you want to speak. For example, type I’m excited to try text to speech. Select the Enter key to hear the synthesized speech.
```
Enter some text that you want to speak >
I'm excited to try text to speech
```

Remarks

More speech synthesis options

This quickstart uses the SpeakTextAsync operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.

See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
See batch synthesis API for text to speech for information about synthesizing long-form text to speech.

OpenAI text to speech voices in Azure Speech in Foundry Tools

OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created. Reference documentation | Package (NuGet) | Additional samples on GitHub With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create an AI Services resource for Speech in the Azure portal.

Get the Speech resource key and endpoint. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set up the environment

The Speech SDK is available as a NuGet package that implements .NET Standard 2.0. Install the Speech SDK later in this guide. For detailed installation instructions, see Install the Speech SDK.

Set environment variables

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with one of the endpoints for your resource.

Windows
Linux
macOS

setx SPEECH_KEY your-key
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

Create the application

Follow these steps to create a console application and install the Speech SDK.

Create a C++ console project in Visual Studio Community named SpeechSynthesis.

Replace the contents of SpeechSynthesis.cpp with the following code:

#include <iostream> 
#include <stdlib.h>
#include <speechapi_cxx.h>

using namespace Microsoft::CognitiveServices::Speech;
using namespace Microsoft::CognitiveServices::Speech::Audio;

std::string GetEnvironmentVariable(const char* name);

int main()
{
    // This example requires environment variables named "SPEECH_KEY" and "ENDPOINT"
    auto speechKey = GetEnvironmentVariable("SPEECH_KEY");
    auto endpoint = GetEnvironmentVariable("ENDPOINT");

    if (std::string(speechKey).empty() || std::string(endpoint).empty()) {
        std::cout << "Please set both SPEECH_KEY and ENDPOINT environment variables." << std::endl;
        return -1;
    }

    auto speechConfig = SpeechConfig::FromEndpoint(endpoint, speechKey);

    // The neural multilingual voice can speak different languages based on the input text.
    speechConfig->SetSpeechSynthesisVoiceName("en-US-AriaNeural");

    auto speechSynthesizer = SpeechSynthesizer::FromConfig(speechConfig);

    // Get text from the console and synthesize to the default speaker.
    std::cout << "Enter some text that you want to speak >" << std::endl;
    std::string text;
    getline(std::cin, text);

    auto result = speechSynthesizer->SpeakTextAsync(text).get();

    // Checks result.
    if (result->Reason == ResultReason::SynthesizingAudioCompleted)
    {
        std::cout << "Speech synthesized to speaker for text [" << text << "]" << std::endl;
    }
    else if (result->Reason == ResultReason::Canceled)
    {
        auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
        std::cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;

        if (cancellation->Reason == CancellationReason::Error)
        {
            std::cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
            std::cout << "CANCELED: ErrorDetails=[" << cancellation->ErrorDetails << "]" << std::endl;
            std::cout << "CANCELED: Did you set the speech resource key and endpoint values?" << std::endl;
        }
    }

    std::cout << "Press enter to exit..." << std::endl;
    std::cin.get();
}

std::string GetEnvironmentVariable(const char* name)
{
#if defined(_MSC_VER)
    size_t requiredSize = 0;
    (void)getenv_s(&requiredSize, nullptr, 0, name);
    if (requiredSize == 0)
    {
        return "";
    }
    auto buffer = std::make_unique<char[]>(requiredSize);
    (void)getenv_s(&requiredSize, buffer.get(), requiredSize, name);
    return buffer.get();
#else
    auto value = getenv(name);
    return value ? value : "";
#endif
}  

Select Tools > Nuget Package Manager > Package Manager Console. In the Package Manager Console, run this command:
```
Install-Package Microsoft.CognitiveServices.Speech
```
To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
To start speech synthesis to the default speaker, Build and run your new console application.

Make sure that you set the SPEECH_KEY and ENDPOINT environment variables. If you don’t set these variables, the sample fails with an error message.

Enter some text that you want to speak. For example, type I’m excited to try text to speech. Select the Enter key to hear the synthesized speech.
```
Enter some text that you want to speak >
I'm excited to try text to speech
```

Remarks

More speech synthesis options

See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
See batch synthesis API for text to speech for information about synthesizing long-form text to speech.

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created. Reference documentation | Package (Go) | Additional samples on GitHub With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create a Foundry resource for Speech in the Azure portal.

Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set up the environment

Install the Speech SDK for the Go language. For detailed installation instructions, see Install the Speech SDK.

Set environment variables

You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials. To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.

Windows
Linux
macOS

setx SPEECH_KEY your-key
setx SPEECH_REGION your-region
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

To set the environment variable for your Speech resource region, follow the same steps. Set SPEECH_REGION to the region of your resource. For example, westus. Set ENDPOINT to the endpoint of your resourceFor more configuration options, see the Xcode documentation.

Create the application

Follow these steps to create a Go module.

Open a command prompt window in the folder where you want the new project. Create a new file named speech-synthesis.go.

Copy the following code into speech-synthesis.go:

package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
    "time"

    "github.com/Microsoft/cognitive-services-speech-sdk-go/audio"
    "github.com/Microsoft/cognitive-services-speech-sdk-go/common"
    "github.com/Microsoft/cognitive-services-speech-sdk-go/speech"
)

func synthesizeStartedHandler(event speech.SpeechSynthesisEventArgs) {
    defer event.Close()
    fmt.Println("Synthesis started.")
}

func synthesizingHandler(event speech.SpeechSynthesisEventArgs) {
    defer event.Close()
    fmt.Printf("Synthesizing, audio chunk size %d.\n", len(event.Result.AudioData))
}

func synthesizedHandler(event speech.SpeechSynthesisEventArgs) {
    defer event.Close()
    fmt.Printf("Synthesized, audio length %d.\n", len(event.Result.AudioData))
}

func cancelledHandler(event speech.SpeechSynthesisEventArgs) {
    defer event.Close()
    fmt.Println("Received a cancellation.")
}

func main() {
    // This example requires environment variables named "SPEECH_KEY" and "SPEECH_REGION"
    speechKey :=  os.Getenv("SPEECH_KEY")
    speechRegion := os.Getenv("SPEECH_REGION")

    audioConfig, err := audio.NewAudioConfigFromDefaultSpeakerOutput()
    if err != nil {
        fmt.Println("Got an error: ", err)
        return
    }
    defer audioConfig.Close()
    speechConfig, err := speech.NewSpeechConfigFromSubscription(speechKey, speechRegion)
    if err != nil {
        fmt.Println("Got an error: ", err)
        return
    }
    defer speechConfig.Close()

    speechConfig.SetSpeechSynthesisVoiceName("en-US-Ava:DragonHDLatestNeural")

    speechSynthesizer, err := speech.NewSpeechSynthesizerFromConfig(speechConfig, audioConfig)
    if err != nil {
        fmt.Println("Got an error: ", err)
        return
    }
    defer speechSynthesizer.Close()

    speechSynthesizer.SynthesisStarted(synthesizeStartedHandler)
    speechSynthesizer.Synthesizing(synthesizingHandler)
    speechSynthesizer.SynthesisCompleted(synthesizedHandler)
    speechSynthesizer.SynthesisCanceled(cancelledHandler)

    for {
        fmt.Printf("Enter some text that you want to speak, or enter empty text to exit.\n> ")
        text, _ := bufio.NewReader(os.Stdin).ReadString('\n')
        text = strings.TrimSuffix(text, "\n")
        if len(text) == 0 {
            break
        }

        task := speechSynthesizer.SpeakTextAsync(text)
        var outcome speech.SpeechSynthesisOutcome
        select {
        case outcome = <-task:
        case <-time.After(60 * time.Second):
            fmt.Println("Timed out")
            return
        }
        defer outcome.Close()
        if outcome.Error != nil {
            fmt.Println("Got an error: ", outcome.Error)
            return
        }

        if outcome.Result.Reason == common.SynthesizingAudioCompleted {
            fmt.Printf("Speech synthesized to speaker for text [%s].\n", text)
        } else {
            cancellation, _ := speech.NewCancellationDetailsFromSpeechSynthesisResult(outcome.Result)
            fmt.Printf("CANCELED: Reason=%d.\n", cancellation.Reason)

            if cancellation.Reason == common.Error {
                fmt.Printf("CANCELED: ErrorCode=%d\nCANCELED: ErrorDetails=[%s]\nCANCELED: Did you set the speech resource key and region values?\n",
                    cancellation.ErrorCode,
                    cancellation.ErrorDetails)
            }
        }
    }
}

To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
Run the following commands to create a go.mod file that links to components hosted on GitHub:
```
go mod init speech-synthesis
go get github.com/Microsoft/cognitive-services-speech-sdk-go
```

Make sure that you set the SPEECH_KEY and SPEECH_REGION environment variables. If you don’t set these variables, the sample fails with an error message.

Now build and run the code:
```
go build
go run speech-synthesis
```

Remarks

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created. Reference documentation | Additional samples on GitHub With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create an AI Services resource for Speech in the Azure portal.

Get the Speech resource key and endpoint. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set up the environment

To set up your environment, install the Speech SDK. The sample in this quickstart works with the Java Runtime.

Install Apache Maven. Then run mvn -v to confirm successful installation.

Create a pom.xml file in the root of your project, and copy the following code into it:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.microsoft.cognitiveservices.speech.samples</groupId>
    <artifactId>quickstart-eclipse</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <build>
        <sourceDirectory>src</sourceDirectory>
        <plugins>
        <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.7.0</version>
            <configuration>
            <source>1.8</source>
            <target>1.8</target>
            </configuration>
        </plugin>
        </plugins>
    </build>
    <dependencies>
        <dependency>
        <groupId>com.microsoft.cognitiveservices.speech</groupId>
        <artifactId>client-sdk</artifactId>
        <version>1.43.0</version>
        </dependency>
    </dependencies>
</project>

Install the Speech SDK and dependencies.
```
mvn clean dependency:copy-dependencies
```

Set environment variables

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with one of the endpoints for your resource.

Windows
Linux
macOS

setx SPEECH_KEY your-key
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

Create the application

Follow these steps to create a console application for speech recognition.

Create a file named SpeechSynthesis.java in the same project root directory.

Copy the following code into SpeechSynthesis.java:

import com.microsoft.cognitiveservices.speech.*;
import com.microsoft.cognitiveservices.speech.audio.*;

import java.net.URI;
import java.net.URISyntaxException;
import java.util.Scanner;
import java.util.concurrent.ExecutionException;
 
public class SpeechSynthesis {
    // This example requires environment variables named "SPEECH_KEY" and "ENDPOINT"
    private static String speechKey = System.getenv("SPEECH_KEY");
    private static String endpoint = System.getenv("ENDPOINT");
 
    public static void main(String[] args) throws InterruptedException, ExecutionException, URISyntaxException {
        SpeechConfig speechConfig = SpeechConfig.fromEndpoint(new URI(endpoint), speechKey);
        
        speechConfig.setSpeechSynthesisVoiceName("en-US-Ava:DragonHDLatestNeural"); 
 
        SpeechSynthesizer speechSynthesizer = new SpeechSynthesizer(speechConfig);
 
        // Get text from the console and synthesize to the default speaker.
        System.out.println("Enter some text that you want to speak >");
        String text = new Scanner(System.in).nextLine();
        if (text.isEmpty())
        {
            return;
        }
 
        SpeechSynthesisResult speechSynthesisResult = speechSynthesizer.SpeakTextAsync(text).get();
 
        if (speechSynthesisResult.getReason() == ResultReason.SynthesizingAudioCompleted) {
            System.out.println("Speech synthesized to speaker for text [" + text + "]");
        }
        else if (speechSynthesisResult.getReason() == ResultReason.Canceled) {
            SpeechSynthesisCancellationDetails cancellation = SpeechSynthesisCancellationDetails.fromResult(speechSynthesisResult);
            System.out.println("CANCELED: Reason=" + cancellation.getReason());
 
            if (cancellation.getReason() == CancellationReason.Error) {
                System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
                System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
                System.out.println("CANCELED: Did you set the speech resource key and endpoint values?");
            }
        }
 
        System.exit(0);
    }
}

To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.

Run your console application to output speech synthesis to the default speaker.

javac SpeechSynthesis.java -cp ".;target\dependency\*"
java -cp ".;target\dependency\*" SpeechSynthesis

Make sure that you set the SPEECH_KEY and ENDPOINT environment variables. If you don’t set these variables, the sample fails with an error message.

Enter some text that you want to speak. For example, type I’m excited to try text to speech. Select the Enter key to hear the synthesized speech.
```
Enter some text that you want to speak >
I'm excited to try text to speech
```

Remarks

More speech synthesis options

See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
See batch synthesis API for text to speech for information about synthesizing long-form text to speech.

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created. Reference documentation | Package (npm) | Additional samples on GitHub | Library source code With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create a Foundry resource for Speech in the Azure portal.

Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set up

Create a new folder synthesis-quickstart and go to the quickstart folder with the following command:
```
mkdir synthesis-quickstart && cd synthesis-quickstart
```
Create the package.json with the following command:
```
npm init -y
```
Install the Speech SDK for JavaScript with:
```
npm install microsoft-cognitiveservices-speech-sdk
```

Retrieve resource information

You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials. To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.

Windows
Linux
macOS

setx SPEECH_KEY your-key
setx SPEECH_REGION your-region
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

Synthesize speech to a file

To translate speech from a file:

Create a new file named synthesis.js with the following content:

import { createInterface } from "readline";
import { SpeechConfig, AudioConfig, SpeechSynthesizer, ResultReason } from "microsoft-cognitiveservices-speech-sdk";
function synthesizeSpeech() {
    const audioFile = "YourAudioFile.wav";
    // This example requires environment variables named "ENDPOINT" and "SPEECH_KEY"
    const speechConfig = SpeechConfig.fromEndpoint(new URL(ENDPOINT), process.env.SPEECH_KEY);
    const audioConfig = AudioConfig.fromAudioFileOutput(audioFile);
    // The language of the voice that speaks.
    speechConfig.speechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural";
    // Create the speech synthesizer.
    const synthesizer = new SpeechSynthesizer(speechConfig, audioConfig);
    const rl = createInterface({
        input: process.stdin,
        output: process.stdout
    });
    rl.question("Enter some text that you want to speak >\n> ", function (text) {
        rl.close();
        // Start the synthesizer and wait for a result.
        synthesizer.speakTextAsync(text, function (result) {
            if (result.reason === ResultReason.SynthesizingAudioCompleted) {
                console.log("synthesis finished.");
            }
            else {
                console.error("Speech synthesis canceled, " + result.errorDetails +
                    "\nDid you set the speech resource key and region values?");
            }
            synthesizer.close();
        }, function (err) {
            console.trace("err - " + err);
            synthesizer.close();
        });
        console.log("Now synthesizing to: " + audioFile);
    });
}
synthesizeSpeech();

In synthesis.js, optionally you can rename YourAudioFile.wav to another output file name. To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.

Run your console application to start speech synthesis to a file:
```
node synthesis.js
```

Output

You should see the following output in the console. Follow the prompt to enter text that you want to synthesize:

Enter some text that you want to speak >
> I'm excited to try text to speech
Now synthesizing to: YourAudioFile.wav
synthesis finished.

Remarks

More speech synthesis options

See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
See batch synthesis API for text to speech for information about synthesizing long-form text to speech.

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created. Reference documentation | Package (download) | Additional samples on GitHub With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create a Foundry resource for Speech in the Azure portal.

Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set up the environment

The Speech SDK for Swift is distributed as a framework bundle. The framework supports both Objective-C and Swift on both iOS and macOS. The Speech SDK can be used in Xcode projects as a CocoaPod, or downloaded directly and linked manually. This guide uses a CocoaPod. Install the CocoaPod dependency manager as described in its installation instructions.

Set environment variables

You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials. To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.

Windows
Linux
macOS

setx SPEECH_KEY your-key
setx SPEECH_REGION your-region
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

Create the application

Follow these steps to synthesize speech in a macOS application.

Clone the Azure-Samples/cognitive-services-speech-sdk repository to get the Synthesize audio in Swift on macOS using the Speech SDK sample project. The repository also has iOS samples.
Navigate to the directory of the downloaded sample app (helloworld) in a terminal.
Run the command pod install. This command generates a helloworld.xcworkspace Xcode workspace that contains both the sample app and the Speech SDK as a dependency.
Open the helloworld.xcworkspace workspace in Xcode.

Open the file named AppDelegate.swift and locate the applicationDidFinishLaunching and synthesize methods as shown here.

import Cocoa

@NSApplicationMain
class AppDelegate: NSObject, NSApplicationDelegate, NSTextFieldDelegate {
    var textField: NSTextField!
    var synthesisButton: NSButton!

    var inputText: String!

    var sub: String!
    var region: String!

    @IBOutlet weak var window: NSWindow!

    func applicationDidFinishLaunching(_ aNotification: Notification) {
        print("loading")
        // load subscription information
        sub = ProcessInfo.processInfo.environment["SPEECH_KEY"]
        region = ProcessInfo.processInfo.environment["SPEECH_REGION"]

        inputText = ""

        textField = NSTextField(frame: NSRect(x: 100, y: 200, width: 200, height: 50))
        textField.textColor = NSColor.black
        textField.lineBreakMode = .byWordWrapping

        textField.placeholderString = "Type something to synthesize."
        textField.delegate = self

        self.window.contentView?.addSubview(textField)

        synthesisButton = NSButton(frame: NSRect(x: 100, y: 100, width: 200, height: 30))
        synthesisButton.title = "Synthesize"
        synthesisButton.target = self
        synthesisButton.action = #selector(synthesisButtonClicked)
        self.window.contentView?.addSubview(synthesisButton)
    }

    @objc func synthesisButtonClicked() {
        DispatchQueue.global(qos: .userInitiated).async {
            self.synthesize()
        }
    }

    func synthesize() {
        var speechConfig: SPXSpeechConfiguration?
        do {
            try speechConfig = SPXSpeechConfiguration(subscription: sub, region: region)
        } catch {
            print("error \(error) happened")
            speechConfig = nil
        }

        speechConfig?.speechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural";

        let synthesizer = try! SPXSpeechSynthesizer(speechConfig!)
        let result = try! synthesizer.speakText(inputText)
        if result.reason == SPXResultReason.canceled
        {
            let cancellationDetails = try! SPXSpeechSynthesisCancellationDetails(fromCanceledSynthesisResult: result)
            print("cancelled, error code: \(cancellationDetails.errorCode) detail: \(cancellationDetails.errorDetails!) ")
            print("Did you set the speech resource key and region values?");
            return
        }
    }

    func controlTextDidChange(_ obj: Notification) {
        let textFiled = obj.object as! NSTextField
        inputText = textFiled.stringValue
    }
}

In AppDelegate.m, use the environment variables that you previously set for your Speech resource key and region.

sub = ProcessInfo.processInfo.environment["SPEECH_KEY"]
region = ProcessInfo.processInfo.environment["SPEECH_REGION"]

Optionally in AppDelegate.m, include a speech synthesis voice name as shown here:
```
speechConfig?.speechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural";
```
To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
To make the debug output visible, select View > Debug Area > Activate Console.
To build and run the example code, select Product > Run from the menu or select the Play button.

Make sure that you set the SPEECH_KEY and SPEECH_REGION environment variables. If you don’t set these variables, the sample fails with an error message.

After you input some text and select the button in the app, you should hear the synthesized audio played.

Remarks

More speech synthesis options

This quickstart uses the SpeakText operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.

See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
See batch synthesis API for text to speech for information about synthesizing long-form text to speech.

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created. Reference documentation | Package (PyPi) | Additional samples on GitHub With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create an AI Services resource for Speech in the Azure portal.

Get the Speech resource key and endpoint. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set up the environment

The Speech SDK for Python is available as a Python Package Index (PyPI) module. The Speech SDK for Python is compatible with Windows, Linux, and macOS.

On Windows, install the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017, 2019, and 2022 for your platform. Installing this package might require a restart.
On Linux, you must use the x64 target architecture.

Install a version of Python from 3.7 or later. For any requirements, see Install the Speech SDK.

Set environment variables

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with one of the endpoints for your resource.

Windows
Linux
macOS

setx SPEECH_KEY your-key
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

Create the application

Follow these steps to create a console application.

Open a command prompt window in the folder where you want the new project. Create a file named speech_synthesis.py.
Run this command to install the Speech SDK:
```
pip install azure-cognitiveservices-speech
```

Copy the following code into speech_synthesis.py:

import os
import azure.cognitiveservices.speech as speechsdk

# This example requires environment variables named "SPEECH_KEY" and "ENDPOINT"
# Replace with your own subscription key and endpoint, the endpoint is like : "https://YourServiceRegion.api.cognitive.microsoft.com"
speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), endpoint=os.environ.get('ENDPOINT'))
audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)

# The neural multilingual voice can speak different languages based on the input text.
speech_config.speech_synthesis_voice_name='en-US-Ava:DragonHDLatestNeural'

speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

# Get text from the console and synthesize to the default speaker.
print("Enter some text that you want to speak >")
text = input()

speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()

if speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized for text [{}]".format(text))
elif speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = speech_synthesis_result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        if cancellation_details.error_details:
            print("Error details: {}".format(cancellation_details.error_details))
            print("Did you set the speech resource key and endpoint values?")

To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
Run your new console application to start speech synthesis to the default speaker.
```
python speech_synthesis.py
```

Make sure that you set the SPEECH_KEY and ENDPOINT environment variables. If you don’t set these variables, the sample fails with an error message.

Enter some text that you want to speak. For example, type I’m excited to try text to speech. Select the Enter key to hear the synthesized speech.
```
Enter some text that you want to speak > 
I'm excited to try text to speech
```

Remarks

More speech synthesis options

This quickstart uses the speak_text_async operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.

See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
See batch synthesis API for text to speech for information about synthesizing long-form text to speech.

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created. Speech to text REST API reference | Speech to text REST API for short audio reference | Additional samples on GitHub With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create a Foundry resource for Speech in the Azure portal.

Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set environment variables

You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials. To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.

Windows
Linux
macOS

setx SPEECH_KEY your-key
setx SPEECH_REGION your-region
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

Synthesize speech to a file

At a command prompt, run the following cURL command. Optionally, you can rename output.mp3 to another output file name.

curl --location --request POST "https://%SPEECH_REGION%.tts.speech.microsoft.com/cognitiveservices/v1" ^
--header "Ocp-Apim-Subscription-Key: %SPEECH_KEY%" ^
--header "Content-Type: application/ssml+xml" ^
--header "X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3" ^
--header "User-Agent: curl" ^
--data-raw "<speak version='1.0' xml:lang='en-US'><voice xml:lang='en-US' xml:gender='Female' name='en-US-Ava:DragonHDLatestNeural'>my voice is my passport verify me</voice></speak>" --output output.mp3

curl --location --request POST "https://${SPEECH_REGION}.tts.speech.microsoft.com/cognitiveservices/v1" \
--header "Ocp-Apim-Subscription-Key: ${SPEECH_KEY}" \
--header 'Content-Type: application/ssml+xml' \
--header 'X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3' \
--header 'User-Agent: curl' \
--data-raw '<speak version='\''1.0'\'' xml:lang='\''en-US'\''>
    <voice xml:lang='\''en-US'\'' xml:gender='\''Female'\'' name='\''en-US-Ava:DragonHDLatestNeural'\''>
        my voice is my passport verify me
    </voice>
</speak>' > output.mp3

curl --location --request POST "https://${SPEECH_REGION}.tts.speech.microsoft.com/cognitiveservices/v1" \
--header "Ocp-Apim-Subscription-Key: ${SPEECH_KEY}" \
--header 'Content-Type: application/ssml+xml' \
--header 'X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3' \
--header 'User-Agent: curl' \
--data-raw '<speak version='\''1.0'\'' xml:lang='\''en-US'\''>
    <voice xml:lang='\''en-US'\'' xml:gender='\''Female'\'' name='\''en-US-Ava:DragonHDLatestNeural'\''>
        my voice is my passport verify me
    </voice>
</speak>' > output.mp3

Make sure that you set the SPEECH_KEY and SPEECH_REGION environment variables. If you don’t set these variables, the sample fails with an error message.

The provided text should be output to an audio file named output.mp3.To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice.All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.For more information, see Text to speech REST API.

Remarks

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created. ::: zone-end::: zone pivot=“programming-language-typescript” Reference documentation | Package (npm) | Additional samples on GitHub | Library source codeWith Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

You can try text to speech in the Speech Studio Voice Gallery without signing up or writing any code.

Try out the Azure Speech Toolkit to easily build and run samples on Visual Studio Code.

Prerequisites

An Azure subscription. You can create one for free.

Create a Foundry resource for Speech in the Azure portal.

Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.

Set up

Create a new folder synthesis-quickstart and go to the quickstart folder with the following command:
```
mkdir synthesis-quickstart && cd synthesis-quickstart
```
Create the package.json with the following command:
```
npm init -y
```
Update the package.json to ECMAScript with the following command:
```
npm pkg set type=module
```
Install the Speech SDK for JavaScript with:
```
npm install microsoft-cognitiveservices-speech-sdk
```
You need to install the Node.js type definitions to avoid TypeScript errors. Run the following command:
```
npm install --save-dev @types/node
```

Retrieve resource information

You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.

setx SPEECH_KEY your-key
setx SPEECH_REGION your-region
setx ENDPOINT your-endpoint

If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Bash

Edit your .bashrc file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bashrc from your console window to make the changes effective.

Bash

Edit your .bash_profile file, and add the environment variables:

export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint

After you add the environment variables, run source ~/.bash_profile from your console window to make the changes effective.

Xcode

For iOS and macOS development, you set the environment variables in Xcode. For example, follow these steps to set the environment variable in Xcode 13.4.1.

Select Product > Scheme > Edit scheme.
Select Arguments on the Run (Debug Run) page.
Under Environment Variables select the plus (+) sign to add a new environment variable.
Enter SPEECH_KEY for the Name and enter your Speech resource key for the Value.

Synthesize speech to a file

To translate speech from a file:

Create a new file named synthesis.ts with the following content:

import { createInterface } from "readline";
import { 
    SpeechConfig, 
    AudioConfig, 
    SpeechSynthesizer, 
    ResultReason,
    SpeechSynthesisResult 
} from "microsoft-cognitiveservices-speech-sdk";

function synthesizeSpeech(): void {
    const audioFile = "YourAudioFile.wav";
    // This example requires environment variables named "ENDPOINT" and "SPEECH_KEY"
    const speechConfig: SpeechConfig = SpeechConfig.fromEndpoint(new URL(process.env.ENDPOINT!), process.env.SPEECH_KEY!);
    const audioConfig: AudioConfig = AudioConfig.fromAudioFileOutput(audioFile);
    
    // The language of the voice that speaks.
    speechConfig.speechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural";
    
    // Create the speech synthesizer.
    const synthesizer: SpeechSynthesizer = new SpeechSynthesizer(speechConfig, audioConfig);
    
    const rl = createInterface({
        input: process.stdin,
        output: process.stdout
    });
    
    rl.question("Enter some text that you want to speak >\n> ", function (text: string) {
        rl.close();
        // Start the synthesizer and wait for a result.
        synthesizer.speakTextAsync(text,
            function (result: SpeechSynthesisResult) {
                if (result.reason === ResultReason.SynthesizingAudioCompleted) {
                    console.log("synthesis finished.");
                } else {
                    console.error("Speech synthesis canceled, " + result.errorDetails +
                        "\nDid you set the speech resource key and region values?");
                }
                synthesizer.close();
            },
            function (err: string) {
                console.trace("err - " + err);
                synthesizer.close();
            });
        console.log("Now synthesizing to: " + audioFile);
    });
}

synthesizeSpeech();

In synthesis.ts, optionally you can rename YourAudioFile.wav to another output file name. To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.

Create the tsconfig.json file to transpile the TypeScript code and copy the following code for ECMAScript.

{
    "compilerOptions": {
      "module": "NodeNext",
      "target": "ES2022", // Supports top-level await
      "moduleResolution": "NodeNext",
      "skipLibCheck": true, // Avoid type errors from node_modules
      "strict": true // Enable strict type-checking options
    },
    "include": ["*.ts"]
}

Transpile from TypeScript to JavaScript.
```
tsc
```
This command should produce no output if successful.
Run your console application to start speech synthesis to a file:
```
node synthesis.js
```

Output

You should see the following output in the console. Follow the prompt to enter text that you want to synthesize:

Enter some text that you want to speak >
> I'm excited to try text to speech
Now synthesizing to: YourAudioFile.wav
synthesis finished.

Remarks

More speech synthesis options

See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
See batch synthesis API for text to speech for information about synthesizing long-form text to speech.

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

What is Microsoft Foundry (new)?

Get started

Agent development

Agent tools & integration

Model capabilities

Fine-tuning

Manage agents, models, & tools

Observability, evaluation, & tracing

Developer experience

API & SDK

Responsible AI

Best practices

Setup & configure

Security & governance

Operate & support

​Quickstart: Convert text to speech

​Prerequisites

​Try text to speech

​Other Foundry (new) features

​Prerequisites

​Set up the environment

​Set environment variables

Bash

Bash

Xcode

​Create the application

​Remarks

​More speech synthesis options

​OpenAI text to speech voices in Azure Speech in Foundry Tools

​Clean up resources

​Prerequisites

​Set up the environment

​Set environment variables

Bash

Bash

Xcode

​Create the application

​Remarks

​More speech synthesis options

​OpenAI text to speech voices in Azure Speech in Foundry Tools

​Clean up resources

​Prerequisites

​Set up the environment

​Set environment variables

Bash

Bash

Xcode

​Create the application

​Remarks

​OpenAI text to speech voices in Azure Speech in Foundry Tools

​Clean up resources

​Prerequisites

​Set up the environment

​Set environment variables

Bash

Bash

Xcode

​Create the application

​Remarks

​More speech synthesis options

​OpenAI text to speech voices in Azure Speech in Foundry Tools

​Clean up resources

​Prerequisites

​Set up

​Retrieve resource information

Bash

Bash

Xcode

​Synthesize speech to a file

​Output

​Remarks

​More speech synthesis options

​OpenAI text to speech voices in Azure Speech in Foundry Tools

​Clean up resources

​Prerequisites

​Set up the environment

​Set environment variables

Bash

Bash

Xcode

Quickstart: Convert text to speech

Prerequisites

Try text to speech

Other Foundry (new) features

Prerequisites

Set up the environment

Set environment variables

Create the application

Remarks

More speech synthesis options

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Prerequisites

Set up the environment

Set environment variables

Create the application

Remarks

More speech synthesis options

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Prerequisites

Set up the environment

Set environment variables

Create the application

Remarks

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Prerequisites

Set up the environment

Set environment variables

Create the application

Remarks

More speech synthesis options

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Prerequisites

Set up

Retrieve resource information

Synthesize speech to a file

Output

Remarks

More speech synthesis options

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Prerequisites

Set up the environment

Set environment variables

Create the application

Remarks

More speech synthesis options

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Prerequisites

Set up the environment

Set environment variables

Create the application

Remarks

More speech synthesis options

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Prerequisites

Set environment variables

Synthesize speech to a file

Remarks

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Prerequisites

Set up

Retrieve resource information

Synthesize speech to a file

Output

Remarks

More speech synthesis options

OpenAI text to speech voices in Azure Speech in Foundry Tools

Clean up resources

Next step