Quickstart: Convert text to speech
In this quickstart, try out the text to speech model from Azure Speech in Foundry Tools, using Microsoft Foundry.

Prerequisites
- An Azure subscription. Create one for free.
- A Foundry project. If you need to create a project, see Create a Microsoft Foundry project.
Try text to speech
Try text to speech in the Foundry portal by following these steps:
- Sign in to Microsoft Foundry. Make sure the New Foundry toggle is on. These steps refer to Foundry (new).

- Select Build from the top right menu.
- Select Models on the left pane.
- The AI Services tab shows the Azure AI models that can be used out of the box in the Foundry portal. Select Azure Speech - Text to Speech to open the Text to Speech playground.
- Choose a prebuilt voice from the dropdown menu, and optionally tune it with the provider parameter sliders.
- Enter your sample text in the text box.
- Select Play to hear the synthetic voice read your text.
Other Foundry (new) features

Additional Speech features are available in the Foundry (new) portal.

Prerequisites
An Azure subscription. You can create one for free.
Create an AI Services resource for Speech in the Azure portal.
Get the Speech resource key and endpoint. After your Speech resource is deployed, select Go to resource to view and manage keys.
Set up the environment
The Speech SDK is available as a NuGet package that implements .NET Standard 2.0. Install the Speech SDK later in this guide by using the console. For detailed installation instructions, see Install the Speech SDK.

Set environment variables
You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.

To set the environment variables for your Speech resource key and endpoint, open a console window, and follow the instructions for your operating system and development environment.

- To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the ENDPOINT environment variable, replace your-endpoint with one of the endpoints for your resource.
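For example, you can persist the variables with setx on Windows, or export them from your shell profile on Linux and macOS (placeholder values shown):

```console
rem Windows: new console windows pick up values set with setx
setx SPEECH_KEY your-key
setx ENDPOINT your-endpoint

# Linux or macOS: add these lines to ~/.bashrc or ~/.zshrc to persist them
export SPEECH_KEY=your-key
export ENDPOINT=your-endpoint
```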
If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Create the application
Follow these steps to create a console application and install the Speech SDK.

- Open a command prompt window in the folder where you want the new project. Run this command to create a console application with the .NET CLI:
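Assuming the .NET SDK is installed:

```console
dotnet new console
```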
The command creates a Program.cs file in the project directory.
- Install the Speech SDK in your new project with the .NET CLI:
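The NuGet package ID is Microsoft.CognitiveServices.Speech:

```console
dotnet add package Microsoft.CognitiveServices.Speech
```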
- Replace the contents of Program.cs with the following code.
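The original listing isn't reproduced here, so the following is a minimal sketch that uses the Speech SDK for C#. It assumes the SPEECH_KEY and ENDPOINT environment variables that you set earlier and speaks console input through the default speaker:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class Program
{
    static async Task Main()
    {
        // Read the credentials from the environment variables that you set earlier.
        string speechKey = Environment.GetEnvironmentVariable("SPEECH_KEY");
        string endpoint = Environment.GetEnvironmentVariable("ENDPOINT");

        var speechConfig = SpeechConfig.FromEndpoint(new Uri(endpoint), speechKey);
        speechConfig.SpeechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural";

        // With no audio configuration, output goes to the default speaker.
        using var synthesizer = new SpeechSynthesizer(speechConfig);

        Console.WriteLine("Enter some text that you want to speak >");
        string text = Console.ReadLine();

        var result = await synthesizer.SpeakTextAsync(text);
        if (result.Reason == ResultReason.SynthesizingAudioCompleted)
        {
            Console.WriteLine($"Speech synthesized for text [{text}]");
        }
        else if (result.Reason == ResultReason.Canceled)
        {
            var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
            Console.WriteLine($"CANCELED: Reason={cancellation.Reason} ErrorDetails={cancellation.ErrorDetails}");
        }
    }
}
```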
- To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural as the voice, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
- Run your new console application to start speech synthesis to the default speaker:
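For example:

```console
dotnet run
```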
Make sure that you set the SPEECH_KEY and ENDPOINT environment variables. If you don’t set these variables, the sample fails with an error message.

- Enter some text that you want to speak. For example, type I’m excited to try text to speech. Select the Enter key to hear the synthesized speech.
Remarks
More speech synthesis options
This quickstart uses the SpeakTextAsync operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.
- See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
- See batch synthesis API for text to speech for information about synthesizing long-form text to speech.
OpenAI text to speech voices in Azure Speech in Foundry Tools
OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.
Clean up resources
You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

Reference documentation | Package (NuGet) | Additional samples on GitHub

With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

Prerequisites
An Azure subscription. You can create one for free.
Create an AI Services resource for Speech in the Azure portal.
Get the Speech resource key and endpoint. After your Speech resource is deployed, select Go to resource to view and manage keys.
Set up the environment
The Speech SDK is available as a NuGet package that implements .NET Standard 2.0. Install the Speech SDK later in this guide. For detailed installation instructions, see Install the Speech SDK.

Set environment variables
You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.

To set the environment variables for your Speech resource key and endpoint, open a console window, and follow the instructions for your operating system and development environment.

- To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the ENDPOINT environment variable, replace your-endpoint with one of the endpoints for your resource.
If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Create the application
Follow these steps to create a console application and install the Speech SDK.

- Create a C++ console project in Visual Studio Community named SpeechSynthesis.
- Replace the contents of SpeechSynthesis.cpp with the following code:
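The original listing isn't reproduced here, so the following is a minimal sketch that uses the Speech SDK for C++. It assumes the SPEECH_KEY and ENDPOINT environment variables that you set earlier:

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

#include <speechapi_cxx.h>

using namespace Microsoft::CognitiveServices::Speech;

int main()
{
    // Read the credentials from the environment variables that you set earlier.
    // (For brevity, this sketch assumes both variables are set.)
    std::string speechKey = std::getenv("SPEECH_KEY");
    std::string endpoint = std::getenv("ENDPOINT");

    auto speechConfig = SpeechConfig::FromEndpoint(endpoint, speechKey);
    speechConfig->SetSpeechSynthesisVoiceName("en-US-Ava:DragonHDLatestNeural");

    // With no audio configuration, output goes to the default speaker.
    auto synthesizer = SpeechSynthesizer::FromConfig(speechConfig);

    std::cout << "Enter some text that you want to speak >" << std::endl;
    std::string text;
    std::getline(std::cin, text);

    auto result = synthesizer->SpeakTextAsync(text).get();
    if (result->Reason == ResultReason::SynthesizingAudioCompleted)
    {
        std::cout << "Speech synthesized for text [" << text << "]" << std::endl;
    }
    else if (result->Reason == ResultReason::Canceled)
    {
        auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
        std::cout << "CANCELED: " << cancellation->ErrorDetails << std::endl;
    }
}
```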
- Select Tools > NuGet Package Manager > Package Manager Console. In the Package Manager Console, run this command:
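The Package Manager command for the Speech SDK package is:

```console
Install-Package Microsoft.CognitiveServices.Speech
```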
- To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
- To start speech synthesis to the default speaker, build and run your new console application.
Make sure that you set the SPEECH_KEY and ENDPOINT environment variables. If you don’t set these variables, the sample fails with an error message.

- Enter some text that you want to speak. For example, type I’m excited to try text to speech. Select the Enter key to hear the synthesized speech.
Remarks
More speech synthesis options
This quickstart uses the SpeakTextAsync operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.
- See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
- See batch synthesis API for text to speech for information about synthesizing long-form text to speech.
OpenAI text to speech voices in Azure Speech in Foundry Tools
OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.
Clean up resources
You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

Reference documentation | Package (Go) | Additional samples on GitHub

With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

Prerequisites
An Azure subscription. You can create one for free.
Create a Foundry resource for Speech in the Azure portal.
Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.
Set up the environment
Install the Speech SDK for the Go language. For detailed installation instructions, see Install the Speech SDK.

Set environment variables
You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.

To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

- To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
- To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.
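For example, with setx on Windows or export on Linux and macOS (placeholder values shown):

```console
rem Windows
setx SPEECH_KEY your-key
setx SPEECH_REGION your-region
setx ENDPOINT your-endpoint

# Linux or macOS
export SPEECH_KEY=your-key
export SPEECH_REGION=your-region
export ENDPOINT=your-endpoint
```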
If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Create the application
Follow these steps to create a Go module.

- Open a command prompt window in the folder where you want the new project. Create a new file named speech-synthesis.go.
- Copy the following code into speech-synthesis.go:
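The original listing isn't reproduced here, so the following is a minimal sketch in the style of the Speech SDK for Go samples. It assumes the SPEECH_KEY and SPEECH_REGION environment variables that you set earlier:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"

	"github.com/Microsoft/cognitive-services-speech-sdk-go/audio"
	"github.com/Microsoft/cognitive-services-speech-sdk-go/common"
	"github.com/Microsoft/cognitive-services-speech-sdk-go/speech"
)

func main() {
	// Read the credentials from the environment variables that you set earlier.
	speechKey := os.Getenv("SPEECH_KEY")
	speechRegion := os.Getenv("SPEECH_REGION")

	// Send the synthesized audio to the default speaker.
	audioConfig, err := audio.NewAudioConfigFromDefaultSpeakerOutput()
	if err != nil {
		fmt.Println("Got an error: ", err)
		return
	}
	defer audioConfig.Close()

	speechConfig, err := speech.NewSpeechConfigFromSubscription(speechKey, speechRegion)
	if err != nil {
		fmt.Println("Got an error: ", err)
		return
	}
	defer speechConfig.Close()
	speechConfig.SetSpeechSynthesisVoiceName("en-US-Ava:DragonHDLatestNeural")

	synthesizer, err := speech.NewSpeechSynthesizerFromConfig(speechConfig, audioConfig)
	if err != nil {
		fmt.Println("Got an error: ", err)
		return
	}
	defer synthesizer.Close()

	fmt.Println("Enter some text that you want to speak >")
	text, _ := bufio.NewReader(os.Stdin).ReadString('\n')
	text = strings.TrimSpace(text)

	// SpeakTextAsync returns a channel that delivers the synthesis outcome.
	outcome := <-synthesizer.SpeakTextAsync(text)
	defer outcome.Close()
	if outcome.Error != nil {
		fmt.Println("Got an error: ", outcome.Error)
		return
	}
	if outcome.Result.Reason == common.SynthesizingAudioCompleted {
		fmt.Printf("Speech synthesized for text [%s]\n", text)
	}
}
```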
- To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
- Run the following commands to create a go.mod file that links to components hosted on GitHub:
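For example (speech-synthesis is an arbitrary module name):

```console
go mod init speech-synthesis
go get github.com/Microsoft/cognitive-services-speech-sdk-go
```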
Make sure that you set the SPEECH_KEY and SPEECH_REGION environment variables. If you don’t set these variables, the sample fails with an error message.

- Now build and run the code:
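For example:

```console
go build
go run speech-synthesis.go
```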
Remarks
OpenAI text to speech voices in Azure Speech in Foundry Tools
OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.
Clean up resources
You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

Reference documentation | Additional samples on GitHub

With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

Prerequisites
An Azure subscription. You can create one for free.
Create an AI Services resource for Speech in the Azure portal.
Get the Speech resource key and endpoint. After your Speech resource is deployed, select Go to resource to view and manage keys.
Set up the environment
To set up your environment, install the Speech SDK. The sample in this quickstart works with the Java Runtime.

- Install Apache Maven. Then run mvn -v to confirm successful installation.
- Create a pom.xml file in the root of your project, and copy the following code into it:
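A minimal pom.xml can look like the following. The groupId, artifactId, and version for your own project are arbitrary examples, and the client-sdk version shown is an assumption; use the latest available release:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example.speech</groupId>
  <artifactId>speech-synthesis-quickstart</artifactId>
  <version>1.0.0</version>
  <dependencies>
    <dependency>
      <groupId>com.microsoft.cognitiveservices.speech</groupId>
      <artifactId>client-sdk</artifactId>
      <!-- An assumed version; check for the latest release. -->
      <version>1.40.0</version>
    </dependency>
  </dependencies>
</project>
```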
- Install the Speech SDK and dependencies:
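One way to do this with Maven is to copy the SDK and its transitive dependencies into target/dependency:

```console
mvn clean dependency:copy-dependencies
```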
Set environment variables
You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.

To set the environment variables for your Speech resource key and endpoint, open a console window, and follow the instructions for your operating system and development environment.

- To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the ENDPOINT environment variable, replace your-endpoint with one of the endpoints for your resource.
If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Create the application
Follow these steps to create a console application for speech synthesis.

- Create a file named SpeechSynthesis.java in the same project root directory.
- Copy the following code into SpeechSynthesis.java:
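The original listing isn't reproduced here, so the following is a minimal sketch that uses the Speech SDK for Java. It assumes the SPEECH_KEY and ENDPOINT environment variables that you set earlier:

```java
import com.microsoft.cognitiveservices.speech.*;

import java.net.URI;
import java.util.Scanner;
import java.util.concurrent.ExecutionException;

public class SpeechSynthesis {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        // Read the credentials from the environment variables that you set earlier.
        String speechKey = System.getenv("SPEECH_KEY");
        String endpoint = System.getenv("ENDPOINT");

        SpeechConfig speechConfig = SpeechConfig.fromEndpoint(URI.create(endpoint), speechKey);
        speechConfig.setSpeechSynthesisVoiceName("en-US-Ava:DragonHDLatestNeural");

        // With no audio configuration, output goes to the default speaker.
        SpeechSynthesizer synthesizer = new SpeechSynthesizer(speechConfig);

        System.out.println("Enter some text that you want to speak >");
        String text = new Scanner(System.in).nextLine();

        SpeechSynthesisResult result = synthesizer.SpeakTextAsync(text).get();
        if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
            System.out.println("Speech synthesized for text [" + text + "]");
        } else if (result.getReason() == ResultReason.Canceled) {
            SpeechSynthesisCancellationDetails cancellation =
                    SpeechSynthesisCancellationDetails.fromResult(result);
            System.out.println("CANCELED: " + cancellation.getErrorDetails());
        }

        synthesizer.close();
        System.exit(0);
    }
}
```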
- To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
- Run your console application to output speech synthesis to the default speaker:
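For example, compile and run with the copied dependencies on the classpath. The ; path separator shown is for Windows; use : on Linux and macOS:

```console
javac -cp ".;target/dependency/*" SpeechSynthesis.java
java -cp ".;target/dependency/*" SpeechSynthesis
```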
Make sure that you set the SPEECH_KEY and ENDPOINT environment variables. If you don’t set these variables, the sample fails with an error message.

- Enter some text that you want to speak. For example, type I’m excited to try text to speech. Select the Enter key to hear the synthesized speech.
Remarks
More speech synthesis options
This quickstart uses the SpeakTextAsync operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.
- See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
- See batch synthesis API for text to speech for information about synthesizing long-form text to speech.
OpenAI text to speech voices in Azure Speech in Foundry Tools
OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.
Clean up resources
You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

Reference documentation | Package (npm) | Additional samples on GitHub | Library source code

With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

Prerequisites
An Azure subscription. You can create one for free.
Create a Foundry resource for Speech in the Azure portal.
Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.
Set up
- Create a new folder synthesis-quickstart and go to the quickstart folder with the following command:
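For example:

```console
mkdir synthesis-quickstart
cd synthesis-quickstart
```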
- Create the package.json with the following command:
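Accept the defaults with:

```console
npm init -y
```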
- Install the Speech SDK for JavaScript with:
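```console
npm install microsoft-cognitiveservices-speech-sdk
```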
Retrieve resource information
You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.

To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

- To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
- To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.
If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Synthesize speech to a file
To synthesize speech to a file:

- Create a new file named synthesis.js with the following content:
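The original listing isn't reproduced here, so the following is a minimal sketch that uses the Speech SDK for JavaScript. It assumes the SPEECH_KEY and SPEECH_REGION environment variables that you set earlier and writes the audio to YourAudioFile.wav:

```javascript
const sdk = require("microsoft-cognitiveservices-speech-sdk");
const readline = require("readline");

// Read the credentials from the environment variables that you set earlier.
const speechConfig = sdk.SpeechConfig.fromSubscription(
  process.env.SPEECH_KEY,
  process.env.SPEECH_REGION
);
speechConfig.speechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural";

// Write the synthesized audio to a file.
const audioConfig = sdk.AudioConfig.fromAudioFileOutput("YourAudioFile.wav");
const synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
rl.question("Enter some text that you want to speak > ", (text) => {
  rl.close();
  synthesizer.speakTextAsync(
    text,
    (result) => {
      if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
        console.log(`Speech synthesized for text [${text}]`);
      } else {
        console.error("Speech synthesis canceled: " + result.errorDetails);
      }
      synthesizer.close();
    },
    (error) => {
      console.error("Error: " + error);
      synthesizer.close();
    }
  );
});
```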
In synthesis.js, optionally you can rename YourAudioFile.wav to another output file name. To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
- Run your console application to start speech synthesis to a file:
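For example:

```console
node synthesis.js
```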
Output

Follow the prompt in the console to enter the text that you want to synthesize. The synthesized audio is written to the output file.

Remarks
More speech synthesis options
This quickstart uses the SpeakTextAsync operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.
- See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
- See batch synthesis API for text to speech for information about synthesizing long-form text to speech.
OpenAI text to speech voices in Azure Speech in Foundry Tools
OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.
Clean up resources
You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

Reference documentation | Package (download) | Additional samples on GitHub

With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

Prerequisites
An Azure subscription. You can create one for free.
Create a Foundry resource for Speech in the Azure portal.
Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.
Set up the environment
The Speech SDK for Swift is distributed as a framework bundle. The framework supports both Objective-C and Swift on both iOS and macOS. The Speech SDK can be used in Xcode projects as a CocoaPod, or downloaded directly and linked manually. This guide uses a CocoaPod. Install the CocoaPod dependency manager as described in its installation instructions.

Set environment variables
You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.

To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

- To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
- To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.
If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Create the application
Follow these steps to synthesize speech in a macOS application.

- Clone the Azure-Samples/cognitive-services-speech-sdk repository to get the Synthesize audio in Swift on macOS using the Speech SDK sample project. The repository also has iOS samples.
- Navigate to the directory of the downloaded sample app (helloworld) in a terminal.
- Run the command pod install. This command generates a helloworld.xcworkspace Xcode workspace that contains both the sample app and the Speech SDK as a dependency.
- Open the helloworld.xcworkspace workspace in Xcode.
- Open the file named AppDelegate.swift and locate the applicationDidFinishLaunching and synthesize methods as shown here.
- In AppDelegate.swift, use the environment variables that you previously set for your Speech resource key and region.
- Optionally in AppDelegate.swift, include a speech synthesis voice name as shown here:
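The sample's code isn't reproduced here, so the following is a hypothetical sketch of the synthesize method with a voice name included. The inputText constant stands in for the text field that the sample app actually uses:

```swift
func synthesize() {
    // Read the credentials from the environment variables that you set earlier.
    let speechKey = ProcessInfo.processInfo.environment["SPEECH_KEY"] ?? ""
    let speechRegion = ProcessInfo.processInfo.environment["SPEECH_REGION"] ?? ""
    let inputText = "I'm excited to try text to speech"

    let speechConfig = try! SPXSpeechConfiguration(subscription: speechKey, region: speechRegion)
    // Optionally set a speech synthesis voice name.
    speechConfig.speechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural"

    let synthesizer = try! SPXSpeechSynthesizer(speechConfig)
    let result = try! synthesizer.speakText(inputText)
    if result.reason == SPXResultReason.canceled {
        print("Speech synthesis canceled.")
    }
}
```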
- To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
- To make the debug output visible, select View > Debug Area > Activate Console.
- To build and run the example code, select Product > Run from the menu or select the Play button.
Make sure that you set the SPEECH_KEY and SPEECH_REGION environment variables. If you don’t set these variables, the sample fails with an error message.

Remarks
More speech synthesis options
This quickstart uses the SpeakText operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.
- See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
- See batch synthesis API for text to speech for information about synthesizing long-form text to speech.
OpenAI text to speech voices in Azure Speech in Foundry Tools
OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.
Clean up resources
You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

Reference documentation | Package (PyPi) | Additional samples on GitHub

With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

Prerequisites
An Azure subscription. You can create one for free.
Create an AI Services resource for Speech in the Azure portal.
Get the Speech resource key and endpoint. After your Speech resource is deployed, select Go to resource to view and manage keys.
Set up the environment
The Speech SDK for Python is available as a Python Package Index (PyPI) module and is compatible with Windows, Linux, and macOS.

- On Windows, install the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017, 2019, and 2022 for your platform. Installing this package might require a restart.
- On Linux, you must use the x64 target architecture.
Set environment variables
You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.

To set the environment variables for your Speech resource key and endpoint, open a console window, and follow the instructions for your operating system and development environment.

- To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the ENDPOINT environment variable, replace your-endpoint with one of the endpoints for your resource.
If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Create the application
Follow these steps to create a console application.

- Open a command prompt window in the folder where you want the new project. Create a file named speech_synthesis.py.
- Run this command to install the Speech SDK:
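The package name on PyPI is azure-cognitiveservices-speech:

```console
pip install azure-cognitiveservices-speech
```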
- Copy the following code into speech_synthesis.py:
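The original listing isn't reproduced here, so the following is a minimal sketch that uses the Speech SDK for Python. It assumes the SPEECH_KEY and ENDPOINT environment variables that you set earlier:

```python
import os

import azure.cognitiveservices.speech as speechsdk

# Read the credentials from the environment variables that you set earlier.
speech_config = speechsdk.SpeechConfig(
    subscription=os.environ.get("SPEECH_KEY"),
    endpoint=os.environ.get("ENDPOINT"),
)
speech_config.speech_synthesis_voice_name = "en-US-Ava:DragonHDLatestNeural"

# Send the synthesized audio to the default speaker.
audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

print("Enter some text that you want to speak >")
text = input()

result = synthesizer.speak_text_async(text).get()
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print(f"Speech synthesized for text [{text}]")
elif result.reason == speechsdk.ResultReason.Canceled:
    details = result.cancellation_details
    print(f"Speech synthesis canceled: {details.reason}")
    if details.reason == speechsdk.CancellationReason.Error:
        print(f"Error details: {details.error_details}")
```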
- To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
- Run your new console application to start speech synthesis to the default speaker:
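For example:

```console
python speech_synthesis.py
```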
Make sure that you set the SPEECH_KEY and ENDPOINT environment variables. If you don’t set these variables, the sample fails with an error message.

- Enter some text that you want to speak. For example, type I’m excited to try text to speech. Select the Enter key to hear the synthesized speech.
Remarks
More speech synthesis options
This quickstart uses the speak_text_async operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.
- See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
- See batch synthesis API for text to speech for information about synthesizing long-form text to speech.
OpenAI text to speech voices in Azure Speech in Foundry Tools
OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.
Clean up resources
You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

Speech to text REST API reference | Speech to text REST API for short audio reference | Additional samples on GitHub

With Azure Speech in Foundry Tools, you can run an application that synthesizes a human-like voice to read text. You can change the voice, enter text to be spoken, and listen to the output on your computer’s speaker.

Prerequisites
An Azure subscription. You can create one for free.
Create a Foundry resource for Speech in the Azure portal.
Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys.
Set environment variables
You need to authenticate your application to access Foundry Tools. This article shows you how to use environment variables to store your credentials. You can then access the environment variables from your code to authenticate your application. For production, use a more secure way to store and access your credentials.

To set the environment variables for your Speech resource key and region, open a console window, and follow the instructions for your operating system and development environment.

- To set the SPEECH_KEY environment variable, replace your-key with one of the keys for your resource.
- To set the SPEECH_REGION environment variable, replace your-region with one of the regions for your resource.
- To set the ENDPOINT environment variable, replace your-endpoint with the actual endpoint of your Speech resource.
If you only need to access the environment variables in the current console, you can set the environment variable with set instead of setx.

Synthesize speech to a file
At a command prompt, run the following cURL command. Optionally, you can rename output.mp3 to another output file name.
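A sketch of the command for Linux or macOS follows. It assumes the SPEECH_KEY and SPEECH_REGION environment variables that you set earlier; on Windows, use %SPEECH_KEY%-style variable syntax instead:

```console
curl --location --request POST "https://${SPEECH_REGION}.tts.speech.microsoft.com/cognitiveservices/v1" \
--header "Ocp-Apim-Subscription-Key: ${SPEECH_KEY}" \
--header "Content-Type: application/ssml+xml" \
--header "X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3" \
--data-raw "<speak version='1.0' xml:lang='en-US'>
  <voice name='en-US-Ava:DragonHDLatestNeural'>
    I'm excited to try text to speech
  </voice>
</speak>" \
--output output.mp3
```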
Synthesize speech to a file
To synthesize speech to a file:

- Create a new file named synthesis.ts with the following content:
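The original listing isn't reproduced here, so the following is a minimal sketch that uses the Speech SDK for JavaScript from TypeScript (it also assumes @types/node for the process and readline typings). It reads the SPEECH_KEY and SPEECH_REGION environment variables that you set earlier and writes the audio to YourAudioFile.wav:

```typescript
import * as sdk from "microsoft-cognitiveservices-speech-sdk";
import * as readline from "readline";

// Read the credentials from the environment variables that you set earlier.
const speechConfig = sdk.SpeechConfig.fromSubscription(
  process.env.SPEECH_KEY!,
  process.env.SPEECH_REGION!
);
speechConfig.speechSynthesisVoiceName = "en-US-Ava:DragonHDLatestNeural";

// Write the synthesized audio to a file.
const audioConfig = sdk.AudioConfig.fromAudioFileOutput("YourAudioFile.wav");
const synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
rl.question("Enter some text that you want to speak > ", (text: string) => {
  rl.close();
  synthesizer.speakTextAsync(
    text,
    (result) => {
      if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
        console.log(`Speech synthesized for text [${text}]`);
      } else {
        console.error("Speech synthesis canceled: " + result.errorDetails);
      }
      synthesizer.close();
    },
    (error) => {
      console.error("Error: " + error);
      synthesizer.close();
    }
  );
});
```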
In synthesis.ts, optionally you can rename YourAudioFile.wav to another output file name. To change the speech synthesis language, replace en-US-Ava:DragonHDLatestNeural with another supported voice. All neural voices are multilingual and fluent in their own language and English. For example, if the input text in English is I’m excited to try text to speech and you set es-ES-Ximena:DragonHDLatestNeural, the text is spoken in English with a Spanish accent. If the voice doesn’t speak the language of the input text, the Speech service doesn’t output synthesized audio.
- Create the tsconfig.json file to transpile the TypeScript code, and copy the following code for ECMAScript:
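For example (these compiler settings are an assumption; adjust them for your project):

```json
{
  "compilerOptions": {
    "module": "commonjs",
    "target": "es2017",
    "esModuleInterop": true,
    "strict": true
  }
}
```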
- Transpile from TypeScript to JavaScript:
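For example, using the TypeScript compiler:

```console
tsc
```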
This command should produce no output if successful.
- Run your console application to start speech synthesis to a file:
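For example:

```console
node synthesis.js
```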
Output

Follow the prompt in the console to enter the text that you want to synthesize. The synthesized audio is written to the output file.

Remarks
More speech synthesis options
This quickstart uses the SpeakTextAsync operation to synthesize a short block of text that you enter. You can also use long-form text from a file and get finer control over voice styles, prosody, and other settings.
- See how to synthesize speech and Speech Synthesis Markup Language (SSML) overview for information about speech synthesis from a file and finer control over voice styles, prosody, and other settings.
- See batch synthesis API for text to speech for information about synthesizing long-form text to speech.
OpenAI text to speech voices in Azure Speech in Foundry Tools
OpenAI text to speech voices are also supported. See OpenAI text to speech voices in Azure Speech and multilingual voices. You can replace en-US-Ava:DragonHDLatestNeural with a supported OpenAI voice name such as en-US-FableMultilingualNeural.
