
Quickstart: Generate a video with Sora (preview)

This article refers to the Microsoft Foundry (new) portal.
In this quickstart, you generate video clips using the Azure OpenAI service. The example uses the Sora model, a video generation model that creates realistic and imaginative video scenes from text instructions, optionally combined with image or video inputs. This guide shows you how to create a video generation job, poll for its status, and retrieve the generated video. For more information on video generation, see Video generation concepts.

Prerequisites

Limitations and quotas

Sora video generation is currently in preview. Keep the following limitations in mind; a pre-flight check that mirrors them follows this list:
  • Region availability: Sora is available only in East US 2 and Sweden Central (Global Standard deployments).
  • Supported resolutions: 480×480, 720×720, 1080×1080, 1280×720, 1920×1080 (width × height).
  • Video duration: 5 to 20 seconds (n_seconds parameter).
  • Variants: Generate 1 to 4 video variants per request (n_variants parameter).
  • Rate limits: Subject to your deployment’s tokens-per-minute (TPM) quota. See Quotas and limits for details.
  • Content filtering: Prompts are subject to content moderation. Requests with harmful content are rejected.
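
The following Python sketch shows one way to do that pre-flight check before you submit a job. The function and constant names are illustrative (they aren't part of the API); the values mirror the list above.
SUPPORTED_RESOLUTIONS = {(480, 480), (720, 720), (1080, 1080), (1280, 720), (1920, 1080)}

def validate_video_request(width: int, height: int, n_seconds: int, n_variants: int) -> None:
    """Raise ValueError if a request would exceed the preview limits listed above."""
    if (width, height) not in SUPPORTED_RESOLUTIONS:
        raise ValueError(f"Unsupported resolution: {width}x{height}")
    if not 5 <= n_seconds <= 20:
        raise ValueError("n_seconds must be between 5 and 20.")
    if not 1 <= n_variants <= 4:
        raise ValueError("n_variants must be between 1 and 4.")

# Example: this call passes silently because the parameters are within the limits.
validate_video_request(width=480, height=480, n_seconds=5, n_variants=1)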

Microsoft Entra ID prerequisites

For the recommended keyless authentication with Microsoft Entra ID, you need to:
  • Install the Azure CLI used for keyless authentication with Microsoft Entra ID.
  • Assign the Cognitive Services User role to your user account. You can assign roles in the Azure portal under Access control (IAM) > Add role assignment.

Set up

  1. Create a new folder video-generation-quickstart and go to the quickstart folder with the following command:
    mkdir video-generation-quickstart && cd video-generation-quickstart
    
  2. Create a virtual environment. If you already have Python 3.10 or higher installed, you can create a virtual environment using the following commands:
    Windows
    py -3 -m venv .venv
    .venv\scripts\activate
    
    Linux
    python3 -m venv .venv
    source .venv/bin/activate
    
    macOS
    python3 -m venv .venv
    source .venv/bin/activate
    
    Activating the Python environment means that when you run python or pip from the command line, you use the Python interpreter contained in the .venv folder of your application. You can use the deactivate command to exit the virtual environment and reactivate it later when needed.
We recommend that you create and activate a new Python environment to install the packages you need for this tutorial. Don't install packages into your global Python installation; always use a virtual or conda environment when installing Python packages, otherwise you can break your global installation of Python.
  3. Install the required packages.
    Microsoft Entra ID
    pip install requests azure-identity
    
    The azure-identity package provides DefaultAzureCredential for secure, keyless authentication.
    API key
    pip install requests
    
    The requests library handles HTTP calls to the REST API.
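
Before you continue, you can optionally confirm that the virtual environment is active and the packages are importable. This quick check is a sketch; the module names match the packages installed above.
import sys

# The interpreter path should point into the .venv folder if the environment is active.
print(sys.executable)

# azure.identity is only needed for the keyless (Microsoft Entra ID) option.
for module in ("requests", "azure.identity"):
    try:
        __import__(module)
        print(f"{module}: found")
    except ImportError:
        print(f"{module}: missing")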

Retrieve resource information

You need to retrieve the following information to authenticate your application with your Azure OpenAI resource:
Variable name | Value
AZURE_OPENAI_ENDPOINT | This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
AZURE_OPENAI_DEPLOYMENT_NAME | This value corresponds to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
Learn more about keyless authentication and setting environment variables.
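
If you haven't exported these variables in your shell, one option while experimenting is to set placeholder values for the current process at the top of your script. This sketch uses placeholders that you replace with your own values; don't hard-code real keys or endpoints in shared code.
import os

# For local experimentation only; in production, set real environment variables instead.
os.environ.setdefault("AZURE_OPENAI_ENDPOINT", "https://<your-resource-name>.openai.azure.com/")
os.environ.setdefault("AZURE_OPENAI_DEPLOYMENT_NAME", "<your-sora-deployment-name>")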

Generate video with Sora

You can generate a video with the Sora model by creating a video generation job, polling for its status, and retrieving the generated video. The following code shows how to do this via the REST API using Python.

Choose your input type

Sora supports three input modes:
Input type | Best for | Example use case
Text prompt only | Creating entirely new scenes from descriptions | "A cat playing piano in a jazz bar"
Image + text prompt | Animating a still image or using it as a starting frame | Bring a product photo to life
Video + text prompt | Extending or modifying existing video footage | Add visual effects to existing clips

Set up authentication

  1. Create the sora-quickstart.py file and add the following code to authenticate your resource:
    Microsoft Entra ID
    import json
    import requests
    import time
    import os
    from azure.identity import DefaultAzureCredential
    
    # Set environment variables or edit the corresponding values here.
    endpoint = os.environ.get('AZURE_OPENAI_ENDPOINT')
    deployment_name = os.environ.get('AZURE_OPENAI_DEPLOYMENT_NAME')
    if not endpoint or not deployment_name:
        raise ValueError("Set AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_DEPLOYMENT_NAME.")
    
    # Keyless authentication
    credential = DefaultAzureCredential()
    token = credential.get_token("https://cognitiveservices.azure.com/.default")
    
    # Video generation uses 'preview' as the API version during the preview period
    api_version = 'preview'
    headers = { "Authorization": f"Bearer {token.token}", "Content-Type": "application/json" }
    
    API key
    import json
    import requests
    import time
    import os
    
    # Set environment variables or edit the corresponding values here.
    endpoint = os.environ.get('AZURE_OPENAI_ENDPOINT')
    deployment_name = os.environ.get('AZURE_OPENAI_DEPLOYMENT_NAME')
    api_key = os.environ.get('AZURE_OPENAI_API_KEY')
    if not endpoint or not deployment_name or not api_key:
        raise ValueError(
            "Set AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT_NAME, and AZURE_OPENAI_API_KEY."
        )
    
    # Video generation uses 'preview' as the API version during the preview period
    api_version = 'preview'
    headers = { "api-key": api_key, "Content-Type": "application/json" }
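
Note that the Microsoft Entra ID variant above acquires a bearer token once at startup, and tokens eventually expire. The short jobs in this quickstart don't need more, but for longer-running applications you might build headers through a small helper that requests a token at call time. This is an optional sketch; get_auth_headers is an illustrative name, not part of the API.
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()

def get_auth_headers() -> dict:
    """Build request headers with a bearer token acquired at call time."""
    token = credential.get_token("https://cognitiveservices.azure.com/.default")
    return {
        "Authorization": f"Bearer {token.token}",
        "Content-Type": "application/json",
    }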
    

Create the video generation job

  1. Add the code to create and monitor the video generation job. Choose the input type that matches your use case.
    Text prompt
    # 1. Create a video generation job
    create_url = f"{endpoint}/openai/v1/video/generations/jobs?api-version={api_version}"
    body = {
        "prompt": "A cat playing piano in a jazz bar.",
        "width": 480,
        "height": 480,
        "n_seconds": 5,
        "model": deployment_name
    }
    response = requests.post(create_url, headers=headers, json=body)
    response.raise_for_status()
    print("Full response JSON:", response.json())
    job_id = response.json()["id"]
    print(f"Job created: {job_id}")
    
    # 2. Poll for job status
    status_url = f"{endpoint}/openai/v1/video/generations/jobs/{job_id}?api-version={api_version}"
    status = None
    while status not in ("succeeded", "failed", "cancelled"):
        time.sleep(5)  # Wait before polling again
        status_response = requests.get(status_url, headers=headers).json()
        status = status_response.get("status")
        print(f"Job status: {status}")
        
    # 3. Retrieve generated video 
    if status == "succeeded":
        generations = status_response.get("generations", [])
        if generations:
            print(f"✅ Video generation succeeded.")
            generation_id = generations[0].get("id")
            video_url = f"{endpoint}/openai/v1/video/generations/{generation_id}/content/video?api-version={api_version}"
            video_response = requests.get(video_url, headers=headers)
            if video_response.ok:
                output_filename = "output.mp4"
                with open(output_filename, "wb") as file:
                    file.write(video_response.content)
                    print(f'Generated video saved as "{output_filename}"')
        else:
            raise Exception("No generations found in job result.")
    else:
        raise Exception(f"Job didn't succeed. Status: {status}")
    
    Image prompt
    Replace the "file_name" field in "inpaint_items" with the name of your input image file, and update the construction of the files array, which associates the path to the actual file with the file name that the API uses. Use the "crop_bounds" data (image crop distances from each direction, as a fraction of the total image dimensions) to specify which part of the image should be used in video generation. You can optionally set "frame_index" to the frame in the generated video where your image should appear (the default is 0, the start of the video). The "n_variants" parameter specifies how many different video variations to generate from the same prompt (1 to 4). Each variant provides a unique interpretation of your input.
    # 1. Create a video generation job with image inpainting (multipart upload)
    create_url = f"{endpoint}/openai/v1/video/generations/jobs?api-version={api_version}"
    
    # Flatten the body for multipart/form-data
    data = {
        "prompt": "A serene forest scene transitioning into autumn",
        "height": str(1080),
        "width": str(1920),
        "n_seconds": str(10),
        "n_variants": str(1),
        "model": deployment_name,
        # inpaint_items must be JSON string
        "inpaint_items": json.dumps([
            {
                "frame_index": 0,
                "type": "image",
                "file_name": "dog_swimming.jpg",
                "crop_bounds": {
                    "left_fraction": 0.1,
                    "top_fraction": 0.1,
                    "right_fraction": 0.9,
                    "bottom_fraction": 0.9
                }
            }
        ])
    }
    
    # Replace with your own image file path
    with open("dog_swimming.jpg", "rb") as image_file:
        files = [
            ("files", ("dog_swimming.jpg", image_file, "image/jpeg"))
        ]
        multipart_headers = {k: v for k, v in headers.items() if k.lower() != "content-type"}
        response = requests.post(
            create_url,
            headers=multipart_headers,
            data=data,
            files=files
        )
    
    if not response.ok:
        print("Error response:", response.status_code, response.text)
        response.raise_for_status()
    print("Full response JSON:", response.json())
    job_id = response.json()["id"]
    print(f"Job created: {job_id}")
    
    # 2. Poll for job status
    status_url = f"{endpoint}/openai/v1/video/generations/jobs/{job_id}?api-version={api_version}"
    status = None
    while status not in ("succeeded", "failed", "cancelled"):
        time.sleep(5)
        status_response = requests.get(status_url, headers=headers).json()
        status = status_response.get("status")
        print(f"Job status: {status}")
    
    # 3. Retrieve generated video
    if status == "succeeded":
        generations = status_response.get("generations", [])
        if generations:
            generation_id = generations[0].get("id")
            video_url = f"{endpoint}/openai/v1/video/generations/{generation_id}/content/video?api-version={api_version}"
            video_response = requests.get(video_url, headers=headers)
            if video_response.ok:
                output_filename = "output.mp4"
                with open(output_filename, "wb") as file:
                    file.write(video_response.content)
                    print(f'✅ Generated video saved as "{output_filename}"')
        else:
            raise Exception("No generations found in job result.")
    else:
        raise Exception(f"Job didn't succeed. Status: {status}")
    
    Video prompt
    Replace the "file_name" field in "inpaint_items" with the name of your input video file, and update the construction of the files array, which associates the path to the actual file with the file name that the API uses. Use the "crop_bounds" data (crop distances from each direction, as a fraction of the total frame dimensions) to specify which part of the video frame should be used in video generation. You can optionally set "frame_index" to the frame in the generated video where your input video should start (the default is 0, the beginning).
    # 1. Create a video generation job with video inpainting (multipart upload)
    create_url = f"{endpoint}/openai/v1/video/generations/jobs?api-version={api_version}"
    
    # Flatten the body for multipart/form-data
    data = {
        "prompt": "A serene forest scene transitioning into autumn",
        "height": str(1080),
        "width": str(1920),
        "n_seconds": str(10),
        "n_variants": str(1),
        "model": deployment_name,
        # inpaint_items must be JSON string
        "inpaint_items": json.dumps([
            {
                "frame_index": 0,
                "type": "video",
                "file_name": "dog_swimming.mp4",
                "crop_bounds": {
                    "left_fraction": 0.1,
                    "top_fraction": 0.1,
                    "right_fraction": 0.9,
                    "bottom_fraction": 0.9
                }
            }
        ])
    }
    
    # Replace with your own video file path
    with open("dog_swimming.mp4", "rb") as video_file:
        files = [
            ("files", ("dog_swimming.mp4", video_file, "video/mp4"))
        ]
        multipart_headers = {k: v for k, v in headers.items() if k.lower() != "content-type"}
        response = requests.post(
            create_url,
            headers=multipart_headers,
            data=data,
            files=files
        )
    
    if not response.ok:
        print("Error response:", response.status_code, response.text)
        response.raise_for_status()
    print("Full response JSON:", response.json())
    job_id = response.json()["id"]
    print(f"Job created: {job_id}")
    
    # 2. Poll for job status
    status_url = f"{endpoint}/openai/v1/video/generations/jobs/{job_id}?api-version={api_version}"
    status = None
    while status not in ("succeeded", "failed", "cancelled"):
        time.sleep(5)
        status_response = requests.get(status_url, headers=headers).json()
        status = status_response.get("status")
        print(f"Job status: {status}")
    
    # 3. Retrieve generated video
    if status == "succeeded":
        generations = status_response.get("generations", [])
        if generations:
            generation_id = generations[0].get("id")
            video_url = f"{endpoint}/openai/v1/video/generations/{generation_id}/content/video?api-version={api_version}"
            video_response = requests.get(video_url, headers=headers)
            if video_response.ok:
                output_filename = "output.mp4"
                with open(output_filename, "wb") as file:
                    file.write(video_response.content)
                    print(f'✅ Generated video saved as "{output_filename}"')
        else:
            raise Exception("No generations found in job result.")
    else:
        raise Exception(f"Job didn't succeed. Status: {status}")
    
  2. Run the Python file.
    python sora-quickstart.py
    
    Video generation typically takes 1 to 5 minutes depending on the resolution and duration. You should see status updates in your terminal as the job progresses through queued, preprocessing, running, processing, and finally succeeded.
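
The polling loops above run until the job reaches a terminal state. If you'd rather bound the wait (see the troubleshooting table later in this article), one option is a small helper with a timeout. This sketch assumes the status_url and headers variables from the script above; the function name is illustrative.
import time
import requests

def wait_for_job(status_url: str, headers: dict, timeout_seconds: int = 600, poll_interval: int = 5) -> dict:
    """Poll the job status URL until it reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        job = requests.get(status_url, headers=headers).json()
        if job.get("status") in ("succeeded", "failed", "cancelled"):
            return job
        time.sleep(poll_interval)
    raise TimeoutError(f"Job didn't reach a terminal state within {timeout_seconds} seconds.")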

Output

The output shows the full response JSON from the job creation request, including the job ID and status.
{
    "object": "video.generation.job",
    "id": "task_01jwcet0eje35tc5jy54yjax5q",
    "status": "queued",
    "created_at": 1748469875,
    "finished_at": null,
    "expires_at": null,
    "generations": [],
    "prompt": "A cat playing piano in a jazz bar.",
    "model": "<your-deployment-name>",
    "n_variants": 1,
    "n_seconds": 5,
    "height": 480,
    "width": 480,
    "failure_reason": null
}
The generated video will be saved as output.mp4 in the current directory.
Job created: task_01jwcet0eje35tc5jy54yjax5q
Job status: preprocessing
Job status: running
Job status: processing
Job status: succeeded
✅ Video generation succeeded.
Generated video saved as "output.mp4"
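
The scripts above save only the first generation. If you request more than one variant (n_variants greater than 1), each entry in the generations list has its own ID and can be downloaded from the same content endpoint. The following sketch continues the script above (it reuses status_response, endpoint, api_version, and headers) and saves each variant to its own file.
# Continues the script above: status_response, endpoint, api_version, and headers are already defined.
for index, generation in enumerate(status_response.get("generations", [])):
    generation_id = generation.get("id")
    video_url = f"{endpoint}/openai/v1/video/generations/{generation_id}/content/video?api-version={api_version}"
    video_response = requests.get(video_url, headers=headers)
    if video_response.ok:
        output_filename = f"output_{index}.mp4"
        with open(output_filename, "wb") as file:
            file.write(video_response.content)
        print(f'Saved variant {index} as "{output_filename}"')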

Troubleshooting

If you encounter issues, check the following common problems and solutions:
Error | Cause | Solution
401 Unauthorized | Invalid or expired credentials | For Microsoft Entra ID, run az login to refresh your token. For API key, verify AZURE_OPENAI_API_KEY is correct.
403 Forbidden | Missing role assignment | Assign the Cognitive Services User role to your account in the Azure portal.
404 Not Found | Incorrect endpoint or deployment name | Verify AZURE_OPENAI_ENDPOINT includes your resource name and AZURE_OPENAI_DEPLOYMENT_NAME matches your Sora deployment.
429 Too Many Requests | Rate limit exceeded | Wait and retry, or request a quota increase in the Azure portal.
400 Bad Request with dimension error | Unsupported resolution | Use a supported resolution: 480×480, 720×720, 1080×1080, 1280×720, or 1920×1080.
Job status failed | Content policy violation or internal error | Check failure_reason in the response. Modify your prompt if it triggered content filtering.
Timeout during polling | Long generation time | Videos can take up to 5 minutes. Increase your polling timeout or check job status manually.
To debug authentication issues, test your credentials with a simple API call first:
# Test endpoint connectivity
test_url = f"{endpoint}/openai/deployments?api-version=2024-02-01"
response = requests.get(test_url, headers=headers)
print(response.status_code, response.text)
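
For 429 Too Many Requests errors, retrying after a short delay is often enough. The following sketch wraps the job creation call with a basic retry that honors the Retry-After header if the service returns one; the function name and retry counts are illustrative.
import time
import requests

def post_with_retry(url: str, headers: dict, json_body: dict, max_retries: int = 3) -> requests.Response:
    """POST the request, retrying on 429 responses with a growing delay."""
    for attempt in range(max_retries + 1):
        response = requests.post(url, headers=headers, json=json_body)
        if response.status_code != 429 or attempt == max_retries:
            return response
        wait_seconds = int(response.headers.get("Retry-After", 2 ** attempt))
        print(f"Rate limited; retrying in {wait_seconds} seconds...")
        time.sleep(wait_seconds)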

Prerequisites

Go to Microsoft Foundry portal

  1. Browse to the Foundry portal and sign in with the credentials associated with your Azure OpenAI resource. During or after the sign-in workflow, select the appropriate directory, Azure subscription, and Azure OpenAI resource.
  2. From the Foundry landing page, create or select a new project.
  3. Navigate to the Models + endpoints page on the left nav.
  4. Select Deploy model and then choose the Sora video generation model from the list. Complete the deployment process.
  5. On the model's page, select Open in playground.

Try out video generation

Start exploring Sora video generation with a no-code approach through the Video playground. Enter your prompt into the text box and select Generate. Video generation typically takes 1 to 5 minutes depending on your settings. When the AI-generated video is ready, it appears on the page.
The content generation APIs come with a content moderation filter. If Azure OpenAI recognizes your prompt as harmful content, it doesn’t return a generated video. For more information, see Content filtering.
In the Video playground, you can also view Python and cURL code samples, which are prefilled according to your settings. Select the code button at the top of your video playback pane. You can use this code to write an application that completes the same task.

Clean up resources

If you want to clean up and remove an Azure OpenAI resource, you can delete the resource. Before deleting the resource, you must first delete any deployed models.