AI Models Overview
Azure AI Foundry provides access to a comprehensive catalog of AI models from Microsoft, leading AI companies, and the open-source community. Understanding the different types of models and their capabilities helps you choose the right tools for your applications.
Model categories
Language models
Large Language Models (LLMs) that understand and generate human-like text:
Chat completion models:
- GPT-4o, GPT-4o mini - Advanced reasoning and conversation
- Claude 3.5 Sonnet - Strong analytical capabilities
- Llama 3.1 series - Open-source alternatives
Specialized language models:
- Code generation models for programming tasks
- Domain-specific models for legal, medical, and financial text
- Multilingual models for global applications
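Most of the chat models above accept requests in the same messages format. The sketch below builds that payload shape; the model name, temperature, and token limit are illustrative defaults, not recommendations for any specific deployment.

```python
# Sketch of the chat-completion request shape most of these models accept.
# The model name and parameter values are placeholders.

def build_chat_request(system_prompt: str, user_message: str,
                       model: str = "gpt-4o-mini") -> dict:
    """Assemble a chat-completion payload in the common messages format."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,   # higher = more varied, lower = more deterministic
        "max_tokens": 512,    # cap on the generated response length
    }

request = build_chat_request(
    "You are a concise technical assistant.",
    "Summarize the difference between LLMs and vision models.",
)
print(request["model"])                          # gpt-4o-mini
print([m["role"] for m in request["messages"]])  # ['system', 'user']
```

The same payload works against serverless and managed deployments; only the endpoint and credentials differ.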
Vision models
Models that process and understand visual content:
Image understanding:
- GPT-4 Vision - Analyze and describe images
- Florence models - Object detection and recognition
- Custom vision models for specific use cases
Image generation:
- DALL-E 3 - Create images from text descriptions
- Stable Diffusion - Open-source image generation
- Custom image models for branded content
Multimodal models
Models that work across text, images, audio, and video:
Vision-language models:
- GPT-4o - Combined text and image understanding
- LLaVA models - Open-source vision-language capabilities
Audio processing:
- Whisper - Speech-to-text transcription
- Azure Speech Services - Text-to-speech synthesis
- Audio classification and analysis models
Model deployment patterns
Serverless APIs
Best for:
- Variable workloads
- Getting started quickly
- Cost-effective experimentation
Characteristics:
- Pay-per-use pricing
- Automatic scaling
- Shared infrastructure
- Managed by Azure
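Pay-per-use pricing means cost scales directly with tokens consumed. A minimal estimate, using made-up per-token rates (check the current Azure AI Foundry pricing page for real numbers):

```python
# Illustrative serverless cost estimate. The per-token prices below are
# placeholders, not actual Azure rates.

PRICES_PER_1K_TOKENS = {               # (input, output) USD per 1,000 tokens
    "small-model": (0.00015, 0.0006),
    "large-model": (0.0025,  0.0100),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Pay-per-use cost: tokens consumed times the per-token rate."""
    in_rate, out_rate = PRICES_PER_1K_TOKENS[model]
    return input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate

# 1M requests/month averaging 500 input / 200 output tokens on the small model:
monthly = 1_000_000 * estimate_cost("small-model", 500, 200)
print(f"${monthly:,.2f}/month")  # $195.00/month at these sample rates
```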
Managed compute
Best for:
- Consistent workloads
- Predictable performance
- Enhanced security requirements
Characteristics:
- Dedicated resources
- Customizable configurations
- Predictable costs
- Customer-controlled scaling
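A quick way to compare the two patterns is a break-even calculation: above a certain monthly volume, a dedicated deployment becomes cheaper than paying per request. The prices here are hypothetical.

```python
# Rough break-even sketch: at what monthly volume does dedicated (managed)
# compute become cheaper than pay-per-use? All prices are hypothetical.
import math

def breakeven_requests(serverless_cost_per_request: float,
                       dedicated_monthly_cost: float) -> int:
    """Requests/month above which dedicated compute wins on cost."""
    return math.ceil(dedicated_monthly_cost / serverless_cost_per_request)

# e.g. $0.002 per request serverless vs. a $3,000/month dedicated deployment:
print(breakeven_requests(0.002, 3000.0))  # 1500000
```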
Choosing the right model
Consider your use case
Content generation:
- Blog posts, marketing copy → GPT-4o, Claude 3.5
- Code development → GPT-4o, CodeLlama
- Creative writing → GPT-4o, Claude 3.5
Analysis and extraction:
- Document processing → GPT-4o with vision
- Data analysis → GPT-4o, Claude 3.5
- Sentiment analysis → Specialized language models
Interactive applications:
- Chatbots → GPT-4o mini (cost-effective)
- Virtual assistants → GPT-4o (high capability)
- Customer support → Fine-tuned models
Latency requirements:
- Real-time chat → Smaller, faster models
- Batch processing → Larger, more capable models
- Streaming responses → Models with streaming support
Quality needs:
- High-stakes applications → GPT-4o, Claude 3.5
- General purpose → GPT-4o mini
- Specialized domains → Fine-tuned models
Cost constraints:
- High volume, simple tasks → Smaller models
- Complex reasoning → Larger models
- Mixed workloads → Model routing strategies
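A model routing strategy can be as simple as classifying each request and sending it to the cheapest model that can handle it. The model names, keyword list, and length threshold below are illustrative choices, not a recommended policy.

```python
# Minimal model-routing sketch for mixed workloads: send short/simple requests
# to a cheap model and longer or explicitly complex ones to a capable model.

COMPLEX_HINTS = ("analyze", "prove", "compare", "plan", "debug")

def route_model(prompt: str, max_simple_words: int = 50) -> str:
    words = prompt.lower().split()
    if len(words) > max_simple_words or any(h in words for h in COMPLEX_HINTS):
        return "large-capable-model"   # complex reasoning
    return "small-fast-model"          # high volume, simple tasks

print(route_model("Translate 'hello' to French"))           # small-fast-model
print(route_model("Analyze the tradeoffs in this design"))  # large-capable-model
```

Production routers typically use a lightweight classifier model rather than keywords, but the control flow is the same.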
Model capabilities and limitations
Understanding model strengths
GPT-4o series:
- Excellent reasoning and problem-solving
- Strong code generation capabilities
- Good multilingual support
- Vision understanding (GPT-4o)
Claude 3.5 series:
- Strong analytical thinking
- Excellent for complex reasoning
- Good safety characteristics
- Large context windows
Open-source models:
- Customizable and fine-tunable
- No vendor lock-in
- Community-driven improvements
- Cost-effective for specific use cases
Common limitations
Context length:
- Most models have token limits
- Longer conversations may lose context
- Consider conversation management strategies
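One common conversation-management strategy is to keep the system prompt and drop the oldest turns until the history fits a token budget. This sketch approximates token counts by whitespace splitting; a real tokenizer (e.g. tiktoken) would be more accurate.

```python
# Trim chat history to a token budget: keep the system prompt, then keep the
# newest turns that still fit. Token counts here are whitespace approximations.

def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    def tokens(m): return len(m["content"].split())
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(tokens(m) for m in system)
    kept: list[dict] = []
    for m in reversed(turns):          # newest turns first
        if tokens(m) <= budget:
            kept.insert(0, m)
            budget -= tokens(m)
        else:
            break                      # stop at the first turn that overflows
    return system + kept

history = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "one two three four five"},
    {"role": "assistant", "content": "six seven"},
    {"role": "user", "content": "eight nine ten"},
]
print(len(trim_history(history, max_tokens=8)))  # 3: system + two newest turns
```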
Knowledge cutoffs:
- Models trained on data up to specific dates
- May not know recent events
- Consider RAG for current information
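The core RAG pattern is: retrieve relevant documents at request time, then ground the prompt in them. This toy sketch ranks a two-document store by word overlap; real systems use embedding similarity and a vector index, and the documents here are invented examples.

```python
# Toy retrieval-augmented generation (RAG) sketch: rank a small document store
# by word overlap with the question, then prepend the top match to the prompt.

DOCS = [
    "The 2024 model catalog added several new open-source vision models.",
    "Serverless deployments bill per token with automatic scaling.",
]

def retrieve(question: str, docs: list[str]) -> str:
    q = set(question.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(question: str) -> str:
    context = retrieve(question, DOCS)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How are serverless deployments billed?"))
```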
Bias and fairness:
- Models reflect training data biases
- Important for sensitive applications
- Use evaluation tools to assess bias
Model lifecycle management
Versioning and updates
Model versions:
- Each model has multiple versions
- Newer versions often improve capabilities
- Plan for version transitions
Deployment strategies:
- Blue-green deployments for zero downtime
- Gradual rollouts for risk mitigation
- A/B testing for performance comparison
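A gradual rollout can be implemented by hashing a stable identifier into a traffic bucket, so a fixed percentage of users reach the new model version and each user consistently sees the same one. The version names and percentage are placeholders.

```python
# Sketch of a gradual-rollout router: deterministically bucket each user id so
# a fixed share of traffic reaches the new model version.
import hashlib

def pick_version(user_id: str, new_version_pct: int = 10) -> str:
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "model-v2" if bucket < new_version_pct else "model-v1"

# Stable assignment: the same user always gets the same version.
print(pick_version("user-42") == pick_version("user-42"))  # True
```

The same bucketing supports A/B testing: log which version served each request and compare quality metrics per bucket before widening the rollout.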
Monitoring and optimization
Performance metrics:
- Latency and throughput
- Error rates and availability
- Cost per request/token
Quality assessment:
- Response relevance and accuracy
- User satisfaction scores
- Automated evaluation metrics
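The performance metrics above can be computed from ordinary request logs. This sketch derives nearest-rank latency percentiles and cost per 1K tokens; the log fields and values are hypothetical.

```python
# Basic performance monitoring: p50/p95 latency and cost per 1K tokens from a
# batch of (hypothetical) request logs.

def percentile(values: list[float], p: float) -> float:
    """Crude nearest-rank percentile; fine for monitoring dashboards."""
    s = sorted(values)
    idx = min(len(s) - 1, int(p / 100 * len(s)))
    return s[idx]

logs = [
    {"latency_ms": 120, "tokens": 300, "cost": 0.0006},
    {"latency_ms": 340, "tokens": 800, "cost": 0.0016},
    {"latency_ms": 95,  "tokens": 150, "cost": 0.0003},
    {"latency_ms": 900, "tokens": 900, "cost": 0.0018},
]

latencies = [r["latency_ms"] for r in logs]
cost_per_1k = 1000 * sum(r["cost"] for r in logs) / sum(r["tokens"] for r in logs)
print(percentile(latencies, 50), percentile(latencies, 95), round(cost_per_1k, 4))
```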
Getting started with models
Exploration approach
1. Start with the playground - Test models interactively
2. Try different prompts - Understand model behavior
3. Compare models - Find the best fit for your use case
4. Measure performance - Establish baseline metrics
5. Scale gradually - Move from testing to production

Best practices
Prompt engineering:
- Write clear, specific instructions
- Provide examples for better results
- Use consistent formatting
- Test different approaches
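These practices combine naturally into a few-shot prompt: a clear instruction, consistently formatted worked examples, then the real input. The task and examples below are invented for illustration.

```python
# Sketch of a few-shot prompt builder: instruction, worked examples in a
# consistent Input/Output format, then the query.

def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    lines = [f"Task: {task}", ""]
    for inp, out in examples:
        lines += [f"Input: {inp}", f"Output: {out}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("I love this product", "positive"), ("Terrible experience", "negative")],
    "The support team was very helpful",
)
print(prompt)
```

Ending with a bare `Output:` cues the model to continue in the established format.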
Safety and governance:
- Implement content filtering
- Monitor for inappropriate outputs
- Set up usage alerts and limits
- Document model usage policies
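A minimal governance layer can run a pre-flight check before each request: block prompts matching filtered terms and enforce a per-user limit. The blocklist and limit here are placeholders, and keyword matching is only a stand-in; production systems should use a dedicated service such as Azure AI Content Safety.

```python
# Governance sketch: pre-flight content filter + per-user daily request limit.
# Blocklist and limit values are illustrative placeholders.
from collections import defaultdict

BLOCKED_TERMS = {"ssn", "credit card number"}
DAILY_LIMIT = 100
usage: dict[str, int] = defaultdict(int)

def preflight(user_id: str, prompt: str) -> tuple[bool, str]:
    text = prompt.lower()
    if any(term in text for term in BLOCKED_TERMS):
        return False, "blocked: prompt matched a filtered term"
    if usage[user_id] >= DAILY_LIMIT:
        return False, "blocked: daily usage limit reached"
    usage[user_id] += 1
    return True, "ok"

print(preflight("alice", "Summarize this meeting"))  # (True, 'ok')
print(preflight("alice", "What is Bob's SSN?"))      # (False, 'blocked: ...')
```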
Understanding these fundamentals helps you make informed decisions about which models to use and how to deploy them effectively in your applications.