Available Foundation and Embedding Models for DigitalOcean Gradient™ AI Platform
Validated on 18 Dec 2025 • Last edited on 18 Dec 2025
DigitalOcean Gradient™ AI Platform lets you build fully-managed AI agents with knowledge bases for retrieval-augmented generation, multi-agent routing, guardrails, and more, or use serverless inference to make direct requests to popular foundation models.
The following foundation and embedding models are available for Gradient AI Platform. For pricing, see Gradient AI Platform’s pricing page.
Foundation Models
Gradient AI Platform supports both open source and commercial foundation models, which you can use to build agents or to make serverless inference requests. Open source models are generally published by research labs, are available under open licenses, and can be accessed with DigitalOcean API access keys. Commercial models are proprietary and require the provider's API keys to access, such as OpenAI API keys and Anthropic API keys.
For agents you build and deploy using ADK, you can use any model key, even if the model isn’t hosted on DigitalOcean or the model key is not provided by DigitalOcean.
We offer the following foundation models:
Alibaba

| Model | Model ID | Parameters | Max Output Tokens | Usage Notes |
|---|---|---|---|---|
| Qwen3-32B | alibaba-qwen3-32b | 32 billion | 40,960 | Only for serverless inference and ADK. |
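For models flagged as serverless-inference-only, you send requests directly to the model rather than through an agent. The following is a minimal sketch that assumes the serverless inference endpoint is OpenAI-compatible at `https://inference.do-ai.run/v1` and that a DigitalOcean model access key is stored in a `DIGITALOCEAN_MODEL_ACCESS_KEY` environment variable; confirm the exact endpoint and authentication details in the serverless inference documentation.

```python
# Minimal sketch: a chat completion request to Gradient AI Platform
# serverless inference using the OpenAI Python SDK. The base URL and
# environment variable name are assumptions for illustration.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://inference.do-ai.run/v1",  # assumed serverless inference endpoint
    api_key=os.environ["DIGITALOCEAN_MODEL_ACCESS_KEY"],  # DigitalOcean model access key
)

response = client.chat.completions.create(
    model="alibaba-qwen3-32b",  # model ID from the table above
    messages=[
        {
            "role": "user",
            "content": "Summarize retrieval-augmented generation in one sentence.",
        }
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```

In this sketch, switching to another chat model from the tables below is just a change to the `model` field; the rest of the request stays the same.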
Anthropic

All Anthropic models available on the Gradient AI Platform support tool (function) calling; an illustrative tool-calling request is sketched after the table below. Refer to provider documentation for other supported features.

Claude Sonnet 4.5 and Claude Sonnet 4 support an input context window of up to 1M tokens.
| Model | Model ID | Parameters | Max Output Tokens | Usage Notes |
|---|---|---|---|---|
| Claude Sonnet 4.5 | anthropic-claude-4.5-sonnet | Not published | 64,000 | Only for serverless inference and ADK. |
| Claude Sonnet 4 | anthropic-claude-sonnet-4 | Not published | 64,000 | |
| Claude 3.7 Sonnet | anthropic-claude-3.7-sonnet | Not published | 128,000 | |
| Claude 3.5 Sonnet | anthropic-claude-3.5-sonnet | Not published | 8,192 | |
| Claude 3.5 Haiku | anthropic-claude-3.5-haiku | Not published | 8,000 | |
| Claude Opus 4.5 | anthropic-claude-opus-4.5 | Not published | 64,000 | Only for serverless inference and ADK. |
| Claude Opus 4.1 | anthropic-claude-4.1-opus | Not published | 32,000 | Only for serverless inference and ADK. |
| Claude Opus 4 | anthropic-claude-opus-4 | Not published | 32,000 | |
| Claude 3 Opus | anthropic-claude-3-opus | Not published | 4,096 | |
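The sketch below shows what a tool-calling request could look like, assuming the serverless inference endpoint accepts OpenAI-style tool definitions. The endpoint URL, environment variable, and the `get_droplet_status` tool are illustrative assumptions, not part of the platform API; check the serverless inference documentation for the supported tool schema.

```python
# Minimal sketch of tool (function) calling with an Anthropic model through
# an assumed OpenAI-compatible serverless inference endpoint. The tool
# definition here is hypothetical and exists only for illustration.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://inference.do-ai.run/v1",  # assumed serverless inference endpoint
    api_key=os.environ["DIGITALOCEAN_MODEL_ACCESS_KEY"],
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_droplet_status",  # hypothetical tool for illustration
            "description": "Look up the status of a Droplet by name.",
            "parameters": {
                "type": "object",
                "properties": {"name": {"type": "string"}},
                "required": ["name"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="anthropic-claude-3.7-sonnet",  # model ID from the table above
    messages=[{"role": "user", "content": "Is the droplet named web-01 running?"}],
    tools=tools,
)

# If the model decides to call the tool, the call arrives as structured
# arguments rather than plain text.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```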
DeepSeek

| Model | Model ID | Parameters | Max Tokens | Usage Notes |
|---|---|---|---|---|
| DeepSeek R1 Distill Llama 70B | deepseek-r1-distill-llama-70b | 70 billion | 32,768 | When using DeepSeek models in a user-facing agent, we strongly recommend adding all available guardrails for a safer conversational experience. |
Fal AI

| Model | Model ID | Type | Usage Notes |
|---|---|---|---|
| Fast SDXL | fal-ai/fast-sdxl | Image generation | Multimodal and generative model, only for serverless inference. |
| Flux Schnell | fal-ai/flux/schnell | Image generation | Multimodal and generative model, only for serverless inference. |
| Stable Audio 2.5 (Text-to-Audio) | fal-ai/stable-audio-25/text-to-audio | Text-to-audio | Multimodal and generative model, only for serverless inference. |
| Multilingual TTS v2 | fal-ai/elevenlabs/tts/multilingual-v2 | Text-to-speech | Multimodal and generative model, only for serverless inference. |
Meta

| Model | Model ID | Parameters | Max Tokens |
|---|---|---|---|
| Llama 3.3 Instruct-70B | llama3.3-70b-instruct | 70 billion | 128,000 |
| Llama 3.1 Instruct-8B | llama3-8b-instruct | 8 billion | 128,000 |
Mistral AI

| Model | Model ID | Parameters | Max Tokens |
|---|---|---|---|
| NeMo | mistral-nemo-instruct-2407 | 12 billion | 128,000 |
OpenAI

All OpenAI models available on the Gradient AI Platform support tool (function) calling. Refer to provider documentation for other supported features.
| Model | Model ID | Parameters | Max Output Tokens |
|---|---|---|---|
| gpt-oss-120b | openai-gpt-oss-120b | 117 billion | 131,072 |
| gpt-oss-20b | openai-gpt-oss-20b | 21 billion | 131,072 |
| GPT-5 | openai-gpt-5 | Not published | Not published |
| GPT-5 mini | openai-gpt-5-mini | Not published | Not published |
| GPT-5 nano | openai-gpt-5-nano | Not published | Not published |
| GPT-4.1 | openai-gpt-4.1 | Not published | Not published |
| GPT-4o | openai-gpt-4o | Not published | Not published |
| GPT-4o mini | openai-gpt-4o-mini | Not published | Not published |
| o1 | openai-o1 | Not published | Not published |
| o3 | openai-o3 | Not published | Not published |
| o3-mini | openai-o3-mini | Not published | Not published |
| GPT-image-1 | openai-gpt-image-1 | Not published | Not published |
Embedding Models
An embedding model converts data into vector embeddings. Gradient AI Platform stores vector embeddings in an OpenSearch database cluster for use with agent knowledge bases. The following embedding models are available on the platform, along with their token windows and recommended chunking ranges; an illustrative chunking sketch follows the table.
| Provider | Model | Parameters | Token Window | Chunk Size Range | Parent Chunk Range | Child Chunk Range |
|---|---|---|---|---|---|---|
| Tongyi Lab, Alibaba | GTE Large (v1.5) | Not available | 8192 tokens | 0-750 | 500-1000 | 300-500 |
| UKP Lab, Technical University of Darmstadt | All-MiniLM-L6-v2 | 22 million | 256 tokens | 0-256 | 100-256 | 100-200 |
| UKP Lab, Technical University of Darmstadt | Multi-QA-mpnet-base-dot-v1 | 109 million | 512 tokens | 0-512 | 100-512 | 100-500 |
| Alibaba Qwen | Qwen3 Embedding 0.6B (Multilingual) | 600 million | 8000 tokens | 0-750 | 500-1000 | 300-500 |
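To show how the chunking ranges above fit together, the sketch below performs a two-level split using the parent and child ranges listed for GTE Large (v1.5). The word-based token count and the `chunk_document` helper are simplifications invented for illustration; the platform's knowledge base indexing pipeline handles tokenization and chunking for you.

```python
# Illustrative sketch of two-level chunking within the ranges listed for
# GTE Large (v1.5): parent chunks of up to 1,000 tokens, child chunks of up
# to 500 tokens. Token counts are approximated by whitespace-separated words
# here; the platform's actual tokenizer and chunker differ.

def split_into_chunks(words: list[str], max_len: int) -> list[list[str]]:
    """Greedily group words into chunks of at most max_len words."""
    return [words[i : i + max_len] for i in range(0, len(words), max_len)]

def chunk_document(text: str, parent_max: int = 1000, child_max: int = 500) -> list[dict]:
    words = text.split()
    chunks = []
    for parent in split_into_chunks(words, parent_max):
        children = split_into_chunks(parent, child_max)
        chunks.append({
            "parent": " ".join(parent),                   # retrieved as broader context
            "children": [" ".join(c) for c in children],  # embedded for vector search
        })
    return chunks

if __name__ == "__main__":
    doc = "word " * 2300  # a stand-in document of roughly 2,300 "tokens"
    result = chunk_document(doc)
    print(len(result), "parent chunks;",
          sum(len(r["children"]) for r in result), "child chunks")
```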