Hub documentation

Inference Providers

Hugging Face’s model pages offer pay-as-you-go inference for thousands of models, so you can try them out right in the browser. The service is powered by Inference Providers and includes a free tier.

Inference Providers give developers streamlined, unified access to hundreds of machine learning models, powered by the best serverless inference partners. 👉 For complete documentation, visit the Inference Providers Documentation.

Inference Providers on the Hub

Inference Providers is deeply integrated with the Hugging Face Hub, and you can use it in a few different ways:

  • Interactive Widgets - Test models directly on model pages with interactive widgets that use Inference Providers under the hood. Check out the DeepSeek-R1-0528 model page for an example.
  • Inference Playground - Easily test and compare chat completion models with your prompts. Check out the Inference Playground to get started.
  • Search - Filter models by inference provider on the models page to find models available through specific providers.
  • Data Studio - Use AI to explore datasets on the Hub. Check out Data Studio on your favorite dataset.

Build with Inference Providers

You can integrate Inference Providers into your own applications using our SDKs or HTTP clients. Here’s a quick start in Python. For more details, including JavaScript usage, check out the Inference Providers Documentation.


You can use our Python SDK to interact with Inference Providers.

import os

from huggingface_hub import InferenceClient

client = InferenceClient(
    api_key=os.environ["HF_TOKEN"],  # Token with inference permissions
    provider="auto",   # Automatically selects best provider
)

# Chat completion
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[{"role": "user", "content": "A story about hiking in the mountains"}]
)
# The generated text is in the first choice's message
print(completion.choices[0].message.content)

# Image generation (returns a PIL.Image object)
image = client.text_to_image(
    prompt="A serene lake surrounded by mountains at sunset, photorealistic style",
    model="black-forest-labs/FLUX.1-dev"
)
# Save the image under an example filename
image.save("lake.png")

Alternatively, you can use the OpenAI-compatible client by pointing its base URL at the Inference Providers router.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[
        {
            "role": "user",
            "content": "A story about hiking in the mountains"
        }
    ],
)

The OpenAI-compatible client does not support image generation. Use InferenceClient for that task.
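Since the text mentions plain HTTP clients as an option, here is a minimal sketch of the same chat completion request built with Python's standard library only. It assumes the router exposes the usual OpenAI-style `/chat/completions` path under the base URL shown above and accepts the token as a Bearer header; the helper names `build_chat_request` and `send_chat_request` are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumption: the OpenAI-style chat endpoint lives under the router base URL.
ROUTER_URL = "https://router.huggingface.co/v1/chat/completions"


def build_chat_request(token: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example usage (requires network access and a valid token):
#   import os
#   req = build_chat_request(
#       os.environ["HF_TOKEN"],
#       "deepseek-ai/DeepSeek-V3-0324",
#       "A story about hiking in the mountains",
#   )
#   print(send_chat_request(req))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.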

You’ll need a Hugging Face token with inference permissions. Create one at Settings > Tokens.
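The examples above read the token from the `HF_TOKEN` environment variable. As a small sketch, the hypothetical helper below (`get_hf_token` is not part of any SDK) fails early with a readable message when the variable is missing, instead of surfacing a bare `KeyError` or an authentication error later on.

```python
import os


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token stored in env_var, or raise a descriptive error."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set. Create a token with inference "
            "permissions at Settings > Tokens and export it first."
        )
    return token
```

The returned value can then be passed as `api_key` to either client shown earlier.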

How Inference Providers works

To dive deeper into Inference Providers, check out the Inference Providers Documentation.

What was the HF-Inference API?

HF-Inference API is one of the providers available through Inference Providers. It was previously called “Inference API (serverless)” and is powered by Inference Endpoints under the hood.

For more details about the HF-Inference provider specifically, check out its dedicated page.
