huggingface · Wauplin · Sep 11, 2025 · Sep 10, 2025
diff --git a/docs/source/de/guides/inference.md b/docs/source/de/guides/inference.md
@@ -8,7 +8,6 @@ Inferenz ist der Prozess, bei dem ein trainiertes Modell verwendet wird, um Vorh
 - [Inferenz API](https://huggingface.co/docs/api-inference/index): ein Service, der Ihnen ermöglicht, beschleunigte Inferenz auf der Infrastruktur von Hugging Face kostenlos auszuführen. Dieser Service ist eine schnelle Möglichkeit, um anzufangen, verschiedene Modelle zu testen und AI-Produkte zu prototypisieren.
 - [Inferenz Endpunkte](https://huggingface.co/inference-endpoints/index): ein Produkt zur einfachen Bereitstellung von Modellen im Produktivbetrieb. Die Inferenz wird von Hugging Face in einer dedizierten, vollständig verwalteten Infrastruktur auf einem Cloud-Anbieter Ihrer Wahl durchgeführt.
 
-Diese Dienste können mit dem [`InferenceClient`] Objekt aufgerufen werden. Dieser fungiert als Ersatz für den älteren [`InferenceApi`] Client und fügt spezielle Unterstützung für Aufgaben und das Ausführen von Inferenz hinzu, sowohl auf [Inferenz API](https://huggingface.co/docs/api-inference/index) als auch auf [Inferenz Endpunkten](https://huggingface.co/docs/inference-endpoints/index). Im Abschnitt [Legacy InferenceAPI client](#legacy-inferenceapi-client) erfahren Sie, wie Sie zum neuen Client migrieren können.
 
 <Tip>
 
@@ -89,34 +88,34 @@ Die Authentifizierung ist NICHT zwingend erforderlich, wenn Sie die Inferenz API
 
 Das Ziel von [`InferenceClient`] ist es, die einfachste Schnittstelle zum Ausführen von Inferenzen auf Hugging Face-Modellen bereitzustellen. Es verfügt über eine einfache API, die die gebräuchlichsten Aufgaben unterstützt. Hier ist eine Liste der derzeit unterstützten Aufgaben:
 
-| Domäne | Aufgabe                           | Unterstützt   | Dokumentation                             |
-|--------|--------------------------------|--------------|------------------------------------|
-| Audio | [Audio Classification](https://huggingface.co/tasks/audio-classification)           | ✅ | [`~InferenceClient.audio_classification`] |
-| | [Automatic Speech Recognition](https://huggingface.co/tasks/automatic-speech-recognition)   | ✅ | [`~InferenceClient.automatic_speech_recognition`] |
-| | [Text-to-Speech](https://huggingface.co/tasks/text-to-speech)                 | ✅ | [`~InferenceClient.text_to_speech`] |
-| Computer Vision | [Image Classification](https://huggingface.co/tasks/image-classification)           | ✅ | [`~InferenceClient.image_classification`] |
-| | [Image Segmentation](https://huggingface.co/tasks/image-segmentation)             | ✅ | [`~InferenceClient.image_segmentation`] |
-| | [Image-to-Image](https://huggingface.co/tasks/image-to-image)                 | ✅ | [`~InferenceClient.image_to_image`] |
-| | [Image-to-Text](https://huggingface.co/tasks/image-to-text)                  | ✅ | [`~InferenceClient.image_to_text`] |
-| | [Object Detection](https://huggingface.co/tasks/object-detection)            | ✅ | [`~InferenceClient.object_detection`] |
-| | [Text-to-Image](https://huggingface.co/tasks/text-to-image)                  | ✅ | [`~InferenceClient.text_to_image`] |
-| | [Zero-Shot-Image-Classification](https://huggingface.co/tasks/zero-shot-image-classification)                  | ✅ | [`~InferenceClient.zero_shot_image_classification`] |
-| Multimodal | [Documentation Question Answering](https://huggingface.co/tasks/document-question-answering) | ✅ | [`~InferenceClient.document_question_answering`] |
-| | [Visual Question Answering](https://huggingface.co/tasks/visual-question-answering)      | ✅ | [`~InferenceClient.visual_question_answering`] |
-| NLP | [Conversational](https://huggingface.co/tasks/conversational)                 | ✅ | [`~InferenceClient.conversational`] |
-| | [Feature Extraction](https://huggingface.co/tasks/feature-extraction)             | ✅ | [`~InferenceClient.feature_extraction`] |
-| | [Fill Mask](https://huggingface.co/tasks/fill-mask)                      | ✅ | [`~InferenceClient.fill_mask`] |
-| | [Question Answering](https://huggingface.co/tasks/question-answering)             | ✅ | [`~InferenceClient.question_answering`] |
-| | [Sentence Similarity](https://huggingface.co/tasks/sentence-similarity) | ✅ | [`~InferenceClient.sentence_similarity`] |
-| | [Summarization](https://huggingface.co/tasks/summarization)                  | ✅ | [`~InferenceClient.summarization`] |
-| | [Table Question Answering](https://huggingface.co/tasks/table-question-answering)       | ✅ | [`~InferenceClient.table_question_answering`] |
-| | [Text Classification](https://huggingface.co/tasks/text-classification)            | ✅ | [`~InferenceClient.text_classification`] |
-| | [Text Generation](https://huggingface.co/tasks/text-generation)   | ✅ | [`~InferenceClient.text_generation`] |
-| | [Token Classification](https://huggingface.co/tasks/token-classification)           | ✅ | [`~InferenceClient.token_classification`] |
-| | [Translation](https://huggingface.co/tasks/translation)       | ✅ | [`~InferenceClient.translation`] |
-| | [Zero Shot Classification](https://huggingface.co/tasks/zero-shot-classification)       | ✅ | [`~InferenceClient.zero_shot_classification`] |
-| Tabular | [Tabular Classification](https://huggingface.co/tasks/tabular-classification)         | ✅ | [`~InferenceClient.tabular_classification`] |
-| | [Tabular Regression](https://huggingface.co/tasks/tabular-regression)             | ✅ | [`~InferenceClient.tabular_regression`] |
+| Domäne          | Aufgabe                                                                                       | Unterstützt | Dokumentation                                       |
+| --------------- | --------------------------------------------------------------------------------------------- | ----------- | --------------------------------------------------- |
+| Audio           | [Audio Classification](https://huggingface.co/tasks/audio-classification)                     | ✅           | [`~InferenceClient.audio_classification`]           |
+|                 | [Automatic Speech Recognition](https://huggingface.co/tasks/automatic-speech-recognition)     | ✅           | [`~InferenceClient.automatic_speech_recognition`]   |
+|                 | [Text-to-Speech](https://huggingface.co/tasks/text-to-speech)                                 | ✅           | [`~InferenceClient.text_to_speech`]                 |
+| Computer Vision | [Image Classification](https://huggingface.co/tasks/image-classification)                     | ✅           | [`~InferenceClient.image_classification`]           |
+|                 | [Image Segmentation](https://huggingface.co/tasks/image-segmentation)                         | ✅           | [`~InferenceClient.image_segmentation`]             |
+|                 | [Image-to-Image](https://huggingface.co/tasks/image-to-image)                                 | ✅           | [`~InferenceClient.image_to_image`]                 |
+|                 | [Image-to-Text](https://huggingface.co/tasks/image-to-text)                                   | ✅           | [`~InferenceClient.image_to_text`]                  |
+|                 | [Object Detection](https://huggingface.co/tasks/object-detection)                             | ✅           | [`~InferenceClient.object_detection`]               |
+|                 | [Text-to-Image](https://huggingface.co/tasks/text-to-image)                                   | ✅           | [`~InferenceClient.text_to_image`]                  |
+|                 | [Zero-Shot-Image-Classification](https://huggingface.co/tasks/zero-shot-image-classification) | ✅           | [`~InferenceClient.zero_shot_image_classification`] |
+| Multimodal      | [Documentation Question Answering](https://huggingface.co/tasks/document-question-answering)  | ✅           | [`~InferenceClient.document_question_answering`]    |
+|                 | [Visual Question Answering](https://huggingface.co/tasks/visual-question-answering)           | ✅           | [`~InferenceClient.visual_question_answering`]      |
+| NLP             | [Conversational](https://huggingface.co/tasks/conversational)                                 | ✅           | [`~InferenceClient.conversational`]                 |
+|                 | [Feature Extraction](https://huggingface.co/tasks/feature-extraction)                         | ✅           | [`~InferenceClient.feature_extraction`]             |
+|                 | [Fill Mask](https://huggingface.co/tasks/fill-mask)                                           | ✅           | [`~InferenceClient.fill_mask`]                      |
+|                 | [Question Answering](https://huggingface.co/tasks/question-answering)                         | ✅           | [`~InferenceClient.question_answering`]             |
+|                 | [Sentence Similarity](https://huggingface.co/tasks/sentence-similarity)                       | ✅           | [`~InferenceClient.sentence_similarity`]            |
+|                 | [Summarization](https://huggingface.co/tasks/summarization)                                   | ✅           | [`~InferenceClient.summarization`]                  |
+|                 | [Table Question Answering](https://huggingface.co/tasks/table-question-answering)             | ✅           | [`~InferenceClient.table_question_answering`]       |
+|                 | [Text Classification](https://huggingface.co/tasks/text-classification)                       | ✅           | [`~InferenceClient.text_classification`]            |
+|                 | [Text Generation](https://huggingface.co/tasks/text-generation)                               | ✅           | [`~InferenceClient.text_generation`]                |
+|                 | [Token Classification](https://huggingface.co/tasks/token-classification)                     | ✅           | [`~InferenceClient.token_classification`]           |
+|                 | [Translation](https://huggingface.co/tasks/translation)                                       | ✅           | [`~InferenceClient.translation`]                    |
+|                 | [Zero Shot Classification](https://huggingface.co/tasks/zero-shot-classification)             | ✅           | [`~InferenceClient.zero_shot_classification`]       |
+| Tabular         | [Tabular Classification](https://huggingface.co/tasks/tabular-classification)                 | ✅           | [`~InferenceClient.tabular_classification`]         |
+|                 | [Tabular Regression](https://huggingface.co/tasks/tabular-regression)                         | ✅           | [`~InferenceClient.tabular_regression`]             |
 
 
 <Tip>
@@ -190,93 +189,3 @@ Einige Aufgaben erfordern binäre Eingaben, zum Beispiel bei der Arbeit mit Bild
 [{'score': 0.9779096841812134, 'label': 'Blenheim spaniel'}, ...]
 ```
 
-## Legacy InferenceAPI client
-
-Der [`InferenceClient`] dient als Ersatz für den veralteten [`InferenceApi`]-Client. Er bietet spezifische Unterstützung für Aufgaben und behandelt Inferenz sowohl auf der [Inferenz API](https://huggingface.co/docs/api-inference/index) als auch auf den [Inferenz Endpunkten](https://huggingface.co/docs/inference-endpoints/index).
-
-Hier finden Sie eine kurze Anleitung, die Ihnen hilft, von [`InferenceApi`] zu [`InferenceClient`] zu migrieren.
-
-### Initialisierung
-
-Ändern Sie von
-
-```python
->>> from huggingface_hub import InferenceApi
->>> inference = InferenceApi(repo_id="bert-base-uncased", token=API_TOKEN)
-```
-
-zu
-
-```python
->>> from huggingface_hub import InferenceClient
->>> inference = InferenceClient(model="bert-base-uncased", token=API_TOKEN)
-```
-
-### Ausführen einer bestimmten Aufgabe
-
-Ändern Sie von
-
-```python
->>> from huggingface_hub import InferenceApi
->>> inference = InferenceApi(repo_id="paraphrase-xlm-r-multilingual-v1", task="feature-extraction")
->>> inference(...)
-```
-
-zu
-
-```python
->>> from huggingface_hub import InferenceClient
->>> inference = InferenceClient()
->>> inference.feature_extraction(..., model="paraphrase-xlm-r-multilingual-v1")
-```
-
-<Tip>
-
-Dies ist der empfohlene Weg, um Ihren Code an [`InferenceClient`] anzupassen. Dadurch können Sie von den aufgabenspezifischen Methoden wie `feature_extraction` profitieren.
-
-</Tip>
-
-### Eigene Anfragen ausführen
-
-Ändern Sie von
-
-```python
->>> from huggingface_hub import InferenceApi
->>> inference = InferenceApi(repo_id="bert-base-uncased")
->>> inference(inputs="The goal of life is [MASK].")
-[{'sequence': 'the goal of life is life.', 'score': 0.10933292657136917, 'token': 2166, 'token_str': 'life'}]
-```
-zu
-
-```python
->>> from huggingface_hub import InferenceClient
->>> client = InferenceClient()
->>> response = client.post(json={"inputs": "The goal of life is [MASK]."}, model="bert-base-uncased")
->>> response.json()
-[{'sequence': 'the goal of life is life.', 'score': 0.10933292657136917, 'token': 2166, 'token_str': 'life'}]
-```
-
-### Mit Parametern ausführen
-
-Ändern Sie von
-
-```python
->>> from huggingface_hub import InferenceApi
->>> inference = InferenceApi(repo_id="typeform/distilbert-base-uncased-mnli")
->>> inputs = "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!"
->>> params = {"candidate_labels":["refund", "legal", "faq"]}
->>> inference(inputs, params)
-{'sequence': 'Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!', 'labels': ['refund', 'faq', 'legal'], 'scores': [0.9378499388694763, 0.04914155602455139, 0.013008488342165947]}
-```
-
-zu
-
-```python
->>> from huggingface_hub import InferenceClient
->>> client = InferenceClient()
->>> inputs = "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!"
->>> params = {"candidate_labels":["refund", "legal", "faq"]}
->>> response = client.post(json={"inputs": inputs, "parameters": params}, model="typeform/distilbert-base-uncased-mnli")
->>> response.json()
-{'sequence': 'Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!', 'labels': ['refund', 'faq', 'legal'], 'scores': [0.9378499388694763, 0.04914155602455139, 0.013008488342165947]}
-```
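
Reviewer note: the hunk above deletes the only migration guide from `InferenceApi` to `InferenceClient`, so it may be worth checking whether downstream code still imports the legacy class. The following is a minimal sketch, not part of this PR; the helper name `find_legacy_imports` and the sample input are hypothetical, and it only uses the standard-library `ast` module. It parses plain Python source, so doc snippets written with `>>> ` prompts would need the prompts stripped first.

```python
import ast

# Name of the legacy class whose migration guide is removed in the hunk above.
LEGACY_NAME = "InferenceApi"


def find_legacy_imports(source: str) -> list[int]:
    """Return the line numbers of `from huggingface_hub... import InferenceApi`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # `node.module` is None for relative imports such as `from . import x`.
        if isinstance(node, ast.ImportFrom) and node.module:
            if node.module.split(".")[0] != "huggingface_hub":
                continue
            if any(alias.name == LEGACY_NAME for alias in node.names):
                hits.append(node.lineno)
    return sorted(hits)


# Hypothetical input: one legacy import and one current import.
sample = (
    "from huggingface_hub import InferenceApi\n"
    "from huggingface_hub import InferenceClient\n"
)
print(find_legacy_imports(sample))  # [1]
```

The sketch only reports `from ... import` statements; attribute access such as `huggingface_hub.InferenceApi` would need an additional `ast.Attribute` check.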
diff --git a/docs/source/en/guides/inference.md b/docs/source/en/guides/inference.md
@@ -11,10 +11,6 @@ The `huggingface_hub` library provides a unified interface to run inference acro
 2.  [Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index): a product to easily deploy models to production. Inference is run by Hugging Face in a dedicated, fully managed infrastructure on a cloud provider of your choice.
 3.  Local endpoints: you can also run inference with local inference servers like [llama.cpp](https://github.com/ggerganov/llama.cpp), [Ollama](https://ollama.com/), [vLLM](https://github.com/vllm-project/vllm), [LiteLLM](https://docs.litellm.ai/docs/simple_proxy), or [Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) by connecting the client to these local endpoints.
 
-These services can all be called from the [`InferenceClient`] object. It acts as a replacement for the legacy
-[`InferenceApi`] client, adding specific support for tasks and third-party providers.
-Learn how to migrate to the new client in the [Legacy InferenceAPI client](#legacy-inferenceapi-client) section.
-
 <Tip>
 
 [`InferenceClient`] is a Python client making HTTP calls to our APIs. If you want to make the HTTP calls directly using

diff --git a/docs/source/en/package_reference/inference_client.md b/docs/source/en/package_reference/inference_client.md
@@ -34,16 +34,3 @@ pip install --upgrade huggingface_hub[inference]
 ## InferenceTimeoutError
 
 [[autodoc]] InferenceTimeoutError
-
-## InferenceAPI
-
-[`InferenceAPI`] is the legacy way to call the Inference API. The interface is more simplistic and requires knowing
-the input parameters and output format for each task. It also lacks the ability to connect to other services like
-Inference Endpoints or AWS SageMaker. [`InferenceAPI`] will soon be deprecated so we recommend using [`InferenceClient`]
-whenever possible. Check out [this guide](../guides/inference#legacy-inferenceapi-client) to learn how to switch from
-[`InferenceAPI`] to [`InferenceClient`] in your scripts.
-
-[[autodoc]] InferenceApi
-    - __init__
-    - __call__
-    - all
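
Reviewer note: this diff removes the `## Legacy InferenceAPI client` heading together with the links that pointed at it (`#legacy-inferenceapi-client` in the guides and `../guides/inference#legacy-inferenceapi-client` in the package reference). A quick way to confirm that no other page still links to the removed anchor could look like the sketch below. It is an illustration under assumptions, not part of the PR: the function name `dangling_links`, the regular expression, and the two sample strings are hypothetical.

```python
import re

# Heading anchor deleted by this diff.
REMOVED_ANCHOR = "legacy-inferenceapi-client"

# Matches Markdown link targets such as (#anchor) or (../guides/inference#anchor).
LINK_TARGET = re.compile(r"\]\(([^)\s]*)#([A-Za-z0-9_-]+)\)")


def dangling_links(markdown: str, removed_anchor: str = REMOVED_ANCHOR) -> list[tuple[int, str]]:
    """Return (line number, link target) pairs that still point at the removed anchor."""
    hits = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for path, anchor in LINK_TARGET.findall(line):
            if anchor == removed_anchor:
                hits.append((lineno, f"{path}#{anchor}"))
    return hits


# Hypothetical page contents before and after the change.
before = "Check out [this guide](../guides/inference#legacy-inferenceapi-client).\n"
after = "[[autodoc]] InferenceTimeoutError\n"

print(dangling_links(before))  # [(1, '../guides/inference#legacy-inferenceapi-client')]
print(dangling_links(after))   # []
```

Running the function over every file under `docs/source/` (for example via `pathlib.Path.rglob("*.md")`) would extend the check to translations that this diff does not touch.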