
Alibaba Cloud Model Studio: OpenAI-compatible Batch

Last Updated: Sep 04, 2025

Model Studio offers a Batch interface that is compatible with OpenAI. It allows for the batch submission of tasks as files and supports asynchronous execution. This service processes large-scale data offline during non-peak hours and delivers results upon task completion or when the maximum wait time is reached, at 50% of the cost of real-time API calls.

To use this feature in the console, see Batch Inference.

Prerequisites

You have obtained an Alibaba Cloud Model Studio API key and configured it as the DASHSCOPE_API_KEY environment variable.

Supported models

Text generation models: qwen-max, qwen-plus, qwen-turbo

Billing

Batch calls are charged at 50% of the price for real-time calls. For specific pricing, see Text generation - Qwen.

Batch calling does not support discounts such as free quota or context cache.

Get started

Before starting a batch, you can use batch-test-model to perform an end-to-end test, including: validating input data, creating a task, querying task results, and downloading result files. Note:

  • The test file must meet the input file format requirements. Also, it must not exceed 1 MB in size and contain no more than 100 lines.

  • Concurrency limit: Up to 2 parallel tasks.

  • Resource usage: The test model will not perform inference, so it does not incur model inference fees.

Perform the following steps:

  1. Prepare the test file

    • Download the sample file test_model.jsonl that contains request information, and make sure it is in the same directory as the Python script below.

    • Sample content: The model parameter is set to batch-test-model, and the url is set to the /v1/chat/ds-test endpoint.

      {"custom_id":"1","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
      {"custom_id":"2","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
  2. Run the script

    • Execute this Python script

      Edit file paths or other parameters as needed.
      import os
      from pathlib import Path
      from openai import OpenAI
      import time
      
      # Initialize the client
      client = OpenAI(
          # If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API Key, but it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API key leakage.
          api_key=os.getenv("DASHSCOPE_API_KEY"),
          base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # Alibaba Cloud Model Studio service base_url
      )
      
      def upload_file(file_path):
          print(f"Uploading JSONL file containing request information...")
          file_object = client.files.create(file=Path(file_path), purpose="batch")
          print(f"File uploaded successfully. File ID obtained: {file_object.id}\n")
          return file_object.id
      
      def create_batch_job(input_file_id):
          print(f"Creating a batch based on the file ID...")
          # Note: The endpoint parameter value here must be consistent with the url field in the input file. For the test model (batch-test-model), fill in /v1/chat/ds-test, for other models fill in the `/v1/chat/completions` endpoint
          batch = client.batches.create(input_file_id=input_file_id, endpoint="/v1/chat/ds-test", completion_window="24h")
          print(f"batch creation completed. batch ID obtained: {batch.id}\n")
          return batch.id
      
      def check_job_status(batch_id):
          print(f"Checking batch status...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Batch status: {batch.status}\n")
          return batch.status
      
      def get_output_id(batch_id):
          print(f"Getting the output file ID for successfully executed requests in the batch...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Output file ID: {batch.output_file_id}\n")
          return batch.output_file_id
      
      def get_error_id(batch_id):
          print(f"Getting the output file ID for failed requests in the batch...")
          batch = client.batches.retrieve(batch_id=batch_id)
          print(f"Error file ID: {batch.error_file_id}\n")
          return batch.error_file_id
      
      def download_results(output_file_id, output_file_path):
          print("Printing and downloading the successful request results of the batch...")
          content = client.files.content(output_file_id)
          # Print part of the content for testing
          print(f"First 1000 characters of the successful request results: {content.text[:1000]}...\n")
          # Save the result file locally
          content.write_to_file(output_file_path)
          print(f"Complete output results have been saved to {output_file_path}\n")
      
      def download_errors(error_file_id, error_file_path):
          print("Printing and downloading the failed request information of the batch...")
          content = client.files.content(error_file_id)
          # Print part of the content for testing
          print(f"First 1000 characters of the failed request information: {content.text[:1000]}...\n")
          # Save the error information file locally
          content.write_to_file(error_file_path)
          print(f"Complete failed request information has been saved to {error_file_path}\n")
      
      def main():
          # File paths
          input_file_path = "test_model.jsonl"  # Can be replaced with your input file path
          output_file_path = "result.jsonl"  # Can be replaced with your output file path
          error_file_path = "error.jsonl"  # Can be replaced with your error file path
          try:
              # Step 1: Upload the JSONL file containing request information to get the input file ID
              input_file_id = upload_file(input_file_path)
              # Step 2: Create a Batch based on the input file ID
              batch_id = create_batch_job(input_file_id)
              # Step 3: Poll the Batch status until it reaches a terminal state
              while True:
                  status = check_job_status(batch_id)
                  if status in ["completed", "failed", "expired", "cancelled"]:
                      break
                  print("Waiting for task completion...")
                  time.sleep(10)  # Wait 10 seconds before checking the status again
              # If the task fails, print the error message and exit
              if status == "failed":
                  batch = client.batches.retrieve(batch_id)
                  print(f"batch failed. Error message: {batch.errors}\n")
                  print(f"See error code documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
                  return
              # Step 4: Download results: If the output file ID is not empty, print the first 1000 characters of the successful request results and download the complete successful request results to the local output file;
              # If the error file ID is not empty, print the first 1000 characters of the failed request information and download the complete failed request information to the local error file.
              output_file_id = get_output_id(batch_id)
              if output_file_id:
                  download_results(output_file_id, output_file_path)
              error_file_id = get_error_id(batch_id)
              if error_file_id:
                  download_errors(error_file_id, error_file_path)
                  print(f"See error code documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
          except Exception as e:
              print(f"An error occurred: {e}")
              print(f"See error code documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
      
      if __name__ == "__main__":
          main()
  3. Verify the test results

    • The task status is completed

    • result.jsonl contains the fixed response {"content":"This is a test result."}

      {"id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","custom_id":"1","response":{"status_code":200,"request_id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-bca7295b-67c3-4b1f-8239-d78323bb669f","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
      {"id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","custom_id":"2","response":{"status_code":200,"request_id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-1e32a8ba-2b69-4dc4-be42-e2897eac9e84","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
      In case of errors, see Error messages for solutions.

After the test, follow these steps to execute a batch.

  1. Prepare an input file according to the input file format requirements. In the file, set the model parameter to a supported model, and set the url field to: /v1/chat/completions

  2. Replace the endpoint parameter in the Python script above to match the url in the input file

  3. Run the script and wait for the task to complete. If the task is successful, an output result file result.jsonl will be generated in the same directory

    If the task fails, the program will exit and print the error message.
    If an error file ID is returned, the error file error.jsonl will be generated in the same directory for troubleshooting.
    Exceptions that occur during the process will be caught and error messages will be printed.

File format

Input file format

The input file for a batch is a JSONL file with the following requirements:

  • Each line contains a request in the JSON format.

  • A single batch task can contain up to 50,000 requests.

  • The batch file's maximum size is 500 MB.

  • The maximum size for an individual line within the file is 1 MB.

  • Each line's content must comply with the context length limits specific to each model.

Set the url field in the file and the endpoint parameter in your code to /v1/chat/completions.

Single-line request example:

{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}

Multi-line request example:

{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hi, how can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}

Request parameters

  • custom_id (String, required): The user-defined request ID. Each line represents one request with a unique custom_id. After the batch ends, you can find the result for each custom_id in the result file.

  • method (String, required): The request method. Currently, only POST is supported.

  • url (String, required): The request path. Must be consistent with the endpoint field when creating a batch. For batch-test-model, set it to /v1/chat/ds-test; for other models, set it to /v1/chat/completions.

  • body (Object, required): The request body.

  • body.model (String, required): The model used for this batch. Important: all requests in a task must use the same model.

  • body.messages (Array, required): The messages array. For example:

    [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 2+2?"}
    ]

Convert CSV to JSONL

If you have a CSV file where the first column is custom_id and the second column is content, you can quickly create a JSONL file that meets the requirements using the Python code below. The CSV file must reside in the same directory as the Python script.

You can also use the template file provided in this topic. The specific steps are as follows:

  1. Download the template file and place it in the same directory as the Python script below;

  2. The CSV template file has the first column as request ID (custom_id) and the second column as content. You can paste your queries into this file.

After running the Python script code below, a JSONL file named input_demo.jsonl that meets the file format requirements will be generated in the same directory.

Edit file paths or other parameters as needed.
import csv
import json

def messages_builder_example(content):
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": content},
    ]

with open("input_demo.csv", "r", newline="", encoding="utf-8") as fin:
    with open("input_demo.jsonl", "w", encoding="utf-8") as fout:
        csvreader = csv.reader(fin)
        for row in csvreader:
            body = {"model": "qwen-turbo", "messages": messages_builder_example(row[1])}
            # Use the `/v1/chat/completions` endpoint.
            request = {"custom_id": row[0], "method": "POST", "url": "/v1/chat/completions", "body": body}
            fout.write(json.dumps(request, separators=(',', ':'), ensure_ascii=False) + "\n")

Output file format

The output is a JSONL file, with one JSON per line, corresponding to one request result.

Sample response

Single-line result example:

{"id":"73291560-xxx","custom_id":"1","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}

Multi-line result example:

{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-max","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course I can help. Whether you need information queries, learning materials, methods to solve problems, or any other assistance, I'm here to support you. Please tell me what kind of help you need?"}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}

Response parameters

  • id (String, required): The request ID.

  • custom_id (String, required): The user-defined request ID.

  • response (Object, optional): The request result.

  • error (Object, optional): The error result.

  • error.code (String, optional): The error code.

  • error.message (String, optional): The error message.

  • completion_tokens (Integer, optional): The number of tokens in the completion.

  • prompt_tokens (Integer, optional): The number of tokens in the prompt.

  • model (String, optional): The model used in this task.
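Given these fields, each output line reduces to either the model's reply or an error string. A minimal sketch (a hypothetical helper, not part of the SDK), assuming the field layout shown in the samples above:

```python
import json

def summarize_result_line(line):
    """Return (custom_id, text): the reply content on success, or the error code/message on failure."""
    record = json.loads(line)
    if record.get("error"):
        err = record["error"]
        return record["custom_id"], f"error {err.get('code')}: {err.get('message')}"
    body = record["response"]["body"]
    return record["custom_id"], body["choices"][0]["message"]["content"]
```

Iterating this over the downloaded result file lets you rejoin answers with your original requests by custom_id.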

Convert JSONL to CSV

Compared to JSONL, CSV files usually contain only the necessary data values without additional key names or other metadata, making them suitable for automated scripts and batches. If you need to convert a batch output JSONL file into a CSV file, you can use the following Python code.

Ensure that result.jsonl is in the same directory as the Python script below. After running the code below, a CSV file named result.csv will be generated.

Edit file paths or other parameters as needed.
import json
import csv
columns = ["custom_id",
           "model",
           "request_id",
           "status_code",
           "error_code",
           "error_message",
           "created",
           "content",
           "usage"]

def dict_get_string(dict_obj, path):
    """Follow a key/index path into a nested structure; return None if any step is missing."""
    obj = dict_obj
    try:
        for element in path:
            obj = obj[element]
        return obj
    except (KeyError, IndexError, TypeError):
        return None

with open("result.jsonl", "r", encoding="utf-8") as fin:
    with open("result.csv", "w", newline="", encoding="utf-8") as fout:
        rows = [columns]
        for line in fin:
            request_result = json.loads(line)
            row = [dict_get_string(request_result, ["custom_id"]),
                   dict_get_string(request_result, ["response", "body", "model"]),
                   dict_get_string(request_result, ["response", "request_id"]),
                   dict_get_string(request_result, ["response", "status_code"]),
                   dict_get_string(request_result, ["error", "error_code"]),
                   dict_get_string(request_result, ["error", "error_message"]),
                   dict_get_string(request_result, ["response", "body", "created"]),
                   dict_get_string(request_result, ["response", "body", "choices", 0, "message", "content"]),
                   dict_get_string(request_result, ["response", "body", "usage"])]
            rows.append(row)
        writer = csv.writer(fout)
        writer.writerows(rows)
If a CSV file contains Chinese characters and appears garbled when opened in Excel, use a text editor (such as Sublime Text) to convert the file's encoding to GBK and then open it in Excel again. Alternatively, create a new Excel file and specify UTF-8 as the encoding when importing the data.

Detailed process

1. Prepare and upload files

Before creating a batch, upload a JSONL file that meets the input file format requirements through the file upload interface, with the purpose parameter set to batch. Then obtain the file_id field from the response.

You can upload a single file of up to 500 MB. The Model Studio storage under each Alibaba Cloud account supports up to 10,000 files, with a total size limit of 100 GB. The files currently have no expiration date.

OpenAI Python SDK

Sample request

import os
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    # If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API key leakage.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # Alibaba Cloud Model Studio service base_url
)

# test.jsonl is a local example file, purpose must be batch
file_object = client.files.create(file=Path("test.jsonl"), purpose="batch")

print(file_object.model_dump_json())

Content of test.jsonl:

{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}

Sample response

{
    "id": "file-batch-xxx",
    "bytes": 437,
    "created_at": 1742304153,
    "filename": "test.jsonl",
    "object": "file",
    "purpose": "batch",
    "status": "processed",
    "status_details": null
}

curl

Sample request

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
--form 'file=@"test.jsonl"' \
--form 'purpose="batch"'

Content of test.jsonl:

{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "qwen-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is 2+2?"}]}}

Sample response

{
    "id": "file-batch-xxx",
    "bytes": 231,
    "created_at": 1729065815,
    "filename": "test.jsonl",
    "object": "file",
    "purpose": "batch",
    "status": "processed",
    "status_details": null
}

2. Create a batch

Use the input_file_id parameter returned in 1. Prepare and upload files to create a batch.

Rate limit: For each Alibaba Cloud account, 100 requests per minute, with a maximum of 100 running tasks (including unfinished tasks). If you exceed the maximum number of tasks, you must wait for tasks to complete before creating new ones.

OpenAI Python SDK

Sample request

import os
from openai import OpenAI

client = OpenAI(
    # If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API key leakage.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # Alibaba Cloud Model Studio service base_url
)

batch = client.batches.create(
    input_file_id="file-batch-xxx",  # File ID returned from upload
    endpoint="/v1/chat/completions",  # For test model batch-test-model set to /v1/chat/ds-test, for other models set to the `/v1/chat/completions` endpoint
    completion_window="24h",
    metadata={'ds_name':"Task Name",'ds_description':'Task Description'} # metadata, optional field, used to create task name and description
)
print(batch)

curl

Sample request

curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-batch-xxx",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata":{"ds_name":"Task Name","ds_description":"Task Description"}
  }'
Replace the value of input_file_id with the actual value.

Request parameters

  • input_file_id (String, Body, required): The ID of the input file for the batch. Use the file ID returned by the prepare and upload files interface, such as file-batch-xxx.

  • endpoint (String, Body, required): The path, which must be consistent with the url field in the input file. For batch-test-model, set it to /v1/chat/ds-test; for other models, set it to /v1/chat/completions.

  • completion_window (String, Body, required): The maximum wait time, from 24 hours to 336 hours; only integer values are supported. Units: "h" (hour) and "d" (day), such as "24h" or "14d".

  • metadata (Map, Body, optional): The task extension metadata, additional information in key-value pairs.

  • metadata.ds_name (String, Body, optional): The task name, up to 20 characters. Example: "ds_name": "Batch". If this field is defined multiple times, the last value passed is used.

  • metadata.ds_description (String, Body, optional): The task description, up to 200 characters. Example: "ds_description": "Batch inference task test". If this field is defined multiple times, the last value passed is used.
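The completion_window format (an integer with an "h" or "d" unit, between 24 and 336 hours) can be validated before creating a task. A hypothetical client-side check, not part of the API:

```python
import re

def completion_window_hours(value):
    """Parse a completion_window like '24h' or '14d' and enforce the 24-336 hour range."""
    m = re.fullmatch(r"(\d+)([hd])", value)
    if not m:
        raise ValueError("use an integer followed by 'h' or 'd', e.g. '24h' or '14d'")
    hours = int(m.group(1)) * (24 if m.group(2) == "d" else 1)
    if not 24 <= hours <= 336:
        raise ValueError("completion_window must be between 24h and 336h (14d)")
    return hours
```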

Sample response

{
    "id": "batch_xxx",
    "object": "batch",
    "endpoint": "/v1/chat/completions",
    "errors": null,
    "input_file_id": "file-batch-xxx",
    "completion_window": "24h",
    "status": "validating",
    "output_file_id": null,
    "error_file_id": null,
    "created_at": 1742367779,
    "in_progress_at": null,
    "expires_at": null,
    "finalizing_at": null,
    "completed_at": null,
    "failed_at": null,
    "expired_at": null,
    "cancelling_at": null,
    "cancelled_at": null,
    "request_counts": {
        "total": 0,
        "completed": 0,
        "failed": 0
    },
    "metadata": {
        "ds_name": "Task Name",
        "ds_description": "Task Description"
    }
}

Response parameters

  • id (String): The batch ID.

  • object (String): The object type, fixed to batch.

  • endpoint (String): The endpoint.

  • errors (Map): The error information.

  • input_file_id (String): The input file ID.

  • completion_window (String): The processing window, from 24 hours to 336 hours; only integer values are supported. Units: "h" (hour) and "d" (day), such as "24h" or "14d".

  • status (String): The task status: validating, failed, in_progress, finalizing, completed, expired, cancelling, or cancelled.

  • output_file_id (String): The output file ID for successful requests.

  • error_file_id (String): The output file ID for failed requests.

  • created_at (Integer): The timestamp (in seconds) when the task was created.

  • in_progress_at (Integer): The timestamp (in seconds) when the task started running.

  • expires_at (Integer): The timestamp (in seconds) when the task started to expire.

  • finalizing_at (Integer): The timestamp (in seconds) when the task started to finalize.

  • completed_at (Integer): The timestamp (in seconds) when the task was completed.

  • failed_at (Integer): The timestamp (in seconds) when the task failed.

  • expired_at (Integer): The timestamp (in seconds) when the task expired.

  • cancelling_at (Integer): The timestamp (in seconds) when the task was set to cancelling.

  • cancelled_at (Integer): The timestamp (in seconds) when the task was cancelled.

  • request_counts (Map): The number of requests in each status.

  • metadata (Map): The metadata information, in key-value pairs.

  • metadata.ds_name (String): The name of the current task.

  • metadata.ds_description (String): The description of the current task.
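The *_at fields are Unix timestamps in seconds, or null when the corresponding state has not been reached. A small sketch for converting them:

```python
from datetime import datetime, timezone

def batch_time(seconds):
    """Convert a batch timestamp field (seconds since the epoch) to an aware UTC datetime, or None."""
    if seconds is None:
        return None
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

For example, batch_time(1742367779), the created_at value in the sample response above, falls in March 2025 (UTC).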

3. Query and manage a batch

Query task details

Query the details of a batch by providing the task ID obtained from 2. Create a batch. Only batches created within 30 days can be queried.

Rate limit: 300 requests per minute per Alibaba Cloud account. Because a batch takes time to complete, querying once per minute after creating it is sufficient.

OpenAI Python SDK

Sample request

import os
from openai import OpenAI

client = OpenAI(
    # If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API key leakage.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # Alibaba Cloud Model Studio service base_url
)
batch = client.batches.retrieve("batch_id")  # Replace batch_id with the Batch ID
print(batch)

curl

Sample request

curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.

Request parameters

  • batch_id (String, Path, required): The ID of the batch to query, returned from 2. Create a batch. The ID starts with "batch", such as "batch_xxx".

Sample response

See the Sample response for creating a batch.

Response parameters

See the response parameters for creating a batch.

Use the returned output_file_id and error_file_id to download result files.

Query task list

Use the batches.list() method to query batches and retrieve the complete list page by page. Pass the last batch ID from the previous query result as the after parameter to get the next page of data, and use the limit parameter to control how many tasks each query returns.

Rate limit: 100 requests per minute per Alibaba Cloud account.

OpenAI Python SDK

Sample request

import os
from openai import OpenAI

client = OpenAI(
    # If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API key leakage.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # Alibaba Cloud Model Studio service base_url
)
batches = client.batches.list(after="batch_id", limit=20)  # Replace batch_id with the Batch task ID
print(batches)

curl

Sample request

curl --request GET  'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches?limit=20&after=batch_id' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id in after=batch_id with the actual value, and set limit to the desired number of tasks to return.
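The pagination scheme (pass the previous page's last_id as after while has_more is true) can be wrapped in a generator. This sketch assumes an OpenAI-style client whose batches.list result exposes data, has_more, and last_id as in the sample response below:

```python
def iter_batches(client, page_size=20):
    """Yield every batch by following last_id/has_more pagination."""
    after = None
    while True:
        if after is None:
            page = client.batches.list(limit=page_size)
        else:
            page = client.batches.list(after=after, limit=page_size)
        for batch in page.data:
            yield batch
        if not page.has_more:
            break
        after = page.last_id
```

Because it yields lazily, you can stop iterating early without fetching the remaining pages.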

Request parameters

  • after (String, Query, optional): The pagination cursor. Set after to a batch ID to query data after that task. In paged queries, assign the last batch ID (last_id) of the previous page to after to obtain the next page. For example, if a query returns 20 entries and last_id is batch_xxx, set after=batch_xxx in the next query.

  • limit (Integer, Query, optional): The number of tasks returned per query. Valid range: [1,100]. Default value: 20.

Sample response

{
  "object": "list",
  "data": [
    {
      "id": "batch_xxx",
      "object": "batch",
      "endpoint": "/v1/chat/completions",
      "errors": null,
      "input_file_id": "file-batch-xxx",
      "completion_window": "24h",
      "status": "completed",
      "output_file_id": "file-batch_output-xxx",
      "error_file_id": null,
      "created_at": 1722234109,
      "in_progress_at": 1722234109,
      "expires_at": null,
      "finalizing_at": 1722234165,
      "completed_at": 1722234165,
      "failed_at": null,
      "expired_at": null,
      "cancelling_at": null,
      "cancelled_at": null,
      "request_counts": {
        "total": 100,
        "completed": 95,
        "failed": 5
      },
      "metadata": {}
    },
    { ... }
  ],
  "first_id": "batch_xxx",
  "last_id": "batch_xxx",
  "has_more": true
}

Response parameters

  • object (String): The object type, fixed to list.

  • data (Array): The batch objects, with the same fields as the response parameters of 2. Create a batch.

  • first_id (String): The first task ID on the current page.

  • last_id (String): The last task ID on the current page.

  • has_more (Boolean): Whether the current page is followed by another page.

Cancel a task

Cancel a specific task by providing its task ID returned from 2. Create a batch.

Rate limit: 100 requests per minute per Alibaba Cloud account.

OpenAI Python SDK

Sample request

import os
from openai import OpenAI

client = OpenAI(
    # If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API key leakage.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # Alibaba Cloud Model Studio service base_url
)
batch = client.batches.cancel("batch_id")  # Replace batch_id with the Batch ID
print(batch)

curl

Sample request

curl --request POST 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id/cancel' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.

Request parameters

  • batch_id (String, Path, required): The ID of the task to cancel, starting with "batch", such as "batch_xxx".

Sample response

See the sample response for creating a batch.

Response parameters

See the response parameters for creating a batch.

4. Download result file

After a task is completed, you can download the result file.

To download the result file, you need the file_id field, which is the output_file_id parameter from querying task details or querying task list. Only files with file_id values starting with file-batch_output can be downloaded.

OpenAI Python SDK

Use the content method to obtain the content of the result file, and the write_to_file method to save it locally.

Sample request

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
content = client.files.content(file_id="file-batch_output-xxx")
# Print the result file content
print(content.text)
# Save the result file to local
content.write_to_file("result.jsonl")

Sample response

{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-plus","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. Whether you need information queries, learning materials, methods to solve problems, or any other assistance, I am here to provide support. Please tell me what kind of help you need?"}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-plus","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}

curl

Use the GET method and specify the file_id in the URL to download the result file.

Sample request

curl -X GET https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files/file-batch_output-xxx/content \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" > result.jsonl

Request parameters

Field

Type

Passing method

Required

Description

file_id

String

Path

Yes

The ID of the file to be downloaded, which is output_file_id returned from querying task details or querying task list.

Sample response

A JSONL file containing the batch results. See Output file format.
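Once result.jsonl is saved locally, each line is an independent JSON record keyed by custom_id, as shown in the sample response above. A minimal sketch for splitting the file into successes and failures (the helper name parse_batch_results is ours, not part of any SDK):

```python
import json

def parse_batch_results(path):
    """Split a batch result JSONL file into successes and failures by custom_id.

    Assumes each line follows the output file format:
    {"custom_id": ..., "response": {"status_code": ..., "body": {...}}, "error": ...}
    """
    ok, failed = {}, {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            rec = json.loads(line)
            cid = rec.get("custom_id")
            resp = rec.get("response") or {}
            if rec.get("error") is None and resp.get("status_code") == 200:
                # Successful request: keep the generated message content
                ok[cid] = resp["body"]["choices"][0]["message"]["content"]
            else:
                # Failed request: keep the error (or raw response) for inspection
                failed[cid] = rec.get("error") or resp
    return ok, failed
```

Because custom_id is preserved in the output, this mapping lets you join results back to the original requests even though result lines are not guaranteed to be in input order.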

Extended features

Filter and query task list

OpenAI Python SDK

Sample request

import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured the environment variable, replace the line below with api_key="sk-xxx", using your Alibaba Cloud Model Studio API key. Avoid hardcoding API keys in production code to reduce the risk of leakage.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # Alibaba Cloud Model Studio service base_url
)
batches = client.batches.list(
    after="batch_xxx",
    limit=2,
    extra_query={
        'ds_name': 'Task Name',
        'input_file_ids': 'file-batch-xxx,file-batch-xxx',
        'status': 'completed,expired',
        'create_after': '20250304000000',
        'create_before': '20250306123000',
    },
)
print(batches)

curl

Sample request

curl --request GET  'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches?after=batch_xxx&limit=2&ds_name=Batch&input_file_ids=file-batch-xxx,file-batch-xxx&status=completed,failed&create_after=20250303000000&create_before=20250320000000' \
 -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_xxx in after=batch_xxx with the actual value. Set limit to the number of tasks to return per page, ds_name to a substring of the task name for fuzzy matching, input_file_ids to one or more comma-separated file IDs, status to one or more comma-separated batch statuses, and create_after and create_before to time points in the yyyyMMddHHmmss format.

Request parameters

Field

Type

Passing method

Required

Description

ds_name

String

Query

No

Filter tasks by name using fuzzy matching. Specify a substring to match task names containing that text. For example, entering "Batch" matches "Batch", "Batch_20240319", and other names containing it.

input_file_ids

String

Query

No

Filter by up to 20 file IDs, separated by commas. File IDs are returned from Prepare and upload files.

status

String

Query

No

Filter by one or more statuses, separated by commas: validating, failed, in_progress, finalizing, completed, expired, cancelling, cancelled.

create_after

String

Query

No

Filter tasks created after this time point, in the format yyyyMMddHHmmss. For example, to match tasks created after March 4, 2025, 00:00:00, set this to 20250304000000.

create_before

String

Query

No

Filter tasks created before this time point, in the format yyyyMMddHHmmss. For example, to match tasks created before March 4, 2025, 12:30:00, set this to 20250304123000.
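The create_after and create_before values are plain yyyyMMddHHmmss strings, which map directly to strftime. A small sketch (the helper name to_batch_timestamp is ours):

```python
from datetime import datetime

def to_batch_timestamp(dt: datetime) -> str:
    """Format a datetime as the yyyyMMddHHmmss string used by create_after/create_before."""
    return dt.strftime("%Y%m%d%H%M%S")

# Tasks created between March 4 and March 6, 2025:
create_after = to_batch_timestamp(datetime(2025, 3, 4, 0, 0, 0))     # "20250304000000"
create_before = to_batch_timestamp(datetime(2025, 3, 6, 12, 30, 0))  # "20250306123000"
```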

Sample response

See the sample response in Query task list.

Response parameters

See the response parameters in Query task list.

Error codes

If the call fails and returns an error message, refer to the Error Documentation.

FAQ

  1. Do rate limits apply to batch requests for models?

    A: No, only real-time requests have RPM (Requests Per Minute) limits. Batch calls do not.

  2. Do I need to subscribe to batch calls or place an order for them?

    A: No, you do not need to place a separate order. You pay directly for the use of the batch interface on a pay-as-you-go basis.

  3. How does the backend process submitted batch requests? Are they executed in the order of submission?

    A: No, it is not a queuing mechanism but a scheduling mechanism. Batch tasks are scheduled and executed based on resource availability.

  4. How long does it take to complete the execution of submitted batch requests?

    A: The processing time for a batch task is determined by the system's allocation of resources.

    When system resources are constrained, tasks might not finish within the specified maximum processing time.

    Therefore, for strict real-time requirements, we recommend real-time calls. For processing large-scale data with flexible timing, we recommend batch calls.