Create a Contextual AI inference endpoint
Generally available; Added in 9.2.0
Create an inference endpoint to perform an inference task with the contextualai service.
To review the available rerank models, refer to https://docs.contextual.ai/api-reference/rerank/rerank#body-model.
Required authorization
- Cluster privileges:
manage_inference
Path parameters
- task_type: The type of the inference task that the model will perform. Value is rerank.
- contextualai_inference_id: The unique identifier of the inference endpoint.
Query parameters
- timeout: Specifies the amount of time to wait for the inference endpoint to be created.
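For example, to wait longer than the default before the create request times out, append the timeout query parameter to the request path (the 60s value below is illustrative):
Console
PUT _inference/rerank/contextualai-rerank?timeout=60s
{
  "service": "contextualai",
  "service_settings": {
    "api_key": "ContextualAI-Api-key",
    "model_id": "ctxl-rerank-v2-instruct-multilingual-mini"
  }
}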
Body
- chunking_settings: The chunking configuration object.
- service: The type of service supported for the specified task type. In this case, contextualai. Value is contextualai.
- service_settings: Settings used to install the inference model. These settings are specific to the contextualai service.
- task_settings: Settings to configure the inference task. These settings are specific to the task type you specified.
PUT /_inference/{task_type}/{contextualai_inference_id}
Console
PUT _inference/rerank/contextualai-rerank
{
"service": "contextualai",
"service_settings": {
"api_key": "ContextualAI-Api-key",
"model_id": "ctxl-rerank-v2-instruct-multilingual-mini"
},
"task_settings": {
"instruction": "Rerank the following documents based on their relevance to the query.",
"top_k": 3
}
}
Python
resp = client.inference.put(
task_type="rerank",
inference_id="contextualai-rerank",
inference_config={
"service": "contextualai",
"service_settings": {
"api_key": "ContextualAI-Api-key",
"model_id": "ctxl-rerank-v2-instruct-multilingual-mini"
},
"task_settings": {
"instruction": "Rerank the following documents based on their relevance to the query.",
"top_k": 3
}
},
)
JavaScript
const response = await client.inference.put({
task_type: "rerank",
inference_id: "contextualai-rerank",
inference_config: {
service: "contextualai",
service_settings: {
api_key: "ContextualAI-Api-key",
model_id: "ctxl-rerank-v2-instruct-multilingual-mini",
},
task_settings: {
instruction:
"Rerank the following documents based on their relevance to the query.",
top_k: 3,
},
},
});
Ruby
response = client.inference.put(
task_type: "rerank",
inference_id: "contextualai-rerank",
body: {
"service": "contextualai",
"service_settings": {
"api_key": "ContextualAI-Api-key",
"model_id": "ctxl-rerank-v2-instruct-multilingual-mini"
},
"task_settings": {
"instruction": "Rerank the following documents based on their relevance to the query.",
"top_k": 3
}
}
)
PHP
$resp = $client->inference()->put([
"task_type" => "rerank",
"inference_id" => "contextualai-rerank",
"body" => [
"service" => "contextualai",
"service_settings" => [
"api_key" => "ContextualAI-Api-key",
"model_id" => "ctxl-rerank-v2-instruct-multilingual-mini",
],
"task_settings" => [
"instruction" => "Rerank the following documents based on their relevance to the query.",
"top_k" => 3,
],
],
]);
curl
curl -X PUT -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"service":"contextualai","service_settings":{"api_key":"ContextualAI-Api-key","model_id":"ctxl-rerank-v2-instruct-multilingual-mini"},"task_settings":{"instruction":"Rerank the following documents based on their relevance to the query.","top_k":3}}' "$ELASTICSEARCH_URL/_inference/rerank/contextualai-rerank"
Request example
Run `PUT _inference/rerank/contextualai-rerank` to create an inference endpoint for rerank tasks using the Contextual AI service.
{
"service": "contextualai",
"service_settings": {
"api_key": "ContextualAI-Api-key",
"model_id": "ctxl-rerank-v2-instruct-multilingual-mini"
},
"task_settings": {
"instruction": "Rerank the following documents based on their relevance to the query.",
"top_k": 3
}
}
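After the endpoint is created, you can reference it in rerank inference requests. The following is a minimal sketch of such a call, assuming the standard rerank inference request format (a query plus a list of input documents); the query and document values are illustrative:
Console
POST _inference/rerank/contextualai-rerank
{
  "query": "What is Elasticsearch?",
  "input": [
    "Elasticsearch is a distributed search and analytics engine.",
    "Contextual AI offers instruction-following reranker models.",
    "The quick brown fox jumps over the lazy dog."
  ]
}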