[ML] Removing custom service from service api #130739

Conversation

jonathan-buttner
Contributor

This PR removes the custom service from the Services API so it is not exposed to the UI.

The reasoning is that the custom service requires many configuration fields that are not yet supported in the UI.
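To illustrate the expected effect, here is a minimal sketch that parses a trimmed-down sample of a services response (the field names mirror the full example below; the sample data itself is abbreviated for brevity) and confirms that no entry advertises the custom service:

```python
import json

# Abbreviated sample of a services API response, modeled on the full
# example below (only two entries kept for illustration).
services_response = json.loads("""
[
    {"service": "alibabacloud-ai-search", "name": "AlibabaCloud AI Search",
     "task_types": ["text_embedding", "sparse_embedding", "rerank", "completion"]},
    {"service": "amazonbedrock", "name": "Amazon Bedrock",
     "task_types": ["text_embedding", "completion"]}
]
""")

# After this change, no entry in the response should report the custom service.
custom_entries = [s for s in services_response if s["service"] == "custom"]
print(custom_entries)  # → []
```

A UI consuming this response can therefore assume every listed service is fully configurable through the supported fields.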

Example services response without the custom service:

[
    {
        "service": "alibabacloud-ai-search",
        "name": "AlibabaCloud AI Search",
        "task_types": [
            "text_embedding",
            "sparse_embedding",
            "rerank",
            "completion"
        ],
        "configurations": {
            "workspace": {
                "description": "The name of the workspace used for the {infer} task.",
                "label": "Workspace",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion"
                ]
            },
            "api_key": {
                "description": "A valid API key for the AlibabaCloud AI Search API.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion"
                ]
            },
            "service_id": {
                "description": "The name of the model service to use for the {infer} task.",
                "label": "Project ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion"
                ]
            },
            "host": {
                "description": "The name of the host address used for the {infer} task. You can find the host address at https://opensearch.console.aliyun.com/cn-shanghai/rag/api-key[ the API keys section] of the documentation.",
                "label": "Host",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion"
                ]
            },
            "http_schema": {
                "description": "",
                "label": "HTTP Schema",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion"
                ]
            }
        }
    },
    {
        "service": "amazon_sagemaker",
        "name": "Amazon SageMaker",
        "task_types": [
            "text_embedding",
            "sparse_embedding",
            "rerank",
            "completion",
            "chat_completion"
        ],
        "configurations": {
            "batch_size": {
                "description": "The maximum size a single chunk of input can be when chunking input for semantic text.",
                "label": "Batch Size",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "endpoint_name": {
                "description": "The name specified when creating the SageMaker Endpoint.",
                "label": "Endpoint Name",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "target_model": {
                "description": "The model to request when calling a SageMaker multi-model Endpoint.",
                "label": "Target Model",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "enable_explanations": {
                "description": "JMESPath expression overriding the ClarifyingExplainerConfig in the SageMaker Endpoint Configuration.",
                "label": "Enable Explanations",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "session_id": {
                "description": "Creates or reuses an existing Session for SageMaker stateful models.",
                "label": "Session ID",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "custom_attributes": {
                "description": "An opaque informational value forwarded as-is to the model within SageMaker.",
                "label": "Custom Attributes",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "secret_key": {
                "description": "A valid AWS secret key that is paired with the access_key.",
                "label": "Secret Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "inference_id": {
                "description": "Informational identifying for auditing requests within the SageMaker Endpoint.",
                "label": "Inference ID",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "access_key": {
                "description": "A valid AWS access key that has permissions to use Amazon Bedrock.",
                "label": "Access Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "target_variant": {
                "description": "The production variant when calling the SageMaker Endpoint",
                "label": "Target Variant",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "api": {
                "description": "The API format that your SageMaker Endpoint expects.",
                "label": "API",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "region": {
                "description": "The AWS region that your model or ARN is deployed in.",
                "label": "Region",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "target_container_hostname": {
                "description": "The hostname of the container when calling a SageMaker multi-container Endpoint.",
                "label": "Target Container Hostname",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            }
        }
    },
    {
        "service": "amazonbedrock",
        "name": "Amazon Bedrock",
        "task_types": [
            "text_embedding",
            "completion"
        ],
        "configurations": {
            "secret_key": {
                "description": "A valid AWS secret key that is paired with the access_key.",
                "label": "Secret Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "provider": {
                "description": "The model provider for your deployment.",
                "label": "Provider",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "access_key": {
                "description": "A valid AWS access key that has permissions to use Amazon Bedrock.",
                "label": "Access Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "model": {
                "description": "The base model ID or an ARN to a custom model based on a foundational model.",
                "label": "Model",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "By default, the amazonbedrock service sets the number of requests allowed per minute to 240.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "region": {
                "description": "The region that your model or ARN is deployed in.",
                "label": "Region",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "dimensions": {
                "description": "The number of dimensions the resulting embeddings should have. For more information refer to https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-titan-embed-text.html.",
                "label": "Dimensions",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding"
                ]
            }
        }
    },
    {
        "service": "anthropic",
        "name": "Anthropic",
        "task_types": [
            "completion"
        ],
        "configurations": {
            "api_key": {
                "description": "API Key for the provider you're connecting to.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "By default, the anthropic service sets the number of requests allowed per minute to 50.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "completion"
                ]
            },
            "model_id": {
                "description": "The name of the model to use for the inference task.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "completion"
                ]
            }
        }
    },
    {
        "service": "azureaistudio",
        "name": "Azure AI Studio",
        "task_types": [
            "text_embedding",
            "completion"
        ],
        "configurations": {
            "endpoint_type": {
                "description": "Specifies the type of endpoint that is used in your model deployment.",
                "label": "Endpoint Type",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "provider": {
                "description": "The model provider for your deployment.",
                "label": "Provider",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "api_key": {
                "description": "API Key for the provider you're connecting to.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "target": {
                "description": "The target URL of your Azure AI Studio model deployment.",
                "label": "Target",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "dimensions": {
                "description": "The number of dimensions the resulting embeddings should have. For more information refer to https://learn.microsoft.com/en-us/azure/ai-studio/reference/reference-model-inference-embeddings.",
                "label": "Dimensions",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding"
                ]
            }
        }
    },
    {
        "service": "azureopenai",
        "name": "Azure OpenAI",
        "task_types": [
            "text_embedding",
            "completion"
        ],
        "configurations": {
            "api_key": {
                "description": "You must provide either an API key or an Entra ID.",
                "label": "API Key",
                "required": false,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "entra_id": {
                "description": "You must provide either an API key or an Entra ID.",
                "label": "Entra ID",
                "required": false,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "The azureopenai service sets a default number of requests allowed per minute depending on the task type.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "deployment_id": {
                "description": "The deployment name of your deployed models.",
                "label": "Deployment ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "resource_name": {
                "description": "The name of your Azure OpenAI resource.",
                "label": "Resource Name",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "api_version": {
                "description": "The Azure API version ID to use.",
                "label": "API Version",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "dimensions": {
                "description": "The number of dimensions the resulting embeddings should have. For more information refer to https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#request-body-1.",
                "label": "Dimensions",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding"
                ]
            }
        }
    },
    {
        "service": "cohere",
        "name": "Cohere",
        "task_types": [
            "text_embedding",
            "rerank",
            "completion"
        ],
        "configurations": {
            "api_key": {
                "description": "API Key for the provider you're connecting to.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank",
                    "completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "rerank",
                    "completion"
                ]
            },
            "model_id": {
                "description": "The name of the model to use for the inference task.",
                "label": "Model ID",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank",
                    "completion"
                ]
            }
        }
    },
    {
        "service": "deepseek",
        "name": "DeepSeek",
        "task_types": [
            "completion",
            "chat_completion"
        ],
        "configurations": {
            "api_key": {
                "description": "The DeepSeek API authentication key. For more details about generating DeepSeek API keys, refer to https://api-docs.deepseek.com.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "completion",
                    "chat_completion"
                ]
            },
            "model_id": {
                "description": "The name of the model to use for the inference task.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "completion",
                    "chat_completion"
                ]
            },
            "url": {
                "default_value": "https://api.deepseek.com/chat/completions",
                "description": "The URL endpoint to use for the requests.",
                "label": "URL",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "completion",
                    "chat_completion"
                ]
            }
        }
    },
    {
        "service": "elasticsearch",
        "name": "Elasticsearch",
        "task_types": [
            "text_embedding",
            "sparse_embedding",
            "rerank"
        ],
        "configurations": {
            "num_allocations": {
                "default_value": 1,
                "description": "The total number of allocations this model is assigned across machine learning nodes.",
                "label": "Number Allocations",
                "required": true,
                "sensitive": false,
                "updatable": true,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank"
                ]
            },
            "num_threads": {
                "default_value": 2,
                "description": "Sets the number of threads used by each model allocation during inference.",
                "label": "Number Threads",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank"
                ]
            },
            "model_id": {
                "description": "The name of the model to use for the inference task.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank"
                ]
            }
        }
    },
    {
        "service": "googleaistudio",
        "name": "Google AI Studio",
        "task_types": [
            "text_embedding",
            "completion"
        ],
        "configurations": {
            "api_key": {
                "description": "API Key for the provider you're connecting to.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            },
            "model_id": {
                "description": "ID of the LLM you're using.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion"
                ]
            }
        }
    },
    {
        "service": "googlevertexai",
        "name": "Google Vertex AI",
        "task_types": [
            "text_embedding",
            "rerank",
            "completion",
            "chat_completion"
        ],
        "configurations": {
            "service_account_json": {
                "description": "API Key for the provider you're connecting to.",
                "label": "Credentials JSON",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "project_id": {
                "description": "The GCP Project ID which has Vertex AI API(s) enabled. For more information on the URL, refer to the {geminiVertexAIDocs}.",
                "label": "GCP Project",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "location": {
                "description": "Please provide the GCP region where the Vertex AI API(s) is enabled. For more information, refer to the {geminiVertexAIDocs}.",
                "label": "GCP Region",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "model_id": {
                "description": "ID of the LLM you're using.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            }
        }
    },
    {
        "service": "hugging_face",
        "name": "Hugging Face",
        "task_types": [
            "text_embedding",
            "sparse_embedding",
            "rerank",
            "completion",
            "chat_completion"
        ],
        "configurations": {
            "api_key": {
                "description": "API Key for the provider you're connecting to.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            },
            "url": {
                "description": "The URL endpoint to use for the requests.",
                "label": "URL",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "sparse_embedding",
                    "rerank",
                    "completion",
                    "chat_completion"
                ]
            }
        }
    },
    {
        "service": "jinaai",
        "name": "Jina AI",
        "task_types": [
            "text_embedding",
            "rerank"
        ],
        "configurations": {
            "api_key": {
                "description": "API Key for the provider you're connecting to.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "rerank"
                ]
            },
            "model_id": {
                "description": "The name of the model to use for the inference task.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank"
                ]
            },
            "dimensions": {
                "description": "The number of dimensions the resulting embeddings should have. For more information refer to https://api.jina.ai/redoc#tag/embeddings/operation/create_embedding_v1_embeddings_post.",
                "label": "Dimensions",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding"
                ]
            }
        }
    },
    {
        "service": "mistral",
        "name": "Mistral",
        "task_types": [
            "text_embedding",
            "completion",
            "chat_completion"
        ],
        "configurations": {
            "api_key": {
                "description": "API Key for the provider you're connecting to.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "model": {
                "description": "Refer to the Mistral models documentation for the list of available text embedding models.",
                "label": "Model",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "max_input_tokens": {
                "description": "Allows you to specify the maximum number of tokens per input.",
                "label": "Maximum Input Tokens",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            }
        }
    },
    {
        "service": "openai",
        "name": "OpenAI",
        "task_types": [
            "text_embedding",
            "completion",
            "chat_completion"
        ],
        "configurations": {
            "api_key": {
                "description": "The OpenAI API authentication key. For more details about generating OpenAI API keys, refer to the https://platform.openai.com/account/api-keys.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "organization_id": {
                "description": "The unique identifier of your organization.",
                "label": "Organization ID",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Default number of requests allowed per minute. For text_embedding is 3000. For completion is 500.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "model_id": {
                "description": "The name of the model to use for the inference task.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "url": {
                "description": "The absolute URL of the external service to send requests to.",
                "label": "URL",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "dimensions": {
                "description": "The number of dimensions the resulting embeddings should have. For more information refer to https://platform.openai.com/docs/api-reference/embeddings/create#embeddings-create-dimensions.",
                "label": "Dimensions",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding"
                ]
            }
        }
    },
    {
        "service": "voyageai",
        "name": "Voyage AI",
        "task_types": [
            "text_embedding",
            "rerank"
        ],
        "configurations": {
            "api_key": {
                "description": "API Key for the provider you're connecting to.",
                "label": "API Key",
                "required": true,
                "sensitive": true,
                "updatable": true,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank"
                ]
            },
            "rate_limit.requests_per_minute": {
                "description": "Minimize the number of rate limit errors.",
                "label": "Rate Limit",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding",
                    "rerank"
                ]
            },
            "model_id": {
                "description": "The name of the model to use for the inference task.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "rerank"
                ]
            }
        }
    },
    {
        "service": "watsonxai",
        "name": "IBM watsonx",
        "task_types": [
            "text_embedding",
            "completion",
            "chat_completion"
        ],
        "configurations": {
            "project_id": {
                "description": "",
                "label": "Project ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "model_id": {
                "description": "The name of the model to use for the inference task.",
                "label": "Model ID",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "api_version": {
                "description": "The IBM watsonx API version ID to use.",
                "label": "API Version",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            },
            "max_input_tokens": {
                "description": "Allows you to specify the maximum number of tokens per input.",
                "label": "Maximum Input Tokens",
                "required": false,
                "sensitive": false,
                "updatable": false,
                "type": "int",
                "supported_task_types": [
                    "text_embedding"
                ]
            },
            "url": {
                "description": "",
                "label": "URL",
                "required": true,
                "sensitive": false,
                "updatable": false,
                "type": "str",
                "supported_task_types": [
                    "text_embedding",
                    "completion",
                    "chat_completion"
                ]
            }
        }
    }
]

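The change can be verified programmatically by checking that no entry in the services response names the custom service. A minimal sketch, assuming the JSON array above is available as a string and that the removed service is identified by `"service": "custom"` (the helper name and abbreviated sample below are illustrative, not part of the actual API):

```python
import json

def assert_no_custom_service(services_json: str) -> list[str]:
    """Parse a services API response and verify the custom service is absent."""
    services = json.loads(services_json)
    names = [entry["service"] for entry in services]
    # The PR removes the custom service from this response, so it must not appear.
    assert "custom" not in names, "custom service should not be exposed to the UI"
    return names

# Abbreviated sample mirroring the shape of the response above.
sample = json.dumps([
    {"service": "openai", "name": "OpenAI", "task_types": ["text_embedding"], "configurations": {}},
    {"service": "voyageai", "name": "Voyage AI", "task_types": ["rerank"], "configurations": {}},
])
print(assert_no_custom_service(sample))  # ['openai', 'voyageai']
```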
@jonathan-buttner jonathan-buttner added >non-issue :ml Machine learning Team:ML Meta label for the ML team auto-backport Automatically create backport pull requests when merged v8.19.0 v9.1.0 v9.2.0 labels Jul 7, 2025
@jonathan-buttner jonathan-buttner marked this pull request as ready for review July 7, 2025 19:53
@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

@jonathan-buttner jonathan-buttner merged commit 02b2f5e into elastic:main Jul 7, 2025
33 checks passed
@elasticsearchmachine (Collaborator)

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts
9.1

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 130739

jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Jul 7, 2025
* Removing custom service from service api

* Fixing tests
@jonathan-buttner (Contributor, Author)

💚 All backports created successfully

Status Branch Result
8.19

Questions?

Please refer to the Backport tool documentation

elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
* Removing custom service from service api

* Fixing tests