This processor integrates the Nvidia Prompt Task and Complexity Classifier model into the F5 AI Gateway ecosystem. It analyzes incoming prompts (specifically, non-system messages) to classify them based on task type and various complexity dimensions.
The results of the classification are added as tags to the request, which can then be used within the AI Gateway configuration for routing, policy enforcement, or observability.
- Input Processing: Receives the request object containing the prompt messages.
- Text Extraction: Filters out messages with the `SYSTEM` role and concatenates the content of the remaining messages.
- Configurable History Window: By default, only the last 2 non-system messages are used for classification. This can be changed by setting the `COMPLEXITY_HISTORY_LEN` environment variable (see below).
- Tokenization: Uses the tokenizer associated with the `nvidia/prompt-task-and-complexity-classifier` model to prepare the text for the model.
- Model Inference: Feeds the tokenized input into a custom model (`CustomModel`) which includes:
  - A `microsoft/DeBERTa-v3-base` backbone.
  - Multiple classification heads tailored for the specific tasks and complexity dimensions defined by the Nvidia model.
- Result Processing: Calculates the final scores and labels from the model's output logits. This includes:
- Identifying the primary task type (e.g., "Text Generation", "Summarization").
- Calculating scores for complexity dimensions (Creativity, Reasoning, etc.).
- Computing an overall complexity score based on a weighted sum of the dimensions.
- Tagging: If the processor parameter `annotate` is `true` (default), it adds the following tags to the request:
  - `Task`: The predicted primary task type.
  - `Complexity`: The overall calculated complexity score.
  - `Creativity`: The creativity score.
  - `Reasoning`: The reasoning score.
  - `Contextual Knowledge`: The contextual knowledge score.
  - `Domain Knowledge`: The domain knowledge score.
  - `Constraints`: The constraints score.
  - `# of Few Shots`: The score related to the number of few-shot examples detected (often 0 if none are present).
  - `History Length`: The number of non-system messages actually used for classification in this request.
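The extraction and scoring steps above can be sketched as follows. This is a minimal illustration: `extract_text`, `overall_complexity`, and the weights are hypothetical stand-ins for the processor's actual logic and the Nvidia model's published weights.

```python
# Illustrative sketch only; the real processor uses the tokenizer, model,
# and dimension weights from nvidia/prompt-task-and-complexity-classifier.
HISTORY_LEN = 2  # default history window


def extract_text(messages, history_len=HISTORY_LEN):
    """Concatenate the content of the last `history_len` non-system messages."""
    non_system = [m["content"] for m in messages if m["role"] != "system"]
    return " ".join(non_system[-history_len:])


def overall_complexity(scores, weights):
    """Overall complexity as a weighted sum of per-dimension scores."""
    return sum(scores[dim] * w for dim, w in weights.items())


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this article."},
    {"role": "assistant", "content": "Sure, here is a summary."},
    {"role": "user", "content": "Now make it one sentence."},
]
text = extract_text(messages)  # only the last 2 non-system messages survive
```

Note that the system message never reaches the classifier, so routing decisions are driven purely by user/assistant turns.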
- Hugging Face Model: `nvidia/prompt-task-and-complexity-classifier`
- Backbone: `microsoft/DeBERTa-v3-base`
- `f5-ai-gateway-sdk`
- `transformers`
- `torch`
- `numpy`
- `huggingface_hub`
- `starlette` (for running the processor service)
- An ASGI server (e.g., `uvicorn`)
By default, the classifier uses only the last 2 non-system messages from the chat history for its analysis.
You can override this by setting the `COMPLEXITY_HISTORY_LEN` environment variable when running the processor, for example:
```bash
# Use the last 5 non-system messages for classification
export COMPLEXITY_HISTORY_LEN=5
python -m uvicorn complexity-classifier:app --host 127.0.0.1 --port 9999
```

If running in Docker, you can override the default in your `docker run` command:

```bash
docker run -e COMPLEXITY_HISTORY_LEN=4 -p 9999:9999 your-image-name
```

The actual number of messages used for each request is included in both the JSON result and as a tag (`History Length`).
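Inside the processor, such an override can be read in the usual way. The sketch below shows one robust pattern; the actual handling in `complexity-classifier.py` may differ:

```python
import os


def history_len(default=2):
    """Read COMPLEXITY_HISTORY_LEN, falling back to the default window of 2
    when the variable is unset or not a valid integer."""
    try:
        return int(os.environ.get("COMPLEXITY_HISTORY_LEN", default))
    except ValueError:
        return default
```

Validating the value up front avoids a crash at request time if the variable is set to something non-numeric.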
The processor is built as a standard ASGI application using Starlette. You can run it locally using an ASGI server like Uvicorn:
```bash
# Ensure you are in the directory containing complexity-classifier.py
pip install uvicorn transformers torch numpy huggingface_hub f5-ai-gateway-sdk starlette

# Run the server (adjust host/port as needed)
python -m uvicorn complexity-classifier:app --host 127.0.0.1 --port 9999
```

To use this processor within F5 AI Gateway, configure it in your `aigw.yml` file under the `processors` section:
```yaml
processors:
  - name: complexity-classifier # Or any name you prefer
    type: external
    config:
      endpoint: "localhost:9999" # Adjust if running elsewhere
    namespace: f5 # Must match the namespace in the processor code
    version: 1 # Must match the version in the processor code
    # Optional: Add params if you need to override defaults (e.g., disable annotation)
    # params:
    #   annotate: false
```

You can then reference this processor by its name (`complexity-classifier` in this example) in the `steps` section of your policies.
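A minimal, hypothetical fragment of such a step is shown below; the surrounding policy/profile structure depends on your AI Gateway version, so treat the field names here as assumptions to check against your gateway's configuration reference:

```yaml
# Hypothetical fragment: a processor step inside a policy's steps list.
steps:
  - name: complexity-classifier
```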