# Prompt Customization

**NOTE**: This documentation is intended for developers who want to extend/improve the support for different LLM engines.

## Task-oriented Prompting

The interaction with the LLM is designed in a task-oriented way, i.e., each time the LLM is called, it must perform a specific task. The most important tasks, which are part of the [guardrails process](../../architecture/README.md#the-guardrails-process), are:

1. `generate_user_intent`: generate the canonical user message from the raw utterance (e.g., "Hello there" -> `express greeting`);
2. `generate_next_steps`: decide what the bot should say or what action should be executed (e.g., `bot express greeting`, `bot respond to question`);
3. `generate_bot_message`: decide the exact bot message that should be returned.

Check out the [Task type](../../../nemoguardrails/llm/types.py) for the complete list of tasks.
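
For orientation, the task identifiers used in prompt configurations are the string values of the members of that `Task` enum. A minimal sketch (the member names below are inferred from the task identifiers above, so treat them as assumptions):

```python
from nemoguardrails.llm.types import Task

# The task names used in a prompts.yml map to Task enum values
# (member names assumed here from the identifiers above).
print(Task.GENERATE_USER_INTENT.value)  # "generate_user_intent"
print(Task.GENERATE_NEXT_STEPS.value)   # "generate_next_steps"
print(Task.GENERATE_BOT_MESSAGE.value)  # "generate_bot_message"
```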

## Prompt Configuration

The toolkit provides predefined prompts for each task and for certain LLM models. They are located in the [nemoguardrails/llm/prompts](../../../nemoguardrails/llm/prompts) folder. You can customize the prompts further by including a `prompts.yml` file in a guardrails configuration (technically, the file name is not essential, and you can also include the `prompts` key in the general `config.yml` file).

To override the prompt for a specific model, you need to specify the `models` key:

```yaml
prompts:
  - task: general
    models:
      - databricks/dolly-v2-3b
    content: |-
      ...

  - task: generate_user_intent
    models:
      - databricks/dolly-v2-3b
    content: |-
      ...

  - ...
```

You can associate a prompt for a specific task with multiple LLM models:

```yaml
prompts:
  - task: generate_user_intent
    models:
      - openai/gpt-3.5-turbo
      - openai/gpt-4

...
```

### Prompt Templates

Depending on the type of LLM, there are two types of templates you can define: **completion** and **chat**. For completion models (e.g., `text-davinci-003`), you need to include the `content` key in the configuration of a prompt:

```yaml
prompts:
  - task: generate_user_intent
    models:
      - openai/text-davinci-003
    content: |-
      ...
```

For chat models (e.g., `gpt-3.5-turbo`), you need to include the `messages` key in the configuration of a prompt:

```yaml
prompts:
  - task: generate_user_intent
    models:
      - openai/gpt-3.5-turbo
    messages:
      - type: system
        content: ...
      - type: user
        content: ...
      - type: bot
        content: ...
      # ...
```

### Content Template

The content for a completion prompt or the body for a message in a chat prompt is a string that can also include variables and potentially other types of constructs. NeMo Guardrails uses [Jinja2](https://jinja.palletsprojects.com/) as the templating engine. Check out the [Jinja Synopsis](https://jinja.palletsprojects.com/en/3.1.x/templates/#synopsis) for a quick introduction.

As an example, the default template for the `generate_user_intent` task is the following:

```
"""
{{ general_instruction }}
"""

# This is how a conversation between a user and the bot can go:
{{ sample_conversation }}

# This is how the user talks:
{{ examples }}

# This is the current conversation between the user and the bot:
{{ sample_conversation | first_turns(2) }}
{{ history | colang }}
```
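
To see how such a template becomes a concrete prompt, here is a minimal, self-contained Jinja2 rendering sketch; the variable values are made-up stand-ins for what the toolkit supplies at runtime (the custom filters such as `colang` are covered below):

```python
from jinja2 import Template

# Minimal sketch: how prompt variables are substituted into a template.
# The values below are made-up stand-ins, not what the toolkit produces.
template = Template(
    '"""\n'
    "{{ general_instruction }}\n"
    '"""\n\n'
    "# This is the current conversation between the user and the bot:\n"
    "{{ history }}"
)

print(
    template.render(
        general_instruction="You are a helpful and polite assistant.",
        history='user "Hello there"\n  express greeting',
    )
)
```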

#### Variables

There are three types of variables available to be included in the prompt:

1. System variables
2. Prompt variables
3. Context variables

##### System Variables

The following is the list of system variables:

- `general_instruction`: the content corresponds to the [general instructions](../../user_guide/configuration-guide.md#general-instruction) specified in the configuration;
- `sample_conversation`: the content corresponds to the [sample conversation](../../user_guide/configuration-guide.md#sample-conversation) specified in the configuration;
- `examples`: depending on the task, this variable will contain the few-shot examples that the LLM should take into account;
- `history`: contains the history of events (see the [complete example](../../architecture/README.md#complete-example));
- `relevant_chunks`: (only available for the `generate_bot_message` task) if a knowledge base is used, this variable will contain the most relevant chunks of text based on the user query.

##### Prompt Variables

Prompt variables can be registered using the `LLMRails.register_prompt_context(name, value_or_fn)` method. If a function is provided, the value of the variable is computed anew on every prompt rendering.
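
For example, a minimal sketch of registering one static and one computed prompt variable (the configuration path and variable names are hypothetical):

```python
from datetime import datetime

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("path/to/config")  # hypothetical config folder
rails = LLMRails(config)

# Static value: available in prompt templates as {{ app_name }}.
rails.register_prompt_context("app_name", "Support Assistant")

# Callable: re-evaluated on every prompt rendering, e.g. {{ current_date }}.
rails.register_prompt_context(
    "current_date", lambda: datetime.now().strftime("%Y-%m-%d")
)
```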

##### Context Variables

Flows included in a guardrails configuration can define (and update) various [context variables](../../../docs/user_guide/colang-language-syntax-guide.md#variables). These can also be included in a prompt if needed.

#### Filters

The concept of filters is the same as in Jinja (see [Jinja filters](https://jinja.palletsprojects.com/en/3.1.x/templates/#filters)). Filters can modify the content of a variable, and you can apply multiple filters using the pipe symbol (`|`).

The list of predefined filters is the following (a sketch of how such a filter plugs into Jinja follows the list):

- `colang`: transforms an array of events into the equivalent colang representation;
- `remove_text_messages`: removes the text messages from a colang history (leaving only the user intents, bot intents, and other actions);
- `first_turns(n)`: limits a colang history to the first `n` turns;
- `user_assistant_sequence`: transforms an array of events into a "User: .../Assistant: ..." sequence;
- `to_messages`: transforms a colang history into a sequence of user and bot messages (intended for chat models);
- `verbose_v1`: transforms a colang history into a more verbose and explicit form.
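
The sketch below shows the mechanics only: a hypothetical stand-in for a filter like `first_turns(n)`, registered in a plain Jinja2 environment. It is not the toolkit's actual implementation, and the turn-detection heuristic is assumed purely for illustration:

```python
from jinja2 import Environment

def first_turns(colang_history: str, n: int) -> str:
    """Hypothetical stand-in: keep the first n turns of a colang history.

    Assumes, for illustration only, that each turn starts with a line
    beginning with "user ".
    """
    turns = []
    for line in colang_history.splitlines():
        if line.startswith("user ") or not turns:
            turns.append([])
        turns[-1].append(line)
    return "\n".join("\n".join(turn) for turn in turns[:n])

env = Environment()
env.filters["first_turns"] = first_turns

sample = (
    'user "Hello there!"\n'
    "  express greeting\n"
    "bot express greeting\n"
    '  "Hello! How can I help you?"\n'
    'user "What can you do?"\n'
    "  ask about capabilities\n"
)
template = env.from_string("{{ sample_conversation | first_turns(1) }}")
print(template.render(sample_conversation=sample))
# Prints only the first turn (the first four lines of the sample).
```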

#### Output Parsers

Optionally, the output from the LLM can be parsed using an *output parser*. The list of predefined parsers is the following (a conceptual sketch of the prefix-stripping behavior follows the list):

- `user_intent`: parses the user intent, i.e., removes the "User intent:" prefix if present;
- `bot_intent`: parses the bot intent, i.e., removes the "Bot intent:" prefix if present;
- `bot_message`: parses the bot message, i.e., removes the "Bot message:" prefix if present;
- `verbose_v1`: parses the output of the `verbose_v1` filter.
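
Conceptually, the prefix-stripping parsers do something like the following (a hypothetical sketch, not the toolkit's actual code):

```python
def strip_prefix(text: str, prefix: str) -> str:
    """Hypothetical sketch of a prefix-stripping output parser."""
    text = text.strip()
    if text.startswith(prefix):
        text = text[len(prefix):].strip()
    return text

# e.g., what the `user_intent` parser is described as doing above:
print(strip_prefix("User intent: express greeting", "User intent:"))
# -> "express greeting"
```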

## Predefined Prompts

Currently, the NeMo Guardrails toolkit includes prompts for `openai/text-davinci-003`, `openai/gpt-3.5-turbo`, `openai/gpt-4`, `databricks/dolly-v2-3b`, `cohere/command`, `cohere/command-light`, and `cohere/command-light-nightly`.

**DISCLAIMER**: Evaluating and improving the provided prompts is a work in progress. We do not recommend deploying this alpha version using these prompts in a production setting.