Commit fd075d3

changed flows and moved files according to suggested changes
1 parent 849e23d commit fd075d3

13 files changed, +115 -141 lines changed


docs/user_guides/guardrails-library.md

Lines changed: 22 additions & 0 deletions
@@ -717,6 +717,28 @@ define flow
   bot provide report answer
 ```

+
+### AutoGuard
+
+NeMo Guardrails provides an interface for using the AutoAlign AI's AutoGuard guardrails
+(you need to have the `AUTOGUARD_API_KEY` environment variable set).
+
+
+Following is the list of guardrails that are currently supported:
+1. Gender bias Detection
+2. Harm Detection
+3. Jailbreak Detection
+4. Confidential Detection
+5. Intellectual property detection
+6. Racial bias Detection
+7. Tonal Detection
+8. Toxicity detection
+9. PII
+10. Factcheck
+
+More details regarding the configuration and usage of these can be found [here](../../nemoguardrails/library/autoguard/README.md).
+
+
 ## Other

 ### Jailbreak Detection Heuristics

examples/configs/autoguard/README.md

Lines changed: 0 additions & 2 deletions
@@ -5,8 +5,6 @@ This example showcases the use of AutoGuard guardrails.
 The structure of the config folders is the following:
 - `autoguard_config` - example configuration folder for all guardrails (except factcheck)
   - `config.yml` - The config file holding all the configuration options.
-  - `prompts.yml` - The config file holding the adjustable content categories to use with AutoGuard.
 - `autoguard_factcheck_config` - example configuration folder for AutoGuard's factcheck
   - `kb` - The folder containing documents that form the knowledge base.
   - `config.yml` - The config file holding all the configuration options.
-  - `prompts.yml` - The config file holding the adjustable content categories to use with AutoGuard's factcheck endpoint.

examples/configs/autoguard/autoguard_config/config.yml

Lines changed: 2 additions & 2 deletions
@@ -67,7 +67,7 @@ rails:
           }
   input:
     flows:
-      - call autoguard input
+      - autoguard check input
   output:
     flows:
-      - call autoguard output
+      - autoguard check output

examples/configs/autoguard/autoguard_config/flows.co

Lines changed: 0 additions & 28 deletions
This file was deleted.

examples/configs/autoguard/autoguard_factcheck_config/config.yml

Lines changed: 2 additions & 2 deletions
@@ -9,7 +9,7 @@ rails:
         fact_check_endpoint: "https://nvidia.autoalign.ai/factcheck"
   input:
     flows:
-      - input autoguard factcheck
+      - autoguard factcheck input
   output:
     flows:
-      - output autoguard factcheck
+      - autoguard factcheck output

examples/configs/autoguard/autoguard_factcheck_config/flows.co

Lines changed: 0 additions & 18 deletions
This file was deleted.

nemoguardrails/library/autoguard/README.md

Lines changed: 33 additions & 34 deletions
@@ -9,11 +9,11 @@ AutoGuard comes with a library of built-in guardrails that you can easily use:
 3. [Jailbreak Detection](#jailbreak-detection)
 4. [Confidential Detection](#confidential-detection)
 5. [Intellectual property detection](#intellectual-property-detection)
-5. [Racial bias Detection](#racial-bias-detection)
-6. [Tonal Detection](#tonal-detection)
-7. [Toxicity detection](#toxicity-extraction)
-8. [PII](#pii)
-9. [Factcheck](#factcheck)
+6. [Racial bias Detection](#racial-bias-detection)
+7. [Tonal Detection](#tonal-detection)
+8. [Toxicity detection](#toxicity-extraction)
+9. [PII](#pii)
+10. [Factcheck](#factcheck-or-groundness-check)


 Note: Factcheck is implemented a bit differently, compared to other guardrails.
@@ -25,14 +25,14 @@ Please have a look at its description within this document to understand its usa
 In order to use AutoGuard's guardrails you need to set `AUTOGUARD_API_KEY` as an environment variable in your system,
 with the API key as its value.

-Please contact [[email protected]](mailto:[email protected]) for your own API key. Please mention NeMo and AutoGuard in the subject line in order to receive quick responses fron the AutoAlign team.
+Please contact [[email protected]](mailto:[email protected]) for your own API key. Please mention NeMo and AutoGuard in the subject line in order to receive quick responses from the AutoAlign team.


 ## Usage (AutoGuard)

 To use the autoguard's guardrails:

-You have to configure the guardrails in a dictionary under `guardrails_config` section, which you can provide for both `input`
+You have to configure the guardrails using the `guardrails_config` section, which you can provide for both `input`
 section and `output` sections that come under `autoguard` section in `config.yml` file:

 ```yaml
@@ -261,38 +261,38 @@ rails:
           }
   input:
     flows:
-      - call autoguard input
+      - autoguard check input
   output:
     flows:
-      - call autoguard output
+      - autoguard check output
 ```
 We also have to add the autoguard's endpoint in parameters.

-One of the advanced configs is matching score (ranging between 0 to 1) which is a threshold that determines whether the guardrail will block the input/output or not.
+One of the advanced configs is matching score (ranging between 0 and 1) which is a threshold that determines whether the guardrail will block the input/output or not.
 If the matching score is higher (i.e. close to 1) then the guardrail will be more strict.
 Some guardrails have very different format of `matching_scores` config,
 in each guardrail's description we have added an example to show how `matching_scores`
 has been implemented for that guardrail.
 PII has some more advanced config like `contextual_rules` and `enabled_types`, more details can be read in the PII section
 given below.

-**Please note that** all the additional configs such as `matching_scores`, `contextual_rules`, and `enabled_types` are optiona; if they are not specified then the default valus will be applied.
+**Please note that** all the additional configs such as `matching_scores`, `contextual_rules`, and `enabled_types` are optional; if they are not specified then the default values will be applied.

 The config for the guardrails has to be defined separately for both input and output side, as shown in the above example.


-The colang file has to be in the following format:
+The colang file has been implemented in the following format in the library:

 ```colang
-define flow call autoguard input
+define flow autoguard check input
   $input_result = execute autoguard_input_api(show_autoguard_message=True)

   if $input_result["guardrails_triggered"]
     $autoguard_input_response = $input_result['combined_response']
     bot refuse to respond
     stop

-define flow call autoguard output
+define flow autoguard check output
   $pre_rail_bot_message = $bot_message
   $output_result = execute autoguard_output_api(show_autoguard_message=True)
@@ -305,7 +305,6 @@ define flow call autoguard output
       bot respond pii output
       stop

-
 define bot respond pii output
   "$pii_message_output"
@@ -332,6 +331,8 @@ Now coming to the additional keys, one of the key `guardrails_triggered` whose v
 us whether any guardrail apart from PII got triggered or not. Another key is `combined_response` whose value
 provides a combined guardrail message for all the guardrails that got triggered.

+Users can create their own flows and make use of AutoGuard's guardrails by using the actions
+`execute autoguard_input_api` and `execute autoguard_output_api` in their flow.

 ### Gender bias detection
@@ -566,40 +567,38 @@ rails:
         fact_check_endpoint: "https://nvidia.autoalign.ai/factcheck"
   input:
     flows:
-      - input autoguard factcheck
+      - autoguard factcheck input
   output:
     flows:
-      - output autoguard factcheck
+      - autoguard factcheck output
 ```

 Specify the factcheck endpoint the parameters section of autoguard's config.
 Then, you have to call the corresponding subflows for input and output factcheck guardrails.

 Following is the format of the colang file:
 ```colang
-define subflow input autoguard factcheck
-  execute autoguard_retrieve_relevant_chunks
-  $input_result = execute autoguard_factcheck_input_api
-  if $input_result < 0.5
-    bot inform autoguard factcheck input violation
-    stop
-
-define subflow output autoguard factcheck
-  execute autoguard_retrieve_relevant_chunks
-  $output_result = execute autoguard_factcheck_output_api
-  if $output_result < 0.5
-    bot inform autoguard factcheck output violation
-    stop
+define flow autoguard factcheck input
+  execute autoguard_retrieve_relevant_chunks_input
+  $input_result = execute autoguard_factcheck_input_api
+
+define flow autoguard factcheck output
+  $output_result = execute autoguard_factcheck_output_api
+  if $input_result < 0.5
+    bot inform autoguard factcheck input violation
+  if $output_result < 0.5
+    bot inform autoguard factcheck output violation
+    stop

 define bot inform autoguard factcheck input violation
-  "Factcheck input violation has been detected by AutoGuard."
+  "Factcheck input violation has been detected by AutoGuard."

 define bot inform autoguard factcheck output violation
-  "$bot_message Factcheck output violation has been detected by AutoGuard."
+  "$bot_message Factcheck output violation has been detected by AutoGuard."
 ```

-Within the subflow you have to execute a custom relevant chunk extraction action `autoguard_retrieve_relevant_chunks`,
-so that the documents are passed in the context for the guardrail.
+Within the flow you can see we have an action for custom relevant chunk extraction, `autoguard_retrieve_relevant_chunks_input`,
+which ensures that the documents are passed in the context for the guardrail while using it for user input.

 The output of the factcheck endpoint provides you with a factcheck score against which we can add a threshold which determines whether the given output is factually correct or not.
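
Review note (not part of the commit): the README text in this hunk says the factcheck score is compared against a threshold. A minimal Python sketch of that comparison follows, assuming the 0.5 cut-off used in the flow and assuming both scores are floats where lower means less factual. `factcheck_violations`, `FACTCHECK_THRESHOLD`, and the list return shape are hypothetical names for illustration, not library API.

```python
from typing import List, Optional

# 0.5 mirrors the literal used in the Colang flow; treat it as a tunable assumption.
FACTCHECK_THRESHOLD = 0.5


def factcheck_violations(
    input_score: Optional[float],
    output_score: Optional[float],
    threshold: float = FACTCHECK_THRESHOLD,
) -> List[str]:
    """Return which sides ("input", "output") scored below the threshold."""
    violations = []
    # A missing score (None) is treated as "not checked", not as a violation.
    if input_score is not None and input_score < threshold:
        violations.append("input")
    if output_score is not None and output_score < threshold:
        violations.append("output")
    return violations
```

The comparison is strict, so a score exactly at the threshold passes; whether that matches the intended behaviour of the endpoint is an open question for the authors.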

nemoguardrails/library/autoguard/actions.py

Lines changed: 8 additions & 4 deletions
@@ -138,7 +138,7 @@ async def autoguard_infer(
         raise ValueError("AUTOGUARD_API_KEY environment variable not set.")

     headers = {"x-api-key": api_key}
-    config = DEFAULT_CONFIG
+    config = DEFAULT_CONFIG.copy()
     # enable the select guardrail
     for task in task_config.keys():
         if task != "factcheck":
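
Review note (not part of the commit): `dict.copy()` is a shallow copy. If `DEFAULT_CONFIG` holds nested dicts (an assumption, since its definition lies outside this hunk), per-task changes made through the copy could still reach the module-level default. The sketch below uses a made-up `DEFAULT_CONFIG` and a hypothetical helper `enable_tasks` to show how `.copy()` and `copy.deepcopy` differ in that situation.

```python
import copy

# Hypothetical stand-in for the module-level default; the real structure is not shown in this diff.
DEFAULT_CONFIG = {"pii_fast": {"mode": "OFF"}, "toxicity_detection": {"mode": "OFF"}}


def enable_tasks(base: dict, tasks: list, deep: bool) -> dict:
    """Return a per-request config with the given tasks switched to DETECT."""
    config = copy.deepcopy(base) if deep else base.copy()
    for task in tasks:
        config[task]["mode"] = "DETECT"  # mutates a nested dict
    return config


shallow = enable_tasks(DEFAULT_CONFIG, ["pii_fast"], deep=False)
leaked = DEFAULT_CONFIG["pii_fast"]["mode"]  # the nested dict is shared with the default

DEFAULT_CONFIG["pii_fast"]["mode"] = "OFF"  # reset the stand-in before the second run
deep = enable_tasks(DEFAULT_CONFIG, ["pii_fast"], deep=True)
isolated = DEFAULT_CONFIG["pii_fast"]["mode"]  # the default is left untouched
```

If the real default is flat, `.copy()` is sufficient; if it is nested and mutated in place, `copy.deepcopy(DEFAULT_CONFIG)` may be the safer choice.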
@@ -321,13 +321,17 @@ async def autoguard_factcheck_output_api(
         raise ValueError("Provide relevant documents in proper format")


-@action(name="autoguard_retrieve_relevant_chunks")
-async def autoguard_retrieve_relevant_chunks(
+@action(name="autoguard_retrieve_relevant_chunks_input")
+async def autoguard_retrieve_relevant_chunks_input(
+    context: Optional[dict] = None,
     kb: Optional[KnowledgeBase] = None,
 ):
     """Retrieve knowledge chunks from knowledge base and update the context."""
+    user_message = context.get("user_message")
     context_updates = {}
-    chunks = [chunk["body"] for chunk in kb.chunks]
+    chunks = await kb.search_relevant_chunks(user_message)
+    chunks = [chunk["body"] for chunk in chunks]
+    # 💡 Store the chunks for fact-checking

     context_updates["relevant_chunks"] = "\n".join(chunks)
     context_updates["relevant_chunks_sep"] = chunks
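
Review note (not part of the commit): `context` defaults to `None`, so `context.get("user_message")` would raise `AttributeError` if the action were ever invoked without a context, and `kb` has the same shape of problem. The runtime may always inject both, in which case the guards below are only a precaution. This is a sketch under stated assumptions: `StubKnowledgeBase` is a hypothetical stand-in so the snippet runs without the library, the return value is simplified to a plain dict, and only the `search_relevant_chunks` coroutine and the `body` key are taken from the diff.

```python
import asyncio
from typing import List, Optional


class StubKnowledgeBase:
    """Hypothetical stand-in for the library's KnowledgeBase (illustration only)."""

    def __init__(self, bodies: List[str]):
        self._chunks = [{"body": body} for body in bodies]

    async def search_relevant_chunks(self, text: str) -> List[dict]:
        # Naive relevance: keep chunks sharing at least one word with the query.
        words = set(text.lower().split())
        return [c for c in self._chunks if words & set(c["body"].lower().split())]


async def retrieve_relevant_chunks_input(
    context: Optional[dict] = None,
    kb: Optional[StubKnowledgeBase] = None,
) -> dict:
    """Return context updates; fall back to empty values when inputs are missing."""
    user_message = (context or {}).get("user_message")
    chunks: List[str] = []
    if kb is not None and user_message:
        found = await kb.search_relevant_chunks(user_message)
        chunks = [chunk["body"] for chunk in found]
    return {"relevant_chunks": "\n".join(chunks), "relevant_chunks_sep": chunks}


kb = StubKnowledgeBase(["Paris is the capital of France.", "Cats are mammals."])
updates = asyncio.run(
    retrieve_relevant_chunks_input({"user_message": "capital of France"}, kb)
)
empty = asyncio.run(retrieve_relevant_chunks_input(None, kb))
```

Returning empty values instead of raising is one design choice; raising a descriptive `ValueError`, as the neighbouring actions in this file do, would be equally defensible.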
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+define flow autoguard check input
+  $input_result = execute autoguard_input_api(show_autoguard_message=True)
+
+  if $input_result["guardrails_triggered"]
+    $autoguard_input_response = $input_result['combined_response']
+    bot refuse to respond
+    stop
+
+define flow autoguard check output
+  $pre_rail_bot_message = $bot_message
+  $output_result = execute autoguard_output_api(show_autoguard_message=True)
+
+  if $output_result["guardrails_triggered"]
+    bot refuse to respond
+    stop
+  else
+    $pii_message_output = $output_result["pii_fast"]["response"]
+    if $output_result["pii_fast"]["guarded"]
+      bot respond pii output
+      stop
+
+define flow autoguard factcheck input
+  execute autoguard_retrieve_relevant_chunks_input
+  $input_result = execute autoguard_factcheck_input_api
+
+define flow autoguard factcheck output
+  $output_result = execute autoguard_factcheck_output_api
+  if $input_result < 0.5
+    bot inform autoguard factcheck input violation
+  if $output_result < 0.5
+    bot inform autoguard factcheck output violation
+    stop
+
+define bot inform autoguard factcheck input violation
+  "Factcheck violation in user input has been detected by AutoGuard."
+
+define bot inform autoguard factcheck output violation
+  "Factcheck violation in llm response has been detected by AutoGuard."
+
+define bot respond pii output
+  "$pii_message_output"
+
+define bot refuse to respond
+  "I'm sorry I can't respond."

tests/test_configs/autoguard/autoguard.co

Lines changed: 0 additions & 27 deletions
This file was deleted.
