Zenguard node

Zenguard is a guardrail node which passes input text to the Zenguard detectors one after another and produces either a “passthrough” output or a “block” output. Some of the detectors may sanitize the message passed to “passthrough” branch. The following detectors are available:

  • Prompt Injection
  • PII (Personally Identifiable Information)
  • Allowed Topics
  • Banned Topics
  • Keywords
  • Secrets

It has conditional output handles; that is, it will redirect the flow to only one of its outputs.

You can see the order in which detectors are applied in logs. However, if you need more control over the order, then you can just chain Zenguard nodes and enable only one type of detector on each of them.

The node directs the flow to the “block” branch if any of the detectors — which are applied one after another — detect undesired behavior; if there is nothing “detected,” then it will direct the flow to “passthrough” (Read more).

Note that “block” branch, if the flow goes to this branch, always gets the unmodified input to the node regardless of how many detectors it has passed through.

This node requires ZENGUARD_API_KEY to be set in the “env variables” page.

Passthrough or block

Zenguard node lets the input text pass through all of the detectors one by one. If any of the detectors “detect” or output that we should block, the flow is immediately directed to “block” and the subsequent remaining detectors are not executed.

Even though all detectors in Zenguard are different, what matters for us in the graph execution flow is whether to let it pass through or block.

Even if some of the detectors result in “warn” or “redact” outputs, the message after redacting (if applicable) is still valid and can pass through. However, if the detector “detects” or results in “block,” then the flow definitely must be directed to the “block” branch.

Zenguard has two different types of detectors:

  1. Passthrough/Fail detectors (Prompt Injection, Allowed Topics, Banned Topics, Secrets)
  2. Detectors with the following 4 output types: Block, Redact, Warn, and Passthrough (PII, Keywords).

For different types of detectors, input goes through the following steps.

Passthrough/Fail detectors

  1. Send a request to the detector with the input.
  2. If is_detected, then pass the node’s original input to the “block” output and don’t execute the remaining detectors. Essentially, the “block” branch always gets the unmodified input to the node if the flow goes to this branch. (For the Allowed Topics detector, is_detected must be false).
  3. If not detected, then if the response’s sanitized_message is not null, it becomes the input for the next detector. Otherwise, the input for the next detector is the input to this detector, i.e., it goes unmodified. (For the Allowed Topics detector, is_detected must be true).
  • For Allowed Topics detector, is_detected must be true to go to passthrough.
  • For Secrets detector, all inputs go to passthrough. However, the output will have secrets sanitized/replaced with ****.

Block, Redact, Warn, Passthrough detectors

  1. Send a request to the detector with the input.
  2. If block list or dict is non-empty, then pass the node’s original input to the “block” output and don’t execute the remaining detectors. Essentially, the “block” branch always gets the unmodified input to the node if the flow goes to this branch.
  3. If the block list or dict is empty, then pass the sanitized_message as the input to the next detector.

Sample pipeline

Let’s consider the sample pipeline to understand more.

Say, I want to use the pipeline with Zenguard node which has Prompt Injection and Banned Topics detectors. So any input with banned topics (configured in Zenguard console) will go to “undesired_input” output node.

Additionally, I’ll add a Zenguard node with PII (Personally identifiable information) detector. I will configure it to block if there’s a credit card information. I’ll also configure it to Redact, if the name is passed. So if the credit card information is passed, the flow will go to “block” branch and end up at “blocked_by_PII” node. If the name is passed, the input text will be modified, passed to passthrough branch, and end up at “suggestions” node.

Pipeline which sends description about the person's life to LLM. Before sending, it checks for Prompt Injection and Banned Topics. Then it will filter out PII and pass redacted (if it contains name) output to LLM and get its response.

For the previous example, you can configure the console as follows:

Zenguard console configuration for the previous example.

Note that only one of 3 outputs will eventually produce the result.

Zenguard API keys and configs

To get an API key, first you’ll need to create an account at Zenguard. Then you can create an API key in their console and paste it as ZENGUARD_API_KEY env variable.

You can also check their documentation on detectors.

Some of Zenguard detectors can be configured to have customized behavior. You can do it in the Zenguard Console. They provide a convenient interface to do that.

Additionally, you can use their Playground to get a sense of how detectors work.