
Custom Rules

A custom rule is a user-defined rule used to validate or search for specific patterns within text data. It enables teams to block inputs/outputs that match their customized patterns.

The Shield Approach

Shield supports two types of custom rules.

  1. Regex rules: Detect text patterns using regular expressions (regex) to protect against custom types of risk, such as exposed API keys or internal account IDs.
  2. Keyword or key phrase rules: Detect specific words or phrases to identify inappropriate use of an LLM application, such as offensive language, company secrets, or competitor names.

With these two rule types, customers can deterministically block inappropriate usage of their LLM application and reduce additional risk.
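To make the difference between the two rule types concrete, here is a minimal local sketch of each matching strategy. It is not Shield's implementation: the function names are hypothetical, and the matching semantics (a regex search anywhere in the text, and case-insensitive substring matching for keywords) are assumptions made for illustration.

```python
import re


def regex_rule_matches(text, patterns):
    """Return the patterns that match anywhere in the text."""
    return [p for p in patterns if re.search(p, text)]


def keyword_rule_matches(text, keywords):
    """Return the keywords found in the text.

    Assumption: case-insensitive substring matching. Shield's actual
    keyword semantics may differ.
    """
    lowered = text.lower()
    return [k for k in keywords if k.lower() in lowered]


prompt = "My account ID is 123-45-6789, how does it compare to CompetitorCo?"

regex_hits = regex_rule_matches(prompt, [r"\d{3}-\d{2}-\d{4}"])
keyword_hits = keyword_rule_matches(prompt, ["competitorco"])

# A deterministic decision: block if any rule matched.
blocked = bool(regex_hits or keyword_hits)
```

Both checks are deterministic: the same text and the same rule list always produce the same decision, which is what makes these rules straightforward to audit.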

Requirements

Arthur Shield evaluates custom rules through either the Validate Prompt or the Validate Response endpoint. We typically recommend running common checks (such as forbidden topics) on both the prompt and the response, but in some situations you may choose to check only one of them.

Prompt | Response | Context
Custom Rules

Enabling Governance

Custom regex checks are among the most common checks teams enable on their own when implementing LLMs. One key difference we've seen with teams using Shield is stronger governance over all of the blocked patterns, keywords, and key phrases across the organization (globally or by use case).

Required Rule Configurations

There are two specific types of custom rules available, and choosing which one to use will depend on what you are trying to block.

If you are trying to block a pattern of characters whose exact value is not known in advance (e.g., an account number), you will need to create a RegexRule. To do this, provide regex strings in the regex_patterns field of the config section when calling the Create Task Rule endpoint or the Create Default Rule endpoint.

{
  "name": "Custom Regex Rule",
  "type": "RegexRule",
  "apply_to_prompt": true,
  "apply_to_response": true,
  "config": {
    "regex_patterns": [
      "\\d{3}-\\d{2}-\\d{4}",
      "\\d{5}-\\d{6}-\\d{7}"
    ]
  }
}
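Before submitting a regex rule, it can help to check the patterns locally. The sketch below parses the payload above, compiles each pattern, and tries the patterns against sample strings. It assumes Python's `re` syntax is close enough to the regex dialect Shield uses and that a match anywhere in the text counts as a hit; neither is confirmed here. The `first_match` helper is hypothetical.

```python
import json
import re

# The payload exactly as it would appear in the request body.
payload_json = r'''
{
  "name": "Custom Regex Rule",
  "type": "RegexRule",
  "apply_to_prompt": true,
  "apply_to_response": true,
  "config": {
    "regex_patterns": [
      "\\d{3}-\\d{2}-\\d{4}",
      "\\d{5}-\\d{6}-\\d{7}"
    ]
  }
}
'''

rule = json.loads(payload_json)

# JSON decoding turns the escaped "\\d" into the regex token \d.
# re.compile raises re.error if a pattern is malformed.
compiled = [re.compile(p) for p in rule["config"]["regex_patterns"]]


def first_match(text):
    """Return the first matched substring, or None if no pattern matches."""
    for pattern in compiled:
        found = pattern.search(text)
        if found:
            return found.group(0)
    return None


hit_short = first_match("ID 123-45-6789 on file")
hit_long = first_match("order 12345-678901-2345678")
miss = first_match("no identifiers here")
```

Note that the backslashes are doubled only because of JSON string escaping; the pattern the regex engine sees is `\d{3}-\d{2}-\d{4}`.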

If you are trying to block a static string (i.e., a specific word or phrase), you can create a KeywordRule. To do this, provide the list of keywords in the keywords field of the config section when calling the Create Task Rule endpoint or the Create Default Rule endpoint.

{
  "name": "Custom Keyword Rule",
  "type": "KeywordRule",
  "apply_to_prompt": true,
  "apply_to_response": true,
  "config": {
    "keywords": [
      "Blocked_Keyword_1",
      "Blocked_Keyword_2"
    ]
  }
}
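When keyword lists are maintained in code, a small builder can keep the payload consistent. The following is a sketch only: `build_keyword_rule` is a hypothetical helper, and the whitespace trimming and non-empty check are assumptions about what a sensible client would do, not documented Shield requirements. Sending the request is left out because the endpoint URL and authentication are not covered in this section.

```python
import json


def build_keyword_rule(name, keywords, apply_to_prompt=True, apply_to_response=True):
    """Build a KeywordRule payload using the fields shown above."""
    # Assumption: trim whitespace and drop empty entries before submitting.
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")
    return {
        "name": name,
        "type": "KeywordRule",
        "apply_to_prompt": apply_to_prompt,
        "apply_to_response": apply_to_response,
        "config": {"keywords": cleaned},
    }


# A rule checked on both prompts and responses.
rule = build_keyword_rule(
    "Custom Keyword Rule",
    ["Blocked_Keyword_1", " Blocked_Keyword_2 "],
)

# A rule checked on responses only.
response_only = build_keyword_rule(
    "Response Keyword Rule",
    ["Blocked_Keyword_1"],
    apply_to_prompt=False,
)

# JSON body ready to send to the rule creation endpoint.
body = json.dumps(rule, indent=2)
```

The `apply_to_prompt` and `apply_to_response` flags make it easy to express the case where a check should run on only one side of the exchange.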

For more information on how to add or enable/disable a custom rule by default or for a specific Task, please refer to our Rule Configuration Guide.