Content Guardrails

WorldFlow AI guardrails intercept LLM requests and responses to detect PII, moderate content, block prohibited topics, and validate output formats. Each guardrail consists of one or more rules that are evaluated against the input, the output, or both.

Concepts

| Concept | Description |
| --- | --- |
| Guard type | Where the guardrail runs: INPUT (before the LLM call), OUTPUT (after), or BOTH. |
| Category | Organizational label: PII, CONTENT_MODERATION, SAFETY, or CUSTOM. |
| Action | What happens when a rule matches: BLOCK (reject the request), REDACT (mask the matched text), WARN (add a warning header), or LOG (record silently). |
| Rule type | Matching strategy: REGEX, KEYWORD, ML_MODEL, or CUSTOM. |
| Priority | Integer that controls evaluation order. Lower values run first. |

List Guardrails

Retrieve a paginated list of guardrails with optional filters.

GET /api/v1/guardrails

Query parameters

| Parameter | Type | Description |
| --- | --- | --- |
| guardType | string | Filter by INPUT, OUTPUT, or BOTH |
| category | string | Filter by PII, CONTENT_MODERATION, SAFETY, or CUSTOM |
| enabled | boolean | Filter by enabled status |
| search | string | Free-text search in name and description |
| page | integer | Page number (1-indexed) |
| pageSize | integer | Items per page (default 20) |

Example

curl -H "Authorization: Bearer $API_KEY" \
  "https://gateway.example.com/api/v1/guardrails?category=PII&enabled=true&pageSize=10"

Response:

{
  "guardrails": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "name": "SSN Detector",
      "description": "Detects US Social Security Numbers",
      "guardType": "BOTH",
      "category": "PII",
      "enabled": true,
      "action": "REDACT",
      "priority": 1,
      "rules": [
        {
          "id": "r1a2b3c4-...",
          "guardrailId": "a1b2c3d4-...",
          "ruleType": "REGEX",
          "config": {
            "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b",
            "description": "SSN pattern"
          },
          "createdAt": "2025-01-15T10:00:00Z"
        }
      ],
      "createdBy": "user-001",
      "createdAt": "2025-01-15T10:00:00Z",
      "updatedAt": "2025-01-15T10:00:00Z"
    }
  ],
  "total": 1,
  "page": 1,
  "pageSize": 10,
  "totalPages": 1
}
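
The query parameters above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name `build_list_url` is hypothetical, the host is the placeholder used in the examples, and serializing booleans as lowercase `true`/`false` is an assumption based on the curl example.

```python
from urllib.parse import urlencode

# Placeholder host taken from the examples in this page.
BASE_URL = "https://gateway.example.com/api/v1/guardrails"


def build_list_url(base_url=BASE_URL, **filters):
    """Build a List Guardrails URL, skipping any filter whose value is None."""
    params = {}
    for key, value in filters.items():
        if value is None:
            continue
        # Assumption: booleans are sent as lowercase text, as in `enabled=true`.
        params[key] = str(value).lower() if isinstance(value, bool) else value
    query = urlencode(params)
    return f"{base_url}?{query}" if query else base_url


url = build_list_url(category="PII", enabled=True, pageSize=10)
print(url)
# https://gateway.example.com/api/v1/guardrails?category=PII&enabled=true&pageSize=10
```

To walk through all pages, a caller could increment `page` until it reaches the `totalPages` value returned in the response.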

Get a Guardrail

GET /api/v1/guardrails/{id}

Returns the full guardrail including all attached rules.

curl -H "Authorization: Bearer $API_KEY" \
  "https://gateway.example.com/api/v1/guardrails/a1b2c3d4-e5f6-7890-abcd-ef1234567890"

Error responses

| Status | Type | Description |
| --- | --- | --- |
| 404 | not_found | Guardrail does not exist |

Create a Guardrail

POST /api/v1/guardrails

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Display name |
| description | string | No | Human-readable description |
| guardType | string | No | INPUT, OUTPUT, or BOTH |
| category | string | No | PII, CONTENT_MODERATION, SAFETY, or CUSTOM |
| enabled | boolean | No | Whether the guardrail is active (default true) |
| action | string | No | BLOCK, REDACT, WARN, or LOG |
| priority | integer | No | Evaluation order (lower runs first) |
| rules | array | No | Initial rules to attach |

Each rule in the rules array requires:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| ruleType | string | Yes | REGEX, KEYWORD, ML_MODEL, or CUSTOM |
| config | object | Yes | Rule-specific configuration |

Example: PII Detector (Regex)

curl -X POST -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  "https://gateway.example.com/api/v1/guardrails" \
  -d '{
    "name": "PII Detector",
    "description": "Detects personally identifiable information",
    "guardType": "BOTH",
    "category": "PII",
    "action": "REDACT",
    "rules": [
      {
        "ruleType": "REGEX",
        "config": {
          "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b",
          "description": "SSN pattern"
        }
      },
      {
        "ruleType": "REGEX",
        "config": {
          "pattern": "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}",
          "description": "Email address"
        }
      }
    ]
  }'

Example: Content Filter (Keywords)

curl -X POST -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  "https://gateway.example.com/api/v1/guardrails" \
  -d '{
    "name": "Profanity Filter",
    "category": "CONTENT_MODERATION",
    "guardType": "BOTH",
    "action": "BLOCK",
    "rules": [
      {
        "ruleType": "KEYWORD",
        "config": {
          "keywords": ["prohibited_term_1", "prohibited_term_2"],
          "caseSensitive": false
        }
      }
    ]
  }'

Error responses

| Status | Type | Description |
| --- | --- | --- |
| 400 | validation_error | Invalid request body (missing required fields, bad enum values) |
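
Checking a payload locally before sending it can catch the two causes of `validation_error` named above: missing required fields and bad enum values. The sketch below is a hypothetical client-side helper (`validate_guardrail` is not part of the API) that mirrors only the field tables in this section; the server may enforce additional rules.

```python
# Allowed enum values, copied from the request body tables above.
GUARD_TYPES = {"INPUT", "OUTPUT", "BOTH"}
CATEGORIES = {"PII", "CONTENT_MODERATION", "SAFETY", "CUSTOM"}
ACTIONS = {"BLOCK", "REDACT", "WARN", "LOG"}
RULE_TYPES = {"REGEX", "KEYWORD", "ML_MODEL", "CUSTOM"}


def validate_guardrail(payload):
    """Return a list of problems; an empty list means no problem was found."""
    errors = []
    name = payload.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name is required")
    enum_fields = (
        ("guardType", GUARD_TYPES),
        ("category", CATEGORIES),
        ("action", ACTIONS),
    )
    for field, allowed in enum_fields:
        # Optional fields are only checked when present.
        if field in payload and payload[field] not in allowed:
            errors.append(f"{field} must be one of {sorted(allowed)}")
    for i, rule in enumerate(payload.get("rules", [])):
        if rule.get("ruleType") not in RULE_TYPES:
            errors.append(f"rules[{i}].ruleType must be one of {sorted(RULE_TYPES)}")
        if not isinstance(rule.get("config"), dict):
            errors.append(f"rules[{i}].config is required and must be an object")
    return errors


good = {
    "name": "PII Detector",
    "action": "REDACT",
    "rules": [{"ruleType": "REGEX", "config": {"pattern": r"\b\d{3}-\d{2}-\d{4}\b"}}],
}
bad = {"action": "DROP", "rules": [{"ruleType": "REGEX"}]}

print(validate_guardrail(good))  # []
for problem in validate_guardrail(bad):
    print(problem)
```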

Update a Guardrail

PUT /api/v1/guardrails/{id}

Partial updates are supported: only the fields you provide are changed. If the payload includes rules, it replaces the entire existing rules array, so send every rule you want to keep.

curl -X PUT -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  "https://gateway.example.com/api/v1/guardrails/a1b2c3d4-..." \
  -d '{
    "action": "BLOCK",
    "priority": 5
  }'
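
The update semantics can be modeled locally to predict what a PUT will leave behind. This is an illustrative Python sketch of the behavior described above, not the gateway's implementation; `apply_update` is a hypothetical name, and server-managed fields such as `id` or `updatedAt` are ignored for simplicity.

```python
import copy


def apply_update(existing, patch):
    """Return a new guardrail dict with `patch` applied as a partial update."""
    updated = copy.deepcopy(existing)
    for field, value in patch.items():
        # Each provided field overwrites the stored value.
        # "rules" is therefore replaced wholesale, never merged.
        updated[field] = copy.deepcopy(value)
    return updated


stored = {
    "name": "PII Detector",
    "action": "REDACT",
    "priority": 1,
    "rules": [
        {"ruleType": "REGEX", "config": {"description": "SSN pattern"}},
        {"ruleType": "REGEX", "config": {"description": "Email address"}},
    ],
}

# Fields that are not in the patch keep their stored values.
after_action = apply_update(stored, {"action": "BLOCK", "priority": 5})
print(after_action["action"], after_action["priority"], len(after_action["rules"]))  # BLOCK 5 2

# Sending one rule drops the other, because the array is replaced.
after_rules = apply_update(stored, {"rules": [stored["rules"][0]]})
print(len(after_rules["rules"]))  # 1
```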

Toggle Enabled Status

curl -X PUT -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  "https://gateway.example.com/api/v1/guardrails/a1b2c3d4-..." \
  -d '{"enabled": false}'

Error responses

| Status | Type | Description |
| --- | --- | --- |
| 400 | validation_error | Invalid update payload |
| 404 | not_found | Guardrail does not exist |

Delete a Guardrail

DELETE /api/v1/guardrails/{id}

Permanently removes the guardrail and all associated rules.

curl -X DELETE -H "Authorization: Bearer $API_KEY" \
  "https://gateway.example.com/api/v1/guardrails/a1b2c3d4-..."

Error responses

| Status | Type | Description |
| --- | --- | --- |
| 404 | not_found | Guardrail does not exist |

Test a Guardrail

Run a guardrail against sample text without affecting live traffic. The response tells you whether the guardrail would trigger, which action would apply, and where in the input each match was found.

POST /api/v1/guardrails/{id}/test

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| input | string | Yes | Text to test against |
| direction | string | No | INPUT or OUTPUT (defaults to INPUT) |

Example

curl -X POST -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  "https://gateway.example.com/api/v1/guardrails/a1b2c3d4-.../test" \
  -d '{
    "input": "My SSN is 123-45-6789 and my email is test@example.com",
    "direction": "INPUT"
  }'

Response:

{
  "triggered": true,
  "action": "REDACT",
  "matches": [
    {
      "ruleId": "r1a2b3c4-...",
      "ruleType": "REGEX",
      "matchedText": "123-45-6789",
      "startIndex": 10,
      "endIndex": 21,
      "confidence": 1.0
    },
    {
      "ruleId": "r2b3c4d5-...",
      "ruleType": "REGEX",
      "matchedText": "test@example.com",
      "startIndex": 38,
      "endIndex": 54,
      "confidence": 1.0
    }
  ],
  "processingTimeMs": 2
}
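
To sanity-check a REGEX rule before calling the test endpoint, the match offsets can be reproduced locally. The sketch below assumes that `startIndex` and `endIndex` are 0-based character offsets with an exclusive end, which is consistent with the SSN match in the response; it uses Python's `re` module, whose regex dialect may differ from the gateway's engine. The helper name `find_matches` is hypothetical.

```python
import re

# Same pattern as the SSN rule in the examples (JSON escaping removed).
SSN_PATTERN = r"\b\d{3}-\d{2}-\d{4}\b"


def find_matches(pattern, text):
    """Return matches shaped like the `matches` entries of the test response."""
    return [
        {"matchedText": m.group(0), "startIndex": m.start(), "endIndex": m.end()}
        for m in re.finditer(pattern, text)
    ]


text = "My SSN is 123-45-6789 and my email is test@example.com"
for match in find_matches(SSN_PATTERN, text):
    print(match)
# {'matchedText': '123-45-6789', 'startIndex': 10, 'endIndex': 21}
```

Under this convention, `text[startIndex:endIndex]` returns exactly `matchedText`.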

Error responses

| Status | Type | Description |
| --- | --- | --- |
| 404 | not_found | Guardrail does not exist |

Configuration and Tuning

Rule Types

REGEX -- Match text against a regular expression. Use config.pattern for the regex and an optional config.description label.

KEYWORD -- Match against a list of terms. Set config.keywords (array of strings) and config.caseSensitive (boolean, default false).

ML_MODEL -- Delegate to a machine-learning classifier. Set config.modelId to identify the model and config.threshold for the confidence cutoff.

CUSTOM -- Provide arbitrary JSON configuration consumed by a custom handler registered in your deployment.
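
As a rough illustration of how the KEYWORD and ML_MODEL configs above might be interpreted, here is a small Python sketch. Several details are assumptions rather than documented behavior: keywords are matched as substrings, a classifier score equal to the threshold counts as a match, and the `modelId` value is made up. The function names are hypothetical.

```python
def keyword_matches(text, config):
    """Return the configured keywords found in `text` (substring match, assumed)."""
    case_sensitive = config.get("caseSensitive", False)  # documented default is false
    haystack = text if case_sensitive else text.lower()
    found = []
    for keyword in config["keywords"]:
        needle = keyword if case_sensitive else keyword.lower()
        if needle in haystack:
            found.append(keyword)
    return found


def ml_triggers(score, config):
    """Treat a classifier score at or above config.threshold as a match (assumed)."""
    return score >= config["threshold"]


keyword_config = {
    "keywords": ["prohibited_term_1", "prohibited_term_2"],
    "caseSensitive": False,
}
sample = "This message contains PROHIBITED_TERM_1."

print(keyword_matches(sample, keyword_config))  # ['prohibited_term_1']
print(keyword_matches(sample, {**keyword_config, "caseSensitive": True}))  # []

# "example-toxicity-model" is a made-up identifier for illustration only.
ml_config = {"modelId": "example-toxicity-model", "threshold": 0.8}
print(ml_triggers(0.91, ml_config))  # True
```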

Action Selection

| Action | Behavior |
| --- | --- |
| BLOCK | Reject the entire request with a 400 response. Use for hard safety requirements. |
| REDACT | Replace matched text with a placeholder (e.g., [REDACTED]) and continue processing. Use for PII. |
| WARN | Allow the request but add an X-Guardrail-Warning response header. Use for soft policy enforcement. |
| LOG | Allow the request silently. The match is recorded in guardrail statistics for auditing. |
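
The REDACT behavior can be approximated locally to preview what downstream processing would see. This Python sketch is an assumption-laden illustration, not the gateway's algorithm: it uses the `[REDACTED]` placeholder from the table, merges overlapping or touching matches into a single placeholder, and relies on Python's `re` dialect. The name `redact` is hypothetical.

```python
import re


def redact(text, patterns, placeholder="[REDACTED]"):
    """Replace every span matched by any pattern with the placeholder."""
    spans = sorted(
        (m.start(), m.end())
        for pattern in patterns
        for m in re.finditer(pattern, text)
        if m.end() > m.start()  # ignore empty matches
    )
    # Merge overlapping or touching spans so each region is replaced once.
    merged = []
    for start, end in spans:
        if merged and merged[-1][1] >= start:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    pieces, cursor = [], 0
    for start, end in merged:
        pieces.append(text[cursor:start])
        pieces.append(placeholder)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


patterns = [
    r"\b\d{3}-\d{2}-\d{4}\b",                           # SSN pattern from the examples
    r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",  # email pattern from the examples
]
print(redact("My SSN is 123-45-6789 and my email is test@example.com", patterns))
# My SSN is [REDACTED] and my email is [REDACTED]
```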

Priority Ordering

Guardrails execute in ascending priority order. If a BLOCK guardrail triggers, no subsequent guardrails run and the request is immediately rejected. Assign the strictest guardrails the lowest priority values.
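
The ordering and short-circuit behavior described above can be sketched as a simple loop. This is a simplified Python model under stated assumptions: the `Guardrail` class and `evaluate` function are hypothetical, rule evaluation is reduced to a single callable, disabled guardrails are skipped, and ties in priority keep their input order (the source does not specify tie-breaking).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Guardrail:
    name: str
    priority: int
    action: str  # BLOCK, REDACT, WARN, or LOG
    matches: Callable[[str], bool]  # stand-in for evaluating the attached rules
    enabled: bool = True


def evaluate(guardrails: List[Guardrail], text: str) -> Tuple[bool, List[str]]:
    """Return (blocked, names of triggered guardrails), stopping at the first BLOCK."""
    triggered = []
    # Lower priority values run first; sorted() is stable for equal priorities.
    for guardrail in sorted(guardrails, key=lambda g: g.priority):
        if not guardrail.enabled or not guardrail.matches(text):
            continue
        triggered.append(guardrail.name)
        if guardrail.action == "BLOCK":
            return True, triggered  # nothing after a triggered BLOCK runs
    return False, triggered


rails = [
    Guardrail("Audit Log", 10, "LOG", lambda t: True),
    Guardrail("Profanity Filter", 1, "BLOCK", lambda t: "prohibited_term_1" in t.lower()),
    Guardrail("SSN Detector", 5, "REDACT", lambda t: "123-45-6789" in t),
]

print(evaluate(rails, "prohibited_term_1 and 123-45-6789"))
# (True, ['Profanity Filter'])
print(evaluate(rails, "My SSN is 123-45-6789"))
# (False, ['SSN Detector', 'Audit Log'])
```

In the first call the BLOCK guardrail has the lowest priority value, so the SSN and audit guardrails never run, which is why the strictest guardrails should get the lowest values.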

Best Practices

  1. Layer defenses. Combine a REGEX rule for structured PII patterns (SSN, credit card) with an ML_MODEL rule for unstructured sensitive content.
  2. Test before enabling. Use the /test endpoint to validate rules against representative inputs before setting enabled: true.
  3. Start with LOG. Deploy new guardrails with action: "LOG" first to measure false-positive rates, then escalate to WARN or BLOCK.
  4. Keep rules focused. Create separate guardrails per concern (PII, profanity, topic blocking) rather than one guardrail with many mixed rules. This makes statistics more actionable.