LLM Guardrails

LLM guardrails are controls that constrain a language model's inputs and outputs — filtering prompts, validating responses, and blocking unsafe content — to keep generated text within acceptable bounds.

What guardrails do

Guardrails typically operate on text: they screen incoming prompts for injection attempts, validate that outputs match an expected schema, and filter responses for toxic, off-topic, or policy-violating content. They are an important layer for chat and content-generation systems, where the risk is what the model says. Tools in this category validate the words; they generally do not govern the actions those words trigger.
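The three text-level checks described above can be sketched roughly as follows. This is a minimal illustration, not any particular guardrail product's API: the regex patterns, the blocked-term list, the expected `answer`/`confidence` schema, and the function names are all assumptions made up for the example, and production systems usually rely on trained classifiers rather than keyword matching.

```python
import json
import re
from typing import Optional

# Hypothetical injection patterns, for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
]

# Hypothetical placeholder for a content policy list.
BLOCKED_TERMS = {"blocked-term-example"}

# Assumed output schema: a JSON object with these keys and types.
REQUIRED_KEYS = {"answer": str, "confidence": float}


def screen_prompt(prompt: str) -> bool:
    """Input guardrail: return True if no injection pattern matches."""
    return not any(p.search(prompt) for p in INJECTION_PATTERNS)


def validate_output(raw: str) -> Optional[dict]:
    """Schema guardrail: return the parsed object, or None if it is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key, expected_type in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data


def filter_response(text: str) -> bool:
    """Content guardrail: return True if no blocked term appears."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)
```

Every function here takes a string and returns a verdict about that string. None of them sees what a tool call would do if it were executed, which is the limitation the text describes.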

Guardrails versus action governance

When an LLM is wired to tools and can take real-world actions, text validation alone is insufficient. A response can be perfectly well-formed and still command a destructive operation. Action governance (evaluating the tool call, the data accessed, and the side effect before anything executes) operates at a different layer from output guardrails. Cordum, an agent control plane, complements guardrails: guardrails keep the language safe, while the control plane keeps the actions safe. Most production agent stacks need both.
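To make the layering concrete, here is a small sketch of a policy check that runs before a tool call is dispatched. It is a generic illustration and does not represent Cordum's actual API or policy format: the `ToolCall` type, the `Decision` values, the tool names, and the default-deny rule are all hypothetical choices for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


@dataclass(frozen=True)
class ToolCall:
    """A structured action proposed by the agent (hypothetical shape)."""
    tool: str
    args: dict = field(default_factory=dict)


# Hypothetical policy tables; a real system would load these from configuration.
READ_ONLY_TOOLS = {"search_docs", "get_ticket"}
APPROVAL_TOOLS = {"send_email", "refund_payment"}


def evaluate(call: ToolCall) -> Decision:
    """Decide on the action itself, independent of how the text was phrased."""
    if call.tool in READ_ONLY_TOOLS:
        return Decision.ALLOW
    if call.tool in APPROVAL_TOOLS:
        return Decision.REQUIRE_APPROVAL
    return Decision.DENY  # assumed default: unknown tools are denied


def dispatch(call: ToolCall, tools: Dict[str, Callable[..., str]]) -> str:
    """Run the tool only if policy allows it; otherwise report the decision."""
    decision = evaluate(call)
    if decision is not Decision.ALLOW:
        return f"blocked: {decision.value}"
    return tools[call.tool](**call.args)
```

In this sketch a call such as `ToolCall("drop_table", {"name": "users"})` is valid, well-formed data that an output schema check would accept, yet `dispatch` refuses to run it because the decision is made on the action rather than on the wording.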

Frequently asked questions

Are LLM guardrails enough to secure an AI agent?

Not on their own. Guardrails validate text, but an agent's risk comes from the actions it takes — tool calls, data access, side effects. Securing an agent also requires governing those actions before they execute, which is a separate layer from output validation.

How do guardrails and an agent control plane fit together?

They sit at different layers. Guardrails constrain what the model says; the control plane constrains what the agent does. Run guardrails for content safety and a control plane for action governance — they are complementary, not interchangeable.

Govern your AI agents with Cordum

Cordum is the agent control plane: policy-before-dispatch enforcement, human approvals, and a tamper-evident audit trail for autonomous AI agents.