> ## Documentation Index > Fetch the complete documentation index at: https://docs.stackone.com/llms.txt > Use this file to discover all available pages before exploring further. # Defender > Protect your AI agents from prompt injection attacks by scanning API tool call responses before they reach your LLM. Defender protects your AI agents from prompt injection attacks by scanning API tool call responses before they reach your LLM. When an MCP tool returns data from a third-party provider — emails, CRM records, documents — that data could contain instructions designed to hijack your agent's behavior. Defender intercepts and classifies those responses, and can block high-risk content before it causes harm. ## How It Works In the default **Both** detection mode, Defender runs a two-stage pipeline on tool call responses: * **Tier 1 — Pattern matching**: Fast rule-based scan that checks for known prompt injection signatures and risky field patterns. Runs on every response with negligible latency. * **Tier 2 — AI classification**: A local ML model (MiniLM) scores the content for novel or subtle attacks that pattern matching would miss. Runs in parallel with Tier 1 on every response, scanning the SFE-filtered payload (or the `tier2Fields` subset when configured). You can configure Detection Mode to run only one stage, or skip scanning entirely for responses that exceed the configured size limits (see [Advanced Settings](#advanced-settings) below). ```mermaid theme={null} flowchart TD A[Tool call response] --> B(Tier 1: Pattern scan + sanitize) B --> D(Tier 2: ML classification) B --> R{Combined risk level} D --> R R -->|Low / medium| C[Allowed — passed to agent] R -->|High + block enabled| E[Blocked — error returned to agent] R -->|High + block disabled| F[Allowed — flagged in metadata] ``` Risk level and scan metadata are returned alongside every response so you can observe what Defender is seeing, even when not blocking. ## Configuration Navigate to your project in the StackOne dashboard, then open the **Defender** tab in project settings.

Defender settings apply project-wide. Per-account and per-request overrides take precedence where supported. ### Core Settings | Setting | Description | Default | | --------------------------- | ----------------------------------------------------------------- | ----------------- | | **Status** | Master switch — enables scanning for this project | On (new projects) | | **Block High Risk Content** | Automatically block responses classified as high or critical risk | Off | ### Advanced Settings Defender Advanced Settings — classification

Defender Advanced Settings — classification

Defender Advanced Settings — response handling

| Setting | Description | Default | | ---------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------- | | **Detection Mode** | `Both` runs pattern + AI. `Pattern only` skips the ML model. `AI only` skips pattern matching. `Both` is recommended. | Both | | **High Risk Threshold** | Score (0–1) above which content is classified as high risk | 0.8 | | **Medium Risk Threshold** | Score (0–1) above which content is classified as medium risk | 0.5 | | **Large Response Behavior** | What to do when a response exceeds the size limits: `Skip scanning` (default), `Block the response`, or `Scan anyway` | Skip scanning | | **Max Response Size** | Byte threshold that triggers large response behavior | 1,048,576 (1 MB) | | **Max Response Words** | Word count threshold that triggers large response behavior | 10,000 | | **Annotate Tool Results** | Wrap sanitized tool results with boundary tags such as `[UD-abc123]...[/UD-abc123]` (where the suffix is a random per-response ID) so downstream prompts can reason about data boundaries. Pair this with `generateBoundaryInstructions()` from `@stackone/defender` in your system prompt when enabled. | Off | | **Semantic Field Extractor** | Skip metadata and identifier fields (UUIDs, timestamps, URLs, etc.) before classification to reduce latency and false positives. Recommended. | On | ## When to Use Defender * You are building AI agents or MCP-based workflows that process third-party API responses * Your integrations handle sensitive data such as emails, files, calendar events, or CRM records * You want to observe risk signals on tool call responses without necessarily blocking them ## SDK Configuration If you're using the Node.js SDK, you can configure Defender per-toolset directly from your code — override your project's dashboard setting, opt in with safe defaults, or forcibly disable for trusted internal flows. See [Tool Defense 101](/agents/typescript/tool-defense-101) for the SDK API. ## FAQ No. It is off by default. Enable it when your agents consume third-party data and you want protection against prompt injection. Tier 1 (pattern matching) adds negligible latency. Tier 2 (AI classification) runs in parallel on every response — the ML model runs locally, so there is no external API call. The Semantic Field Extractor preprocessor trims metadata/identifier fields before Tier 2 to keep latency low; for typical responses the added latency is under 100ms. The tool call returns an error to your agent indicating the response was blocked. The agent can handle this like any other tool error — retry, skip, or surface it to the user. Yes. Leave **Block High Risk Content** disabled. Defender still scans and returns `riskLevel`, `tier2Score`, and `detections` in the response metadata, which you can inspect in your logs. No. The classification model runs locally within StackOne's infrastructure and is never trained on your data.