# Defender

> Protect your AI agents from prompt injection attacks by scanning API tool call responses before they reach your LLM.

When an MCP tool returns data from a third-party provider (emails, CRM records, documents), that data could contain instructions designed to hijack your agent's behavior. Defender intercepts and classifies those responses, and can block high-risk content before it reaches your LLM.

## How It Works

In the default **Both** detection mode, Defender runs a two-stage pipeline on tool call responses:

* **Tier 1 — Pattern matching**: Fast rule-based scan that checks for known prompt injection signatures and risky field patterns. Runs on every response with negligible latency.
* **Tier 2 — AI classification**: A local ML model (MiniLM) scores the content for novel or subtle attacks that pattern matching would miss. Only runs when Tier 1 identifies suspicious fields.

You can set **Detection Mode** to run only one of the two stages. Separately, responses that exceed the configured size limits are not scanned by default. Both behaviors are configurable (see [Advanced Settings](#advanced-settings) below).

```mermaid theme={null}
flowchart TD
    A[Tool call response] --> B(Tier 1: Pattern scan)
    B -->|No risky fields found| C[Allowed — passed to agent]
    B -->|Risky fields detected| D(Tier 2: AI classification)
    D -->|Low / medium risk| C
    D -->|High risk + block enabled| E[Blocked — error returned to agent]
    D -->|High risk + block disabled| F[Allowed — flagged in metadata]
```

Risk level and scan metadata are returned alongside every response so you can observe what Defender is seeing, even when not blocking.
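
The decision flow in the diagram above can be sketched in a few lines of TypeScript. This is an illustrative model only, not Defender's implementation: the `RiskLevel` and `Outcome` types, the `decide` function, and its parameters are hypothetical names chosen for this example.

```typescript
// Hypothetical types for illustration; Defender's internals are not documented here.
type RiskLevel = "low" | "medium" | "high";
type Outcome = "allowed" | "allowed_flagged" | "blocked";

/**
 * Models the default "Both" mode.
 * - riskyFieldsFound: result of the Tier 1 pattern scan
 * - runTier2: callback standing in for the Tier 2 classifier
 * - blockHighRisk: the "Block High Risk" project setting
 */
function decide(
  riskyFieldsFound: boolean,
  runTier2: () => RiskLevel,
  blockHighRisk: boolean,
): Outcome {
  // Tier 1 found nothing risky: Tier 2 is never invoked.
  if (!riskyFieldsFound) return "allowed";

  // Tier 2 runs only on responses that Tier 1 flagged.
  const risk = runTier2();
  if (risk !== "high") return "allowed";

  // High risk: block only when the setting is enabled, otherwise flag.
  return blockHighRisk ? "blocked" : "allowed_flagged";
}
```

Passing the classifier as a callback mirrors the point made above: the more expensive stage is only evaluated when the cheap stage finds something to look at.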

## Configuration

Navigate to your project in the StackOne dashboard, then open the **Defender** tab in project settings.

<Frame>
  <img src="https://mintcdn.com/stackone-60/z9OiejMmxQbgLP82/images/defender-settings.png?fit=max&auto=format&n=z9OiejMmxQbgLP82&q=85&s=8ad49362471b577ba81334ccb12575cb" alt="Defender Settings" width="780" height="625" data-path="images/defender-settings.png" />
</Frame>

<Info>
  Defender settings apply project-wide. Per-account and per-request overrides take precedence where supported.
</Info>

### Core Settings

| Setting                | Description                                                                                     | Default |
| ---------------------- | ----------------------------------------------------------------------------------------------- | ------- |
| **Defender Enabled**   | Master switch — enables scanning for this project                                               | Off     |
| **Block High Risk**    | Automatically block responses classified as high risk                                           | Off     |
| **Default Tool Rules** | Apply built-in per-tool risk rules (e.g. `gmail_*` tools are treated as higher risk by default) | Off     |

### Advanced Settings

<Frame>
  <img src="https://mintcdn.com/stackone-60/z9OiejMmxQbgLP82/images/defender-settings-advanced.png?fit=max&auto=format&n=z9OiejMmxQbgLP82&q=85&s=a443ed31a817ef8c7e8a640d1b90439a" alt="Defender Advanced Settings" width="784" height="1018" data-path="images/defender-settings-advanced.png" />
</Frame>

| Setting                     | Description                                                                                                           | Default          |
| --------------------------- | --------------------------------------------------------------------------------------------------------------------- | ---------------- |
| **Detection Mode**          | `Both` runs pattern + AI. `Pattern only` skips the ML model. `AI only` skips pattern matching. `Both` is recommended. | Both             |
| **High Risk Threshold**     | Score (0–1) above which content is classified as high risk                                                            | 0.8              |
| **Medium Risk Threshold**   | Score (0–1) above which content is classified as medium risk                                                          | 0.5              |
| **Large Response Behavior** | What to do when a response exceeds the size limits: `Skip scanning` (default), `Block the response`, or `Scan anyway` | Skip scanning    |
| **Max Response Size**       | Byte threshold that triggers large response behavior                                                                  | 1,048,576 (1 MB) |
| **Max Response Words**      | Word count threshold that triggers large response behavior                                                            | 10,000           |
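
As a rough illustration of how these settings interact, the sketch below maps a score to a risk level and checks the size limits, using the default values from the table. The names (`AdvancedSettings`, `riskLevelFor`, `exceedsLimits`) are hypothetical. Two behaviors are assumptions rather than documented facts: a score exactly equal to a threshold is treated as not exceeding it, and exceeding either size limit counts as a large response.

```typescript
type RiskLevel = "low" | "medium" | "high";

// Hypothetical settings shape; the defaults are taken from the table above.
interface AdvancedSettings {
  highRiskThreshold: number;
  mediumRiskThreshold: number;
  maxResponseBytes: number;
  maxResponseWords: number;
}

const DEFAULTS: AdvancedSettings = {
  highRiskThreshold: 0.8,
  mediumRiskThreshold: 0.5,
  maxResponseBytes: 1_048_576,
  maxResponseWords: 10_000,
};

// Classify a 0-1 score. Assumption: "above" means strictly greater than.
function riskLevelFor(score: number, s: AdvancedSettings = DEFAULTS): RiskLevel {
  if (score > s.highRiskThreshold) return "high";
  if (score > s.mediumRiskThreshold) return "medium";
  return "low";
}

// Assumption: exceeding EITHER limit triggers Large Response Behavior.
function exceedsLimits(
  sizeBytes: number,
  wordCount: number,
  s: AdvancedSettings = DEFAULTS,
): boolean {
  return sizeBytes > s.maxResponseBytes || wordCount > s.maxResponseWords;
}
```

Under these assumptions, a score of 0.65 is classified as medium, and a 2 MB response with 500 words counts as large because it exceeds the byte limit.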

## When to Use Defender

* You are building AI agents or MCP-based workflows that process third-party API responses
* Your integrations handle sensitive data such as emails, files, calendar events, or CRM records
* You want to observe risk signals on tool call responses without necessarily blocking them
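
For the observe-only case, a small logging helper might look like the sketch below. The field names `riskLevel`, `tier2Score`, and `detections` are the ones Defender returns in response metadata (see the FAQ), but their types, the `DefenderMetadata` interface, the `summarizeScan` function, and the sample tool name are assumptions made for this example. Check an actual response for the exact shape.

```typescript
// Assumed metadata shape: only the field names come from the Defender docs.
interface DefenderMetadata {
  riskLevel: "low" | "medium" | "high"; // assumed string values
  tier2Score?: number;                  // assumed absent when Tier 2 did not run
  detections: string[];                 // assumed list of detection labels
}

// Returns a log line for medium- and high-risk responses, or null for low risk.
function summarizeScan(toolName: string, meta: DefenderMetadata): string | null {
  if (meta.riskLevel === "low") return null;
  const score = meta.tier2Score !== undefined ? meta.tier2Score.toFixed(2) : "n/a";
  const labels = meta.detections.join(",");
  return `[defender] ${toolName} risk=${meta.riskLevel} score=${score} detections=${labels}`;
}

// Usage with a hypothetical tool name and hypothetical metadata values.
const line = summarizeScan("gmail_list_messages", {
  riskLevel: "high",
  tier2Score: 0.912,
  detections: ["role_override"],
});
if (line !== null) console.log(line);
```

Logging this way while **Block High Risk** is off gives you a baseline of what would have been blocked before you turn blocking on.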

## FAQ

<AccordionGroup>
  <Accordion title="Do I need to enable Defender?">
    No. It is off by default. Enable it when your agents consume third-party data and you want protection against prompt injection.
  </Accordion>

  <Accordion title="Does Defender add latency?">
    Tier 1 (pattern matching) adds negligible latency. Tier 2 (AI classification) runs only when Tier 1 identifies risky fields, and the ML model runs locally, so no external API call is made. For typical responses, the added latency is under 50 ms.
  </Accordion>

  <Accordion title="What happens when a response is blocked?">
    The tool call returns an error to your agent indicating the response was blocked. The agent can handle this like any other tool error — retry, skip, or surface it to the user.
  </Accordion>

  <Accordion title="Can I see what Defender flagged without blocking?">
    Yes. Leave **Block High Risk** disabled. Defender still scans and returns `riskLevel`, `tier2Score`, and `detections` in the response metadata, which you can inspect in your logs.
  </Accordion>

  <Accordion title="Will my data be used to train the AI model?">
    No. The classification model runs locally within StackOne's infrastructure and is never trained on your data.
  </Accordion>
</AccordionGroup>
