Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.stackone.com/llms.txt

Use this file to discover all available pages before exploring further.

Tool Defense protects your AI agents from prompt injection attacks by scanning tool call responses before they reach your LLM. Configure scanning behavior per-toolset from the SDK, or defer to your project’s dashboard setting.
The Problem: AI agents that call third-party APIs consume data they didn’t author: emails, CRM notes, calendar events, support tickets, documents. That data can contain content designed to hijack the agent’s behavior, such as hidden instructions, role markers, encoded payloads, or jailbreak phrases.
  • Indirect prompt injection: an attacker plants instructions in a document or message the agent will eventually read
  • Cross-account leaks: a malicious record in one tool can attempt to steer the agent toward exfiltrating data from another
  • Silent failures: without scanning, you never see the attack, only the agent’s surprising downstream behavior
The Solution: Tool Defense scans every tool call response, classifies risk, and surfaces annotations so you can observe, block, or both. Configuration is per-toolset, so different parts of your application can opt in or out independently of project-wide settings.

Key Features

Per-Toolset Configuration

Different toolsets can opt in or out of scanning independently, overriding your project’s dashboard setting per construction.

Observable by Default

Risk level, sanitized fields, and detection patterns come back in the response metadata even when not blocking.

Runtime Override Warning

The SDK warns once per process when it overrides your dashboard setting, so the choice is visible in your logs.

Mode Introspection

The defenderMode getter exposes the resolved mode (project / disabled / explicit) for tests and observability.

Quick Example

import { StackOneToolSet, DEFAULT_DEFENDER_CONFIG } from '@stackone/ai';

// 1. Default: defer to your project's dashboard defender setting
const dashboardToolset = new StackOneToolSet();

// 2. Explicit opt-in with safe defaults
const scanningToolset = new StackOneToolSet({
  defender: { ...DEFAULT_DEFENDER_CONFIG },
});

// 3. Block on HIGH or CRITICAL risk
const strictToolset = new StackOneToolSet({
  defender: { ...DEFAULT_DEFENDER_CONFIG, blockHighRisk: true },
});

// 4. Forcibly disabled, overrides the dashboard
const offToolset = new StackOneToolSet({ defender: null });

Inspecting the Resolved Mode

Use the defenderMode getter to assert how a toolset will behave:
const toolset = new StackOneToolSet({ defender: null });

toolset.defenderMode; // 'project' | 'disabled' | 'explicit'
When the SDK overrides your project dashboard (modes disabled or explicit), it emits a yellow console.warn once per process per distinct override shape, so the override is visible at runtime without flooding your logs. Set NO_COLOR=1 to suppress color, or FORCE_COLOR=1 to force it when piping output.

Reading the Response

When defender runs, the RPC response includes a defenderMetadata sibling next to data:
const result = await tool.execute({ body: {} });

const metadata = (result as { defenderMetadata?: unknown }).defenderMetadata;
// {
//   applied: true,
//   result: {
//     allowed: true,                                          // false → blocked when blockHighRisk
//     riskLevel: 'low' | 'medium' | 'high' | 'critical',
//     fieldsSanitized: string[],
//     patternsByField: Record<string, string[]>,
//     detections: unknown[],
//     latencyMs: number,
//   },
// }

Next Steps

  • Tool Defense for the full API reference: all four modes, defenderMetadata shape, override warning behavior
  • Defender (platform guide) for dashboard configuration, detection pipeline, and risk thresholds
  • Basic Usage for fetching and executing tools