Prompt Injection & Tool Abuse Checklist

Why this checklist exists

Prompt injection is often described as if it were a strange model trick. In practice, it is usually much simpler: the system gives external content too much influence over what the agent believes, prioritizes, or does.

Tool abuse is the next step. Once the model can choose tools, a bad decision is no longer just a bad sentence. It can become a bad action.

This checklist is meant for real systems, not abstract demos.

The principle

Do not let untrusted content sit on the same authority path as tool selection or high-impact actions.

If an agent can read untrusted input, pull in sensitive context, and use tools in the same loop, you already have a practical abuse surface.
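That combination can be expressed as a simple audit check. This is a minimal sketch; the capability names below are illustrative assumptions, not a real framework's vocabulary.

```python
# Hypothetical sketch: flag workflows that combine all three risky
# capabilities in one loop. Capability names are illustrative.
RISKY_TRIFECTA = {
    "reads_untrusted_input",
    "accesses_sensitive_context",
    "calls_tools",
}

def has_abuse_surface(capabilities):
    """Return True when a workflow holds every capability in the trifecta."""
    return RISKY_TRIFECTA <= set(capabilities)
```

A workflow that only reads untrusted input and calls tools is still risky, but the practical abuse surface described above requires all three together.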

Where teams usually get this wrong

Bad vs better design

Bad

The application concatenates user request, system text, retrieved internal data, and page content into one prompt. The model can then both interpret and act from the same mixed context.

Better

The system keeps separate authority zones for:

- system instructions
- the user's request
- retrieved internal data
- external page content

Tool permissions and approvals are enforced outside the untrusted-content path.
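The better design can be sketched in a few lines. The message shape below is an assumption, not any specific vendor API; the point is that untrusted content is wrapped and labeled as data rather than merged into instructions.

```python
# Minimal sketch of separate authority zones. Untrusted content is
# tagged so downstream policy can treat it as data to summarize,
# never as instructions to follow.
def build_context(system_text, user_request, untrusted_items):
    messages = [
        {"zone": "system", "content": system_text},  # highest authority
        {"zone": "user", "content": user_request},   # user intent
    ]
    for label, text in untrusted_items:
        messages.append({
            "zone": "untrusted",  # lowest authority, data only
            "label": label,
            "content": text,
        })
    return messages
```

Contrast this with the bad design, where all four sources land in one string and the model cannot tell which text carries authority.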

Review checklist

1. External content is treated as untrusted data

Check whether the system clearly separates:

- trusted system instructions
- the user's actual request
- untrusted external content (emails, pages, retrieved documents)

If these are merged into one blob, you are creating a prompt-injection surface by design.

2. The agent cannot silently execute sensitive actions

If the system can browse, retrieve, and act in one chain, the highest-risk actions should be gated.

At minimum, review:

- which actions can run without a human in the loop
- whether sends, writes, deletes, and payments require explicit confirmation
- whether untrusted content can trigger any of those actions in a single chain
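A gate on high-impact actions can be sketched as follows. The tool names and the `approve` callback are hypothetical; the pattern is what matters: high-impact actions cannot fire silently.

```python
# Hedged sketch of an approval gate for sensitive tool calls.
HIGH_IMPACT = {"send_reply", "delete_record", "submit_form"}  # assumed names

def run_tool(name, args, approve):
    """Execute a tool call, requiring explicit approval for high-impact actions."""
    if name in HIGH_IMPACT and not approve(name, args):
        # Blocked calls are returned, not raised, so the agent loop
        # can surface them for human review.
        return {"status": "blocked", "tool": name, "reason": "approval required"}
    return {"status": "executed", "tool": name}
```

For example, `run_tool("send_reply", {}, approve=lambda n, a: False)` comes back blocked, while read-only tools pass through untouched.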

3. Tool permissions are scoped to the minimum needed

Avoid broad tool exposure.

A useful habit is to classify tools into:

- read-only tools
- draft or propose tools
- acting tools that change external state

Then remove anything the workflow does not actually need.
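The classify-then-prune habit can be captured in a small registry. The tool names and tiers below are illustrative assumptions.

```python
# Illustrative tool registry grouped by tier.
TOOL_TIERS = {
    "read_page": "read_only",
    "search_docs": "read_only",
    "draft_reply": "draft",
    "send_reply": "acting",
    "delete_record": "acting",
}

def scope_tools(workflow_needs):
    """Expose only the tools a workflow actually needs; everything else is removed."""
    return {name: TOOL_TIERS[name] for name in workflow_needs if name in TOOL_TIERS}
```

A summarize-only workflow, for instance, should end up with no acting-tier tools at all.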

4. Retrieved internal data is bounded

Ask:

- what the agent can retrieve, and from where
- whether retrieval is scoped to the current task
- how much sensitive context can end up alongside untrusted input

Prompt injection gets worse when sensitive context is freely retrievable.
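Bounded retrieval can be enforced before any query runs. The workflow and collection names here are assumptions; the pattern is a per-workflow allowlist plus a cap on how much context comes back.

```python
# Sketch of bounded retrieval: each workflow may only read from
# explicitly allowlisted collections, and only a capped number of docs.
ALLOWED = {
    "summarizer": {"public_docs"},
    "support_agent": {"public_docs", "runbooks"},
}

def retrieve(workflow, collection, fetch, max_docs=3):
    """Fetch documents only from collections this workflow is allowed to read."""
    if collection not in ALLOWED.get(workflow, set()):
        raise PermissionError(f"{workflow!r} may not read {collection!r}")
    return fetch(collection)[:max_docs]  # cap how much context comes back
```

The cap matters as much as the allowlist: the less sensitive context sits next to untrusted input, the less an injection can steal.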

5. Browser or web-agent actions are domain-scoped where possible

If a browser agent can navigate anywhere, click, authenticate, and submit, the attack surface grows quickly.

Prefer explicit constraints on:

- which domains the agent may navigate to
- which page actions it may take (click, authenticate, submit)
- which sessions and credentials it can use
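A domain allowlist is the simplest of these constraints to sketch. The domains below are placeholders; note that the subdomain check uses a dot-prefixed suffix so that `docs.example.com.evil.com` does not slip through.

```python
from urllib.parse import urlparse

# Example allowlist; domains are placeholders.
ALLOWED_DOMAINS = {"docs.example.com", "status.example.com"}

def is_navigable(url):
    """Allow navigation only to allowlisted hosts and their subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host in ALLOWED_DOMAINS or any(
        host.endswith("." + d) for d in ALLOWED_DOMAINS
    )
```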

6. Logs are useful without leaking secrets

You need evidence about tool selection and actions, but not raw secrets in logs.

A good log helps answer:

- which tool was called, with what arguments
- which part of the context drove that decision
- whether untrusted content was in scope when the action fired
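One way to get that evidence without leaking secrets is to redact by key name at record-creation time rather than filtering logs later. The key names below are assumptions; extend the set for your stack.

```python
# Sketch of a log record that captures tool decisions without raw secrets.
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization", "cookie"}

def log_record(tool, args, decision_source):
    return {
        "tool": tool,
        # Secrets are redacted before the record exists, not scrubbed later.
        "args": {
            k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else v
            for k, v in args.items()
        },
        # Which context zone drove the call: "system", "user", or "untrusted".
        "decision_source": decision_source,
    }
```

A `decision_source` of `"untrusted"` on an acting tool is exactly the signal an incident review needs.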

Mini-scenarios

Scenario 1: Summarize-then-send workflow

A team builds an assistant that reads inbound emails, drafts summaries, and can optionally send replies.

What to test:

- whether instructions embedded in an inbound email can change the agent's behavior
- whether email content can trigger a reply without human approval
- whether email content can redirect who a reply goes to
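A test for this workflow can assert directly on the agent's proposed actions. The action shape here is hypothetical: each proposed action records a tool name and an approval flag.

```python
# Test-helper sketch for the summarize-then-send workflow.
def unauthorized_sends(proposed_actions):
    """Return send actions that lack explicit human approval."""
    return [
        a for a in proposed_actions
        if a["tool"] == "send_reply" and not a.get("approved", False)
    ]
```

Run the workflow over an email containing injected instructions, feed the resulting action trace into this helper, and require the list to come back empty.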

Scenario 2: Browser assistant with internal document access

The browser agent reads external pages while also having access to internal docs or authenticated sessions.

What to test:

- whether external page content can trigger reads of internal documents
- whether page content can drive actions inside an authenticated session
- whether internal data can leak into outbound requests or form submissions
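These checks can run over the agent's action trace after a test browse. The trace shape is an assumption: each entry records the tool called and the source of the content most recently added to context.

```python
# Trace-analysis sketch for the browser-agent scenario.
def cross_boundary_actions(trace):
    """Flag sensitive actions taken right after untrusted page content
    entered the agent's context."""
    sensitive = {"read_internal_doc", "submit_form", "authenticate"}
    flagged = []
    for prev, cur in zip(trace, trace[1:]):
        if prev.get("source") == "external_page" and cur["tool"] in sensitive:
            flagged.append(cur)
    return flagged
```

Any flagged entry is a candidate injection path: external content went in, and a sensitive action came out.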

Minimum viable standard

If a workflow reads untrusted content and can call tools, you need:

- separated authority zones for trusted and untrusted content
- approval gates on high-impact actions
- tool permissions scoped to the minimum needed
- bounded retrieval of sensitive data
- logs that show which tool ran, and why

Bottom line

Anything less is not “an early version.” It is optimistic automation with unclear boundaries.
