Agentic systems are not just chat interfaces with better UX. They combine four things that rarely sit together in ordinary software: model reasoning, external content, tool execution, and often access to internal context. That changes the risk model.
A normal assistant can say something wrong. An agentic system can say something wrong and then act on it. The useful unit to threat-model is therefore not the model alone, but the entire chain:
If you only evaluate model quality, you will miss the highest-impact failures.
A basic agentic system usually looks like this:
1. user request arrives
2. system prompt and policy are assembled
3. extra context is retrieved from internal or external sources
4. model decides whether to call a tool
5. tool returns output or performs an action
6. result is shown to the user or committed elsewhere
The trust boundaries sit between these steps.
The most important one is usually this:
> untrusted content must not gain the same authority as system instructions or approval logic.
A browser page, retrieved document, and internal policy are all concatenated into one prompt. The model is then free to choose tools based on that combined context.
System policy, user intent, and untrusted content are handled as separate logical inputs. Tool-calling logic is constrained, and high-impact actions require confirmation outside the same untrusted context path.
Start with the smallest useful question:
What can this system read, what can it do, and what can influence its decisions?
That immediately gives you the first threat-model surface.
List every input class:
Then separate them into:
The biggest failure pattern is instruction/data confusion. A page, document, or email should not quietly become a new source of authority.
Write down the real action surface, not the marketing description.
Examples:
For each action, ask:
Identify what sensitive context can be exposed to the agent:
This matters because many failures in AI systems are not dramatic model exploits. They are quiet context leaks.
The system must distinguish between:
If this line is fuzzy, prompt injection becomes a design flaw, not an edge case.
Do not give the agent broad access just because it is convenient during development.
A good rule:
> if the agent does not need a capability every day, it probably should not have it by default.
High-impact actions should not happen silently.
Examples of actions that often deserve approval:
Browser and tool-using agents should have clear limits.
Examples:
You need enough logs to answer:
Not full chain-of-thought fantasy logs—just enough operational evidence to investigate misuse, drift, or failure.
The assistant can read tickets, summarize attached logs, and update ticket state. On paper this looks harmless. In practice it means external customer content, retrieved internal notes, and write access now sit in the same workflow.
What to review:
A browser agent can navigate an authenticated admin workflow faster than a human. It can also click the wrong thing faster than a human.
What to review:
For a first-pass review, do this in order:
1. list the inputs
2. list the actions
3. mark which inputs are untrusted
4. mark which actions have irreversible or sensitive side effects
5. verify approval gates
6. verify logging and rollback
7. test one malicious or ambiguous input path
That is enough to find most embarrassing failures early.
Before an agentic workflow should be trusted in a meaningful environment, a team should be able to answer:
If a team can describe the model but cannot describe the action boundaries, the system is not ready.
The practical decision is no longer whether AI belongs in security workflows at all. It is where it creates enough leverage to justify real controls, review, and ownership.
AI risk is increasingly a system-design problem, not just a model-safety problem. If an agent can read untrusted content and take action, it needs explicit boundaries.
The useful signal today is concrete: AI risk becomes easier to manage when teams review workflows through permissions, approvals, and data boundaries instead of treating governance as a policy-only exercise.