Prompt Injection Is Well Understood
Prompt injection attacks involve malicious instructions embedded in inputs or retrieved content that cause an LLM to deviate from its intended behaviour. The attack is well documented, widely discussed, and the subject of extensive tooling investment. OWASP's LLM Top 10 lists it first. Security vendors have built entire product lines around detecting and blocking it.
This attention is warranted. Prompt injection is a real risk. But the volume of attention it receives has created an asymmetry: enterprise security teams are investing heavily in the threat that is easiest to discuss while underinvesting in the threat that is hardest to see.
The Harder Problem: What the Agent Can Reach
Consider an internal AI assistant deployed for a financial services firm. It has access to the firm's document management system, CRM, email, and internal knowledge base. A user with standard access permissions asks a question about client onboarding procedures.
The LLM processes the query legitimately. No prompt injection occurs. Input guardrails pass. Output scanning finds nothing concerning. And yet the response includes non-public information about a specific client relationship, pulled from the CRM because the tool call that fetched the onboarding procedures was scoped too broadly.
No attack occurred. The model behaved exactly as designed. The exposure happened because the access boundary was wrong, and there was no system to enforce it at the retrieval layer.
This is the category of risk that most enterprise LLM security stacks cannot see. It does not trigger injection detectors. It does not produce anomalous outputs that scanning tools would flag. It is not a failure of the model. It is a failure of access governance.
The Three Retrieval Risks That Get the Least Attention
Data over-retrieval is the most common form. An agent with broad tool access retrieves more information than the task requires, and that information enters the context window, where it can influence subsequent responses or be inadvertently surfaced to the user.
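One mitigation is to trim tool-call results to the fields a task actually needs before they ever reach the context window. The sketch below illustrates the idea; the task names, field allowances, and record shapes are all hypothetical, not a reference implementation.

```python
# Least-privilege retrieval sketch: drop every field outside the
# task's allowance so surplus data never enters the context window.

ALLOWED_FIELDS = {
    # task type -> fields the task is permitted to see (illustrative)
    "onboarding_procedures": {"doc_id", "title", "procedure_text"},
}

def scoped_retrieve(task, records):
    """Return records with any field outside the task's allowance removed."""
    allowed = ALLOWED_FIELDS.get(task, set())
    return [
        {k: v for k, v in rec.items() if k in allowed}
        for rec in records
    ]

# A raw CRM record carries non-public client fields alongside the
# procedure text the task actually needs.
raw = [
    {"doc_id": 1, "title": "KYC checklist", "procedure_text": "...",
     "client_name": "Acme Capital", "fee_schedule": "non-public"},
]
trimmed = scoped_retrieve("onboarding_procedures", raw)
# client_name and fee_schedule are stripped before the model sees them
```

The point of enforcing this at the retrieval layer, rather than relying on the model to ignore surplus fields, is that the model never has the opportunity to surface what it never received.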
Context accumulation happens in long agentic sessions. An agent that makes dozens of tool calls across a session progressively builds up a picture of the organisation's internal state that no single query would have produced. Individually, each retrieval looks legitimate. Cumulatively, the agent has effectively constructed a detailed internal operations manual from fragmented queries.
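Because each individual call looks legitimate, this risk has to be policed at the session level rather than per request. One way is an accumulation budget: count the distinct sensitive resources a session has touched and trip once a threshold is crossed. The class, labels, and budget below are illustrative assumptions.

```python
# Accumulation guard sketch: every call may be individually permitted,
# but the session tracks distinct sensitive resources and trips once
# a budget is exceeded.

class AccumulationGuard:
    def __init__(self, budget=3):
        self.budget = budget
        self.sensitive_seen = set()

    def record(self, resource_id, sensitivity):
        """Log a retrieval; return False once the session exceeds budget."""
        if sensitivity == "sensitive":
            self.sensitive_seen.add(resource_id)
        return len(self.sensitive_seen) <= self.budget

guard = AccumulationGuard(budget=3)
calls = [("hr/org-chart", "sensitive"), ("crm/pipeline", "sensitive"),
         ("wiki/style-guide", "public"), ("fin/forecast", "sensitive"),
         ("legal/contracts", "sensitive")]
ok = [guard.record(r, s) for r, s in calls]
# ok -> [True, True, True, True, False]: each call passed in isolation,
# but the fourth distinct sensitive resource pushed the session over budget
```

A production version would weight resources by classification and decay the count over time, but the principle is the same: the unit of risk is the session, not the query.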
Cross-client contamination occurs in multi-tenant deployments or in systems where an agent serves multiple users. An agent with access to client A's data that subsequently serves client B can carry context across session boundaries if session isolation is not enforced at the retrieval layer.
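Enforcing that isolation at the retrieval layer can be as simple as binding each session to one tenant and rejecting any record tagged with another, before it can enter the context. The tenant tags, field names, and exception type here are hypothetical.

```python
# Tenant isolation sketch: a session is bound to one tenant, and any
# record tagged with a different tenant is rejected at retrieval time.

class TenantIsolationError(Exception):
    pass

def isolate(session_tenant, records):
    """Pass records through only if every one belongs to the session's tenant."""
    for rec in records:
        if rec.get("tenant") != session_tenant:
            raise TenantIsolationError(
                f"record {rec.get('id')} belongs to {rec.get('tenant')}; "
                f"session is bound to {session_tenant}"
            )
    return records

# A session serving client B must never surface client A's records,
# even if an earlier session for client A populated a shared cache.
cached = [{"id": 7, "tenant": "client_a", "body": "..."}]
blocked = False
try:
    isolate("client_b", cached)
except TenantIsolationError:
    blocked = True  # contamination stopped before reaching the context
```

The design choice that matters is where the check lives: inside the retrieval path, where it cannot be bypassed by whatever prompt or session state the agent happens to carry.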
What Effective LLM Security Actually Requires
Prompt injection defences and output scanning remain necessary components of an enterprise LLM security stack. But they are not sufficient for organisations running agentic AI deployments with access to sensitive data systems.
Effective security at the access layer requires identity assignment for AI agents, so that each agent has a defined role with associated access permissions rather than inheriting the permissions of the user or system that spawned it. It requires sensitivity classification for data sources, so that access policies can be enforced based on the nature of the data being requested rather than just the identity of the requester. And it requires runtime monitoring at the retrieval layer, so that anomalous access patterns are detected and risk events are raised before exposure becomes a breach.
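The three requirements above compose naturally into a single authorisation check: the agent's own role (not its spawning user's), the sensitivity label of the requested source, and an audit trail that runtime monitoring can watch for anomalous patterns. The role names, labels, and log shape below are illustrative assumptions, not a prescribed design.

```python
# Access-layer sketch: agent identity + data sensitivity + an audit
# trail for runtime monitoring, combined in one authorisation check.

# Each agent role carries its own clearance, independent of the user
# or system that spawned the agent (illustrative roles and labels).
ROLE_CLEARANCE = {"onboarding-assistant": {"public", "internal"}}

def authorise(agent_role, source_sensitivity, audit_log):
    """Decide on the request and record the decision for monitoring."""
    allowed = source_sensitivity in ROLE_CLEARANCE.get(agent_role, set())
    audit_log.append((agent_role, source_sensitivity, allowed))
    return allowed

log = []
first = authorise("onboarding-assistant", "internal", log)     # permitted
second = authorise("onboarding-assistant", "restricted", log)  # denied
# The log feeds runtime monitoring: a burst of denials (or of unusually
# broad permitted reads) for one agent becomes a risk event before
# exposure becomes a breach.
```

Policy is enforced on the nature of the data and the identity of the agent, and every decision, permitted or denied, leaves a record the monitoring layer can act on.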
The organisations that will manage enterprise AI risk most effectively in 2026 are those that have moved their security thinking from the output layer to the access layer. The question is no longer only what the model says. It is what the model is allowed to see.
March 16, 2026
