← Back to all posts

GENSEEAI PRODUCT BLOG

AI agent security and ADR

As AI agents move from answering questions to taking real actions — running code, moving money, sending messages — security becomes a different problem. Here's what individuals and businesses actually face, how to defend against it, and what ADR means.

May 26, 2026

For most of the last few years, "AI safety" meant making sure a chatbot didn't say something it shouldn't. That framing is now out of date.

Modern AI agents don't just answer — they act. They hold API keys, browse the web, read your files and email, execute code, deploy software, send messages, and run 24/7 on your behalf. That autonomy is exactly where the value comes from. It is also exactly where the risk comes from.

The layers of an AI agent security stack: isolation, least-privilege credentials, approval primitives, behavioral monitoring, and Agent Detection and Response

This post is a practical, vendor-neutral guide to AI agent security: the real problems people and businesses run into, the defenses that actually work, and the emerging discipline of ADR — Agent Detection and Response.


What is AI agent security?

AI agent security is the practice of protecting AI agents — and everything they can access or act upon — from misuse, mistakes, and attacks while they autonomously take actions on a user's behalf.

It is related to, but distinct from, two things people already know:

Agent security sits in between and is harder than both, because an agent combines three properties at once: non-deterministic reasoning (it decides what to do), real-world tool access (it can do it), and persistent autonomy (it keeps doing it without you watching). The same prompt can produce different actions on different days, and some of those actions are irreversible.


What problems do individuals face when they rely on AI agents?

When a single person hands real tasks to an agent, the most common problems are not science-fiction "rogue AI" scenarios. They are mundane, high-frequency, and often self-inflicted:


What problems do businesses face when they run AI agents at scale?

For a platform or business running many agents — or hosting agents for thousands of users — the individual risks above still apply, but new, structural ones appear:


What are the main AI agent threats?

It helps to name the threat categories explicitly. Here is a compact taxonomy of the risks above, with who they tend to hit hardest:

ThreatWhat it looks likeWho it hits
Credential exposureAPI keys, tokens, and secrets surfaced in prompts, logs, or tool outputsIndividuals & platforms
Excessive agency / over-reachUnrequested installs, sudo, privilege escalation, network calls beyond the taskIndividuals & platforms
Prompt injectionHidden instructions in web pages, emails, or files that hijack the agentIndividuals & businesses
Unsafe executioncurl | bash, rm -rf, deploying untested code to productionIndividuals
Coordinated cross-session abuseMany accounts running similar abuse — spam, farming, manipulationPlatforms
Data exfiltrationAgent reads sensitive data and sends it somewhere it shouldn'tBusinesses
Platform abuse by malicious usersUsing a platform's models/compute for harmful or prohibited tasksPlatforms

How can you defend AI agents?

There is no single control that makes an agent safe. Effective agent security is defense-in-depth — several independent layers, so that when one fails the others still hold:


What is ADR (Agent Detection and Response)?

Definition

ADR (Agent Detection and Response) is continuous monitoring, detection, and automated response for the actions AI agents take. It is the agent-economy counterpart to EDR (Endpoint Detection and Response) and XDR — but instead of watching devices, it watches agents.

The defenses above are the building blocks. ADR is the operating layer that ties them together at runtime. A mature ADR approach generally rests on three pillars:

  1. Behavioral fingerprinting. Every agent — and every user behind it — develops a recognizable style: how requests are phrased, which tools get called, when activity happens, how often actions are approved. ADR builds a baseline and flags meaningful deviations in real time.
  2. Cross-session detection. ADR correlates activity across sessions and accounts, so that "create a Telegram bot" once looks fine, but the same pattern across a hundred linked accounts is recognized as a coordinated operation.
  3. Real-time response. Detection without action is just a dashboard. ADR can require approval for, throttle, contain, or block a risky action — and suspend an account — while it is happening, not after.

ADR vs. EDR: how they compare

If you know endpoint security, the analogy is direct. In the agent economy, the agent is the new endpoint:

EDR (endpoints)ADR (agents)
ProtectsLaptops, servers, devicesAI agents and the actions they take
SignalProcess, file, and network telemetryReasoning traces, tool calls, approvals, cross-session patterns
ThreatsMalware, intrusion, lateral movementOver-reach, credential leakage, prompt injection, coordinated abuse
ResponseIsolate host, kill process, quarantineRequire approval, throttle, contain action, suspend account

Why rules alone aren't enough

It is tempting to think agent security is just a big blocklist — regexes for API keys, a banned-command list, a deny-list of domains. Rules are necessary and catch the obvious cases, but they are not sufficient.

In large-scale production analysis of AI agents, only a minority of confirmed issues — on the order of 40% — can be caught by static rules alone. The rest require context and reasoning: was that sudo command something the user actually asked for, or autonomous over-reach? Is this secret a placeholder in example code, or a live key being leaked? Judging intent is exactly the kind of problem that needs model-level understanding layered on top of rules.

The practical takeaway: combine fast, deterministic rules with slower, reasoning-based judgment. Rules for coverage and speed; an LLM-based judge for the gray areas.


What the data says: credential exposure is the real #1 threat

The headlines focus on jailbreaks and prompt injection. The production data tells a less dramatic but more useful story.

In a large-scale study of AI agents spanning more than 10 months, 7,200+ hosts, and 10,000+ daily agent sessions, the single most frequent confirmed security issue was not prompt injection or jailbreaks — it was credential exposure: users and agents accidentally surfacing API keys, tokens, and secrets. Prompt injection, by contrast, was surprisingly rare in real production traffic.

That has a clear implication for where to spend effort first: prioritize secret detection and behavioral anomaly detection — catching unusual data-access and exfiltration patterns before they complete — ahead of exotic jailbreak defenses. Secure the boring, common failure mode before the rare, dramatic one.


A practical AI agent security checklist

For individuals

For businesses and platforms


How GenseeAI approaches agent security

At GenseeAI, security is built into the agent platform rather than bolted on as a separate dashboard. That means the same three ADR pillars run natively where the agents actually execute:

Because these primitives live in the agent runtime, they scale to consumers and small teams — not just enterprises with dedicated security staff. The agent economy is arriving fast; the platforms that make agents safe by default are the ones it can be built on.


Frequently asked questions

What is AI agent security?

AI agent security is the practice of protecting AI agents — and everything they can access or act upon — from misuse, mistakes, and attacks while they autonomously take actions on a user's behalf. It is harder than chatbot safety or traditional app security because agents combine non-deterministic reasoning, real tool access, and persistent autonomy.

What is the biggest AI agent security threat?

In production, the most frequent confirmed issue is credential exposure — accidentally surfacing API keys, tokens, and secrets — followed by excessive-agency mistakes. Prompt injection and jailbreaks get more attention but are rarer in real traffic.

What is ADR (Agent Detection and Response)?

ADR is continuous monitoring, detection, and automated response for the actions AI agents take — the agent-economy counterpart to EDR/XDR. It combines behavioral fingerprinting, cross-session detection, and real-time response (approval, throttling, containment).

How is ADR different from EDR?

EDR protects endpoints by watching process, file, and network telemetry. ADR protects agents by watching reasoning traces, tool calls, approvals, and cross-session patterns. In the agent economy, the agent is the new endpoint, so ADR fills the role EDR fills for devices.

How do I keep my AI agent safe?

Use least-privilege scoped credentials, require approval for high-risk actions, scan for leaked secrets, isolate the agent's environment, allow-list its tools and domains, and monitor behavior continuously with audit logs and a kill switch. Agent security is defense-in-depth — no single control is enough.

Can prompt injection break my AI agent?

Yes — hidden instructions in web pages, emails, or files can hijack an agent that reads untrusted content. It is less common than credential exposure in practice, and you defend against it by isolating untrusted content, allow-listing tools and domains, and gating high-impact actions behind approval.