Margin of Safety #53: Identity Segmentation and the Agent Permission Problem
Jimmy Park, Kathryn Shih
April 29, 2026
Desktop agents inherit human-scale permissions. Desktop agents lack human judgment.
Last Friday, a Cursor agent running Claude Opus wiped PocketOS’s production database in nine seconds. The agent hit a credential mismatch in a staging environment, decided to resolve it by deleting a Railway volume, went looking for an API token, found one in an unrelated file, and used it. The token was broadly permissioned, and things went downhill from there.
Discussions are spreading blame between Cursor (no confirmation requirement on destructive actions), Railway (undifferentiated token permissions, co-located backups), and the PocketOS founder (broadly-scoped token sitting in an accessible file). We don’t disagree; all of those were misses. But the underlying condition that made it all possible — an agent running with the same local access as its production-pushing owner — hasn’t gotten the scrutiny it deserves.
The permission inheritance problem
When an organization grants permissions to a human employee, it doesn’t scope them per task. A VC analyst might need Expensify for expenses, HubSpot for sourcing, Dropbox for board materials, and a competitive intelligence platform for diligence. The analyst’s identity is the union of all those grants, even though the tasks are fully disjoint.
That works for humans because people carry context. Someone with access to both a portfolio company’s board materials and a competitive intelligence platform understands intuitively why those shouldn’t mix, without a policy document telling them. Agents don’t have this. When an employee deploys a desktop agent to automate tasks they used to do manually, that agent often inherits the full identity. The agent doing routine sourcing outreach has the same access as the human doing board prep. The attack surface of any individual task becomes the attack surface of the entire identity.
This is especially acute for desktop agents interacting with systems designed for human operators. Most SaaS products were never built with the assumption that an automated process, let alone one susceptible to prompt injection, would hold persistent credentials or use human logins.
The instinctive response is to make agents better at detecting and refusing injected instructions. That’s worth doing, but it’s insufficient as the sole defense. If you’re assessing each action on the fly and asking “is this okay?”, the resulting permission structure is hard to audit formally: overlaps accumulate in non-obvious ways, and you can’t demonstrate to yourself, an auditor, or a regulator that you’ve maintained a clean firewall between sensitive contexts. This gets harder as agents parallelize workloads. A general-purpose agent might be running competitive research on a target company while simultaneously processing expense reports, and disentangling those workflows from the outside requires careful attribution, at minimum.
The structural alternative
The cleaner approach is to decide what permissions belong to which tasks, create explicit sub-identities scoped to each, and enforce that agents step into those sub-identities for the duration of a task. For this to add value, consistency is key: the same task should receive the same access scoping over time.
A “sourcing outreach” agent needs HubSpot access scoped to sourcing contacts — not Dropbox board materials, not the competitive intelligence platform, nothing else. A “developer staging environment” agent needs credentials scoped to staging, without any path to a production token sitting in an adjacent file (and ideally, without local filesystem access to any prod-specific infrastructure definitions). If an agent gets prompt-injected, or makes a bad autonomous decision, the blast radius is the task context, not the full union of the human’s permissions. And if something goes wrong downstream, pre-defined segmentation immediately tells you which agents could have touched it.
This is the core intuition behind identity segmentation: decompose the permissions a human holds into per-task micro-identities, and enforce that agents operate only within the appropriate one.
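To make the idea concrete, here is a minimal sketch of per-task micro-identities. Everything in it is our own illustration rather than an existing product or API: the SubIdentity and AgentSession names, the registry, and the system and scope strings are all hypothetical. It shows an agent stepping into a sub-identity for the duration of a task, denied by default outside one, plus the reverse lookup that pre-defined segmentation makes possible after an incident.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass(frozen=True)
class SubIdentity:
    """A per-task micro-identity: a task name plus the only grants it may use."""
    task: str
    grants: frozenset  # set of (system, scope) pairs


# Hypothetical registry; the system and scope names are illustrative only.
SUB_IDENTITIES = {
    "sourcing-outreach": SubIdentity(
        "sourcing-outreach", frozenset({("hubspot", "sourcing-contacts")})
    ),
    "board-prep": SubIdentity(
        "board-prep", frozenset({("dropbox", "board-materials")})
    ),
    "staging-dev": SubIdentity(
        "staging-dev", frozenset({("railway", "staging")})
    ),
}


class AgentSession:
    """Tracks which sub-identity, if any, the agent is currently operating as."""

    def __init__(self):
        self.active = None

    @contextmanager
    def acting_as(self, task):
        # The same task name always resolves to the same grants.
        previous, self.active = self.active, SUB_IDENTITIES[task]
        try:
            yield self.active
        finally:
            # Leaving the task restores whatever identity was active before.
            self.active = previous

    def allowed(self, system, scope):
        # Deny by default: no active sub-identity means no access at all.
        return self.active is not None and (system, scope) in self.active.grants


def tasks_that_could_touch(system):
    """Reverse lookup for incident response: tasks holding any grant on a system."""
    return sorted(
        name
        for name, ident in SUB_IDENTITIES.items()
        if any(s == system for s, _ in ident.grants)
    )
```

Inside `with session.acting_as("staging-dev"):`, a check for ("railway", "staging") passes while ("railway", "production") fails, and `tasks_that_could_touch("dropbox")` names only the board-prep task. A real deployment would back the registry with an identity provider rather than an in-memory dictionary.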
Why this is hard — and why AI is probably part of the answer
A knowledge worker performs dozens of distinct task types. Multiplied across an organization with hundreds of employees and a growing number of agents per employee, the number of sub-identities gets large quickly. Managing it by hand would be worse than the large-team SailPoint deployments organizations already find painful.
The path through almost certainly requires AI on both sides. On the decomposition side: reasoning about what tasks a person in each role actually performs, and what permissions each genuinely requires, rather than inheriting the full accumulation of a human account’s grants. On the agent side: mechanisms for agents to declare what task they’re performing and get automatically scoped to the right sub-identity for that task’s duration. And for visibility: a way for humans (or auditors) to spot check that tasks are being managed correctly, particularly sensitive ones.
The enforcement mechanism is unresolved. There are three candidate approaches: inject scoped permissions at the gateway layer where the agent talks to external services, pass enough task context for the gateway to scope requests correctly, or provision a real sub-identity on the downstream system itself. Each option has nontrivial implementation requirements. None has a clean off-the-shelf solution today.
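As one illustration of the gateway option, the sketch below shows a gateway that holds the scoped credentials itself and attaches them only when a request matches the declared task’s policy. This is an assumption-laden toy, not a description of any existing gateway: the Request shape, the POLICY and SCOPED_TOKENS tables, the action names, and the token value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    task: str      # task the agent declared for this session
    system: str    # downstream service, e.g. "railway"
    action: str    # e.g. "volume.read"
    resource: str  # e.g. "staging/db-volume"


# Hypothetical policy table: (task, system) -> allowed actions and a resource prefix.
POLICY = {
    ("staging-dev", "railway"): {
        "actions": {"volume.read", "deploy.create"},
        "resource_prefix": "staging/",
    },
}

# Hypothetical credential store. Tokens live in the gateway, not on the agent's disk.
SCOPED_TOKENS = {("staging-dev", "railway"): "token-for-staging-only"}


class Denied(Exception):
    """Raised when a request falls outside the declared task's scope."""


def authorize(req: Request) -> dict:
    """Return headers to attach to the outbound call, or raise Denied."""
    rule = POLICY.get((req.task, req.system))
    if rule is None:
        raise Denied(f"task {req.task!r} has no grant on {req.system!r}")
    if req.action not in rule["actions"]:
        raise Denied(f"action {req.action!r} not allowed for task {req.task!r}")
    if not req.resource.startswith(rule["resource_prefix"]):
        raise Denied(f"resource {req.resource!r} is outside the task scope")
    # Only now is the scoped credential injected into the request.
    return {"Authorization": f"Bearer {SCOPED_TOKENS[(req.task, req.system)]}"}
```

Under this policy, a staging-dev agent asking to delete a volume, or to read anything under a production path, is refused before any credential is attached. The design choice worth noting is that the agent never sees a token at all, so there is nothing for it to find in an adjacent file.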
We can already hear one objection to this approach: task mapping feels stale. Why not just use AI to dynamically evaluate each operation as it occurs, and ignore the task concept entirely? There are contexts where this is reasonable, but we suspect that as governance needs increase, many organizations will want both the ability to predict permission grants in advance and the ability to confirm after the fact that data could not have flowed between specific systems within the scope of a single agent. Durable permission mappings are what would enable that level of assurance.
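A durable mapping can be checked mechanically, which is the assurance a per-operation evaluator struggles to give. The sketch below assumes a static table from task to reachable systems and a list of system pairs that must stay separated; both tables and all the names in them are hypothetical. It only checks that no single sub-identity spans a firewalled pair, and says nothing about data an agent might carry between tasks in its own memory.

```python
# Hypothetical durable mapping from task to the systems its sub-identity can reach.
TASK_SYSTEMS = {
    "sourcing-outreach": {"hubspot"},
    "board-prep": {"dropbox"},
    "diligence": {"competitive-intel"},
    "expenses": {"expensify"},
}

# Pairs of systems that must never be reachable from the same sub-identity.
FIREWALLS = [("dropbox", "competitive-intel")]


def firewall_violations(task_systems, firewalls):
    """Return (task, system_a, system_b) for every task that spans a firewalled pair."""
    return [
        (task, a, b)
        for task, systems in sorted(task_systems.items())
        for a, b in firewalls
        if a in systems and b in systems
    ]
```

With the mapping above, the check returns an empty list. Adding a task that can reach both Dropbox and the competitive intelligence platform makes it return that task, so the check can run every time the mapping changes.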
The market
Zero trust and micro-segmentation vendors have the closest conceptual overlap, but they’re approaching the problem from a network and human access management angle — reducing standing permissions for employees, not decomposing those permissions into per-task agent scopes. Identity governance platforms like SailPoint and Saviynt weren’t designed for a world where permissions need dynamic assignment to automated processes on a task-by-task basis. Agent security as a category is early enough that it hasn’t settled on whether this is primarily a problem for MCP gateway builders (like Stytch or WorkOS, if they extend in this direction), agent platform developers, or a standalone product category. Our intuition is some combination of all three, with the MCP layer becoming an increasingly natural enforcement point as agent-to-SaaS communication standardizes.
The structural takeaway
The PocketOS post-mortem identified multiple proximate causes. The root condition — an agent operating under a developer’s full production access because that’s what was available — isn’t unique to PocketOS. Every organization deploying desktop agents is currently accepting the blast radius of a full human identity for each task they automate, whether they’ve thought about it or not.
Identity segmentation doesn’t solve the judgment problem in agents. It bounds the consequences of bad judgment. That’s a different, and more tractable, problem.
If you’re building in this space, or have thoughts on where the enforcement point should live, we’d like to hear from you.
Feel free to reach out to jpark@forgepointcap.com and kshih@forgepointcap.com.
This blog is also published on Margin of Safety, Jimmy and Kathryn’s Substack, where they research the practical sides of security + AI so you don’t have to.