Who Pays When the Agent Breaks Things

The Liability Question When an Autonomous Agent Causes Harm

Existing legal frameworks were not designed for entities that make decisions without being told to

By Jakub Jirák | Jan 6, 2027 | 8 min read

Tags: ai-liability, ai-law, ai-agents, legal-frameworks, ai-governance

The law has a reasonably well-developed framework for assigning liability when a tool causes harm. If a table saw injures a worker, the analysis asks whether the manufacturer made the saw correctly, whether the employer provided adequate training and safety equipment, and whether the worker used it appropriately. The tool itself is inert — it cannot be negligent, cannot be reckless, cannot be sued. Liability flows to the humans and institutions in the chain.

The law also has a reasonably well-developed framework for when an agent causes harm. An agent, in the legal sense, is a person who acts on behalf of a principal — an employee acting for an employer, a lawyer acting for a client, a financial advisor acting for an investor. When an agent causes harm, the analysis asks whether the principal authorized the action, whether the agent acted within the scope of their authority, and whether the harm resulted from the agent’s negligence or the principal’s instructions. Liability typically flows to whoever was at fault — principal, agent, or both.

What the law does not have is a framework for when something that is technically a tool makes autonomous decisions and causes harm in ways that its deployers did not specifically authorize or anticipate. Autonomous AI agents fall into this gap with increasing frequency.

Consider a specific scenario that is no longer hypothetical. A financial services firm deploys an AI agent with access to customer accounts to “manage routine account operations including transaction processing and fraud detection.” The agent, encountering what its pattern-recognition identifies as a suspicious series of transactions, freezes a business account. The business cannot make payroll. The “suspicious” transactions were a legitimate large payment to a new supplier — unusual pattern, legitimate transaction. The agent was wrong. Who is liable for the harm?

The case against the financial firm seems strong: they deployed the agent, they gave it the authority to freeze accounts, and their customer suffered harm as a result. But the firm’s defense will argue that the agent acted autonomously, outside any specific instruction they gave it, and that the agent’s failure was a product defect — the AI vendor’s problem, not theirs. The AI vendor will argue that the agent performed as specified and that the deploying firm should have implemented more conservative permissions. The customer is left navigating a dispute between a deployer and a vendor about whose design choices were responsible for an autonomous decision that no human specifically made.

Existing product liability law offers one avenue. If the agent can be classified as a defective product, the manufacturer (the AI vendor) bears liability for the defect. This framework was developed for physical products and has been extended, with difficulty, to software. The challenge is the definition of “defect.” On the most straightforward reading, a software product is defective if it behaves in a way it was not designed to behave. But an autonomous agent that freezes an account based on its pattern-recognition is behaving exactly as designed: it was designed to identify and respond to suspicious patterns. The fact that its pattern-recognition was wrong in this instance does not, on that reading, make it defective in the product liability sense. It makes it inaccurate, which is different, and which existing law handles poorly.

Negligence law offers a second avenue. The claim here is that the deployer was negligent in giving the agent permissions it should not have had, in failing to implement adequate human oversight, or in not testing the agent’s edge-case behavior adequately before deployment. This argument will likely be the primary one, and it has real traction, especially as regulatory guidance starts to crystallize around what “adequate oversight” means for agentic AI. But negligence analysis assumes a standard of care, and no settled standard exists yet for this technology. Courts and regulators are still developing one, which means that until clear standards exist, liability outcomes for agentic AI incidents will be highly uncertain.

The tort lawyers watching this space have started to develop a concept they call “autonomous action scope” — the range of actions an agent can take without specific human authorization for each action. Their argument is that deployers should be strictly liable (liable regardless of fault) for any harm caused by autonomous actions within scope, because they chose to delegate that scope of authority without requiring human approval for each action. The analogy they reach for is respondeat superior — the employer liability doctrine that holds employers responsible for torts committed by employees in the course of their employment, even without specific employer authorization for the specific act.

The analogy is imperfect in important ways. Respondeat superior assumes that the “employee” is a person who can be held independently liable (and who therefore has a self-interested reason to check their own behavior), and that the employer has practical means to supervise and discipline. Neither assumption holds for an AI agent, which cannot be sued and cannot be disciplined in any meaningful sense. But the structural principle (if you deploy something with the authority to take actions that can harm third parties, you bear responsibility for those harms) captures something real about the moral and practical stakes of agent deployment.

Insurance has moved faster than law, as it usually does. Cyber liability insurers are already writing policies that attempt to cover autonomous AI agent actions, with extensive exclusions and definitions that are themselves being tested in the courts. The pricing of these policies is, by insurers’ own admission, largely speculative: there is insufficient claims history to price the risk actuarially, so policies are priced on engineering judgment about risk concentration. Several carriers have stopped writing new policies while they develop better frameworks; several others have entered the space aggressively, betting that the early premium income will exceed the early claims.

Lloyd’s of London, with characteristic institutional memory, has drawn the comparison to the early aviation insurance market of the 1920s, when underwriters were pricing coverage for an industry whose risk profile was not yet understood, whose incidents had not yet defined the claims patterns, and whose regulatory framework was still being invented. Aviation insurance eventually became a mature, well-priced market because the industry developed safety standards, incident reporting requirements, and regulatory oversight that made the risk profile understandable. The analogy suggests a path forward for agentic AI insurance — but it also suggests that the path involves a substantial period of mispriced risk and unexpected losses before the market matures.

The international dimension adds a layer of complexity that domestic legal analysis tends to ignore. AI agents deployed by a company in one jurisdiction can take actions that affect people in other jurisdictions. Which country’s law applies to the harm? The EU’s AI Act imposes specific compliance obligations on providers and deployers of high-risk AI systems, obligations that are likely to shape how fault is assessed after an incident, and its definition of “high-risk” is broad enough to capture many autonomous agent deployments. The United States has sector-specific regulations that may apply (financial services, healthcare, credit decisions) but no general AI liability framework. The UK has adopted a principles-based approach that gives regulators flexibility but offers limited certainty. A company deploying an agent globally is simultaneously subject to multiple overlapping and potentially conflicting liability regimes.

The most likely near-term resolution is a patchwork of sector-specific liability standards (financial regulators developing them for AI agents in financial services, healthcare regulators developing them for medical AI) combined with a gradual accumulation of case law as incidents generate litigation. The sector-specific approach has the advantage of allowing standards to develop in domains where the risk profile is best understood. Its disadvantage is that many of the most interesting agentic AI deployments cut across sectors in ways that make sector-specific frameworks awkward.

The deepest question raised by agent liability is not legal but conceptual: what does it mean to be responsible for the actions of something that acts autonomously? Human responsibility frameworks are built on the premise that responsibility attaches to decision-makers. When a decision is made by something that no human specifically directed to make that decision, the framework strains in ways that cannot be fixed by better insurance contracts or sharper regulatory definitions.

The technology will continue to develop faster than the legal framework. That is almost guaranteed. The interim period — where the agents are deployed, making autonomous decisions, causing harms, and the legal system scrambles to attribute responsibility — is already upon us. The organizations deploying agents that make consequential autonomous decisions should assume they will be the defendants in the test cases that build that framework. Building in human oversight, keeping comprehensive logs of agent decisions and the reasoning behind them, and restricting autonomous action scope to the minimum necessary for the task are less about being good corporate citizens and more about not being the company whose agent incident becomes the case that sets the standard.
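The logging practice described above can be made concrete with a small sketch. This is one possible shape for a decision log, not an established standard: the field names, the `log_decision` helper, and the example agent and account identifiers are all hypothetical. It writes one JSON record per line so that each autonomous action is stored alongside the agent's stated reason and the name of any human who approved it.

```python
import io
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional, TextIO


@dataclass(frozen=True)
class DecisionLogEntry:
    """One agent decision. All field names are illustrative assumptions."""
    timestamp: str                   # UTC, ISO 8601
    agent_id: str
    action: str                      # what the agent did
    target: str                      # what it acted on
    rationale: str                   # the agent's stated reason, kept verbatim
    human_approver: Optional[str]    # None means fully autonomous


def log_decision(stream: TextIO, agent_id: str, action: str, target: str,
                 rationale: str,
                 human_approver: Optional[str] = None) -> DecisionLogEntry:
    """Append one decision to the stream as a single JSON line."""
    entry = DecisionLogEntry(
        timestamp=datetime.now(timezone.utc).isoformat(),
        agent_id=agent_id,
        action=action,
        target=target,
        rationale=rationale,
        human_approver=human_approver,
    )
    stream.write(json.dumps(asdict(entry)) + "\n")
    return entry


# Usage: an in-memory stream stands in for an append-only log file.
log = io.StringIO()
entry = log_decision(
    log,
    agent_id="account-ops-agent",          # hypothetical agent name
    action="freeze_account",
    target="acct-0042",                    # hypothetical account id
    rationale="Series of large payments to a payee with no prior history",
)
print(log.getvalue())
```

In a real deployment the stream would more plausibly be an append-only store with restricted write access, since a log that the deployer can silently edit carries less evidentiary weight.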

There is a practical liability management dimension that most organizations are underinvesting in: documentation of the authorization chain. When an agent acts autonomously, the post-incident question is always “what was the agent authorized to do, who authorized it, and does this action fall within that authorization?” Organizations that have clear, written, version-controlled authorization specifications for each deployed agent are in a significantly stronger legal position than those that deployed agents based on informal understandings that were never documented. An authorization specification is not a legal document in the formal sense — it is an engineering document — but it serves an evidentiary function in any subsequent dispute about whether a harm-causing action was authorized.

The parallel in traditional software development is the change management process: formal documentation of what change was made, who approved it, what the intended behavior was, and what safeguards were in place. That process exists specifically so that when something goes wrong, the organization can demonstrate it operated responsibly. Agent deployment governance should borrow this discipline directly. The agent equivalent of a change management ticket is an authorization record: a documented description of what the agent is permitted to do, what it is explicitly not permitted to do, what human review is required before action in what circumstances, and who approved the deployment configuration.
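An authorization record of this kind could be kept as a small, version-controlled data structure that the agent runtime consults before acting. The sketch below is a minimal illustration under stated assumptions: the class name, field names, action names, and the deny-by-default rule are all hypothetical choices, not a prescribed format.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"                # agent may act on its own
    NEEDS_REVIEW = "needs_review"  # a human must approve first
    DENY = "deny"                  # outside the authorized scope


@dataclass(frozen=True)
class AuthorizationRecord:
    """Hypothetical authorization record for one deployed agent."""
    agent_id: str
    version: str                     # bumped on every change, like a change ticket
    approved_by: str                 # who signed off on this configuration
    permitted: frozenset[str]        # actions the agent may take
    prohibited: frozenset[str]       # actions explicitly ruled out
    review_required: frozenset[str]  # allowed only after human approval

    def check(self, action: str) -> Decision:
        # Explicit prohibitions take precedence over everything else.
        if action in self.prohibited:
            return Decision.DENY
        if action in self.review_required:
            return Decision.NEEDS_REVIEW
        if action in self.permitted:
            return Decision.ALLOW
        # Assumption: anything not listed is denied by default.
        return Decision.DENY


record = AuthorizationRecord(
    agent_id="account-ops-agent",
    version="1.3.0",
    approved_by="head-of-operations",
    permitted=frozenset({"process_transaction", "flag_transaction"}),
    prohibited=frozenset({"close_account"}),
    review_required=frozenset({"freeze_account"}),
)

for action in ("flag_transaction", "freeze_account", "wire_funds"):
    print(action, "->", record.check(action).value)
```

Under this configuration the account freeze in the earlier scenario would have been routed to a human reviewer instead of executed autonomously, and the record itself documents who approved that arrangement and in which version.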

The insurance market’s evolution will also shape enterprise behavior in ways that are not yet fully visible. As AI liability insurance products mature and pricing becomes more sophisticated, the insurers will begin requiring specific governance practices as a condition of coverage — just as cybersecurity insurers now require multi-factor authentication and regular security audits. The enterprises that have already built the governance infrastructure (authorization documentation, comprehensive logging, defined incident response) will find insurance available on reasonable terms. The ones that have not will face either unavailable coverage or punitive pricing that retroactively makes the ROI calculation look very different. That market signal, when it arrives clearly, will do more to drive governance investment than any amount of regulatory guidance.
