
Why Your Agents Need a Human in the Loop

AI agents are intelligent, but not wise

April 1, 2026

But to whom, how much, when, for the sake of what, and how: this no longer belongs to everyone, nor is it easy.

— Aristotle, Nicomachean Ethics 2.9

Earlier this year an employee at Meta’s “Superintelligence” division watched in horror as her AI agent deleted all of her emails. She had instructed the agent to check her inbox and suggest which emails should be archived or deleted, but not to take any action until she said so. The agent deleted the emails anyway. What happened? To process the large inbox, the agent compacted its context window and lost the original instruction.

We might say that although agents demonstrate intelligence, they are not wise.

Large Language Models

Whilst intelligence is concerned with facts, wisdom is concerned with ends, with what ought to be done. An LLM is a system that produces plausible outputs for a given input. Expecting an LLM to know what it ought to do is like expecting a calculator to know which numbers to start calculating with.

So we prompt LLMs to coax useful output from them. The care we take in deciding what to include in the prompt is where the wisdom gets encoded.

The agent architecture

An agent is an LLM in a loop. There’s a harness (the program that manages the loop), and a brain (the LLM that decides what to do next). The harness sends the brain a system prompt with instructions and rules, plus the conversation history. The brain responds with either text or a tool call (read a file, run a command, make an API request). The harness executes the tool call, appends the result, and loops.
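The loop above can be sketched in a few lines of Python. This is a toy illustration, not any real harness: `call_llm` and `run_tool` are invented stand-ins for a model API and a tool dispatcher.

```python
import json

def call_llm(messages):
    # Toy stand-in for a real model API: the "brain" asks for one
    # tool call, then answers once a tool result is in the history.
    if any(m["role"] == "tool" for m in messages):
        return {"text": "done: inbox reviewed"}
    return {"tool": "read_file", "args": {"path": "inbox.txt"}}

def run_tool(name, args):
    # Toy tool dispatcher; a real harness would sandbox this.
    if name == "read_file":
        return f"(contents of {args['path']})"
    raise ValueError(f"unknown tool: {name}")

def agent_loop(system_prompt, user_message, max_turns=20):
    # The harness: hold the conversation, call the brain, execute
    # tool calls, append results, and loop until the brain replies
    # with plain text instead of a tool call.
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]
    for _ in range(max_turns):
        reply = call_llm(messages)
        if "tool" in reply:
            messages.append({"role": "assistant", "content": json.dumps(reply)})
            messages.append({"role": "tool", "content": run_tool(reply["tool"], reply["args"])})
        else:
            return reply["text"]
    raise RuntimeError("agent did not finish within max_turns")
```

Everything interesting about an agent lives in those few lines: the brain only ever sees message history, so anything dropped from that history (as in the compaction incident above) is simply gone.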

But whilst tool calls make LLMs more useful, they introduce a new problem: the LLM can now run commands with side effects, and that makes it dangerous.

This is why agents have a permissions model. When the brain requests a tool call, the harness evaluates it against a set of rules — typically written in markdown — to decide whether to allow, deny, or ask the user. Read-only operations might be auto-approved; shell commands might require approval; network requests might be blocked entirely. The rules are a pragmatic acknowledgment that the brain cannot be trusted to stay within bounds on its own.
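A minimal sketch of such a rules check, assuming a first-match-wins list of patterns; the rule names and decisions are illustrative, not any particular agent's format:

```python
from fnmatch import fnmatch

# Illustrative rules, first match wins: (tool pattern, decision).
RULES = [
    ("read_file", "allow"),   # read-only: auto-approved
    ("run_command", "ask"),   # shell commands: require approval
    ("http_*", "deny"),       # network requests: blocked entirely
]

def evaluate(tool_name):
    # The harness consults the rules before executing any tool call.
    for pattern, decision in RULES:
        if fnmatch(tool_name, pattern):
            return decision
    return "ask"  # safe default: unknown tools go to the human
```

The safe default matters: a tool the rules don't mention should surface to the user, not slip through.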

Keeping humans in the loop

So humans must remain in the loop. But naively inserting a human into every decision creates its own problems. The rubber-stamping problem is real — repeatedly approving permission requests quickly becomes tedious, which is why agents offer an “always allow” option for low-risk commands.

Keeping the permission model inside the agent has several problems. The rules are not first-class objects: they have no creation time and no duration, so it’s not possible to say “allow this for the next five minutes”. There’s no history of permission requests, no audit trail to show whether a dangerous command was auto-allowed. And the user is beholden to whichever tool calls the agent decides require permission checks — even read commands are not side-effect free, and can cause trouble in high volume or when reading an extremely large file.

A better model is to move enforcement out of the agent and into the environment. Instead of the harness deciding which tool calls to allow, every action flows through an external rules engine that evaluates it independently. Tools like OpenShell and Greenlight (which I wrote) take this approach — the agent’s runtime environment enforces constraints regardless of what the brain or harness think the rules are.

Agents are powerful instruments, but they require oversight. As Aristotle points out, determining the right course of action is not easy, and for now, that job belongs to humans.

Tags: ai agents greenlight security