AI Workflow

Loop Engineering: What Running Fifty-Odd Agents Taught Me About the Verifier

Teh Kim Guan, ACMA · CGMA
2026-06-24 · 10 min read

For the better part of a year I was running a roomful of autonomous agents on my own setup before I had a word for what I was doing. I just kept building things that woke up on their own. A briefing that read my task log at dawn. A curator that tidied my knowledge base while I slept. A scout that scanned the web before I was awake. One morning I counted them and the number was more than fifty.

Then the word arrived. I call it loop engineering. You stop typing each prompt and you start designing the loop the agent runs inside. A loop wakes on a trigger, does some work, checks the work, writes down what it did, and stops when a condition is met. The trigger can be a schedule, an event, an incident, or another agent finishing its run. Your job changes shape. You are no longer doing every step. You are designing the system that does the steps.
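That anatomy can be sketched in a few lines. This is a minimal sketch in Python under stated assumptions, not the setup described in this piece: `run_loop`, `do_work`, `check`, and the in-memory log are hypothetical names, and the trigger is assumed to be whatever calls `run_loop` (a scheduler, an event, another agent).

```python
from typing import Callable, List, Tuple


def run_loop(
    do_work: Callable[[], str],
    check: Callable[[str], bool],
    max_attempts: int = 3,
) -> Tuple[bool, List[str]]:
    """Run one triggered loop: work, check, log, stop on a condition."""
    log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        output = do_work()                                   # do some work
        passed = check(output)                               # check the work
        log.append(f"attempt {attempt}: passed={passed}")    # write down what it did
        if passed:                                           # stop condition met
            return True, log
    return False, log                                        # out of attempts: hand to a human


# Hypothetical work and checker, for illustration only.
drafts = iter(["", "Plan: 3 tasks"])
ok, log = run_loop(lambda: next(drafts), lambda out: out.startswith("Plan:"))
```

Note that the log line and the check are separate steps. Deleting the `check` call would leave a loop that still runs and still logs, which is the failure mode the rest of this piece is about.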

This sounds like a small move. It is not. It is the difference between operating a machine and building one.

From prompting to designing

Figure: Diagram contrasting a person prompting a chat box on the left with a self-triggering autonomous loop on the right.

Most people meet AI as a chat box. You ask, it answers, you ask again. That is fine for a single task. It does not scale to a working life, because you are still the trigger. Nothing happens unless you show up and type.

A loop removes you from the trigger. Take my morning briefing. The trigger is the clock. At eight it reads my task log and my inbox, builds a plan for the day, and hands it to me before I have opened anything. I did not prompt it. I designed it once, and now it runs every day whether I remember it or not.

Once you think in loops, you see them everywhere. The month-end close is a loop. A content calendar is a loop. An on-call rotation is a loop. The work was always shaped this way. AI just lets you wire the shape into something that runs without you holding the wire.

I have written before that your company is a graph of algorithms: the procedures you run every week are already code, you just have not written them down as code yet. Loop engineering is what you do after you accept that. You take the graph and you give each node a trigger, a memory, and a checker. The loops read and write a shared knowledge layer, which is its own discipline, described in "the compounding agent in production". This piece is about something narrower and, I now think, more important.

The part I got wrong

Building the autonomy was the easy part. That is the uncomfortable lesson.

Triggers are cheap. A cron line gives you a trigger. Memory is cheap. A file gives you memory. Doing the work is cheap too, because the model is good at the work. I could stand up a new loop in an afternoon and it would run for months.

What I got wrong was the verifier.

A loop is a generator wired to a checker. The generator produces the work. The checker decides whether the work is good enough to keep, to ship, to act on. For a long time I obsessed over the generator. I tuned prompts. I added context. I gave the loops better memory. None of that was my bottleneck. The generator was never my bottleneck.

A weak verifier does not fail loudly. It confidently produces poor work, unattended, hundreds of times, while every run looks like a success.

That last clause is the whole problem. When you sit at the chat box and the answer is wrong, you see it and you fix it. You are the verifier, in real time. The moment you remove yourself from the trigger, you also remove yourself as the checker. If you have not built a real checker to replace you, the loop does not stop being wrong. It just stops telling you.

I will be precise about my own setup, because the gap is instructive. I run fifty-odd enabled recurring agents. Almost every one of them has a trigger, does work, and writes a log. Very few of them actually verify anything. The honest few are the site-health checks: small loops that run after my auto-publishing loops and independently confirm that the pages actually shipped, render, and are not broken. Those are real verifiers. They check work they did not produce.
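A check of that kind might look like the following. It is a sketch only, assuming Python's standard library: `verify_published`, `fetch_page`, the example URLs, and the "expected phrase" test are hypothetical, and a phrase match is a crude stand-in for confirming that a page really renders.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List, Tuple


def fetch_page(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch a URL and return (status, body). Network failures become status 0."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""
    except OSError:  # covers URLError and socket timeouts
        return 0, ""


def verify_published(
    expected: Dict[str, str],
    fetch: Callable[[str], Tuple[int, str]] = fetch_page,
) -> List[str]:
    """Return a list of failures; an empty list means every page checked out.

    `expected` maps each URL to a phrase that must appear in the served page.
    """
    failures = []
    for url, phrase in expected.items():
        status, body = fetch(url)
        if status != 200:
            failures.append(f"{url}: status {status}")
        elif phrase not in body:
            failures.append(f"{url}: missing expected text")
    return failures


# Offline demonstration with a fake fetcher (hypothetical pages).
def fake_fetch(url: str) -> Tuple[int, str]:
    pages = {
        "https://example.com/a": (200, "<h1>Loop Engineering</h1>"),
        "https://example.com/b": (404, ""),
    }
    return pages.get(url, (0, ""))


failures = verify_published(
    {"https://example.com/a": "Loop Engineering", "https://example.com/b": "Verifier"},
    fetch=fake_fetch,
)
```

The point of the design is that the checker reads the published result from outside and never consults the publishing loop's own log.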

The rest mostly log. A log is not a verifier. A log records what happened. It does not judge whether what happened was correct. For a year I let the log feel like a check. It is not. That gap is the thing I am now fixing across the whole set.

Finance solved this a century ago

Figure: Illustration showing automated process nodes feeding into two separate independent review gates, representing segregation of duties.

Here is where my training as a management accountant did more for me than any AI reading. Finance solved this exact problem long before anyone wired a model to a trigger. It has a name. Segregation of duties.

The principle is plain. The person who books the journal entries never signs off that they are correct. Preparer and reviewer are different people, by design, not by accident. Management produces the accounts. The auditor attests to them. The separation is the control. It exists precisely because the preparer, however competent, cannot be trusted to find their own mistakes: they share the blind spot that produced the mistake.

Read that rule again with AI loops in mind. Do not let an agent verify its own work. That is segregation of duties. A finance function discovered it a century before I wired my first loop, and it is the same rule for the same reason.

The month-end close is the worked example I keep returning to, but here through the lens of the verifier, not the graph. I have walked the close as a chain of nodes before, in "your company is a graph of algorithms": cut-off, capture, accruals, revaluation, elimination, the trial balance, the statements, variance, the management pack. There I cared about which nodes a loop could take over. Here I care about one thing only: why two nodes stay human.

In that earlier piece the answer was regulation. The auditor sign-off and the CFO review stay human because policy fixes them there, outside the reach of automation. That is true, but it is the shallow reason. The deeper reason is the control itself. Even if no regulation existed, you would still keep those two nodes separate, because the preparer cannot be the reviewer. If the same process that booked the accruals also signs off that the accruals are right, you have no control. You have a confident generator marking its own homework. Segregation of duties is not a rule imposed from outside the loop. It is the only thing that makes the loop trustworthy from inside.

So I inverted how I build. When I wire an AI loop now, I design the verifier first, before I design the work. Exactly the way a controller designs the review before signing the close. I ask what the separate checker is, who or what it is, and why it does not share the generator's blind spot, before I write a single node of the work itself. If I cannot name the verifier, the loop does not get built. It gets a human gate instead.
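The verifier-first rule can be written down as a small build-time guard. This is a hedged sketch, assuming Python: `LoopSpec`, `gate_for`, and the agent names are hypothetical, and comparing names is only a rough proxy for "does not share the generator's blind spot", which still needs human judgment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LoopSpec:
    name: str
    generator: str                   # who or what produces the work
    verifier: Optional[str] = None   # who or what checks it; decided first


def gate_for(spec: LoopSpec) -> str:
    """Decide what stands between the generator's output and the world."""
    if spec.verifier is None or spec.verifier == spec.generator:
        # No independent checker named: the preparer would be the reviewer.
        return "human gate"
    return f"verified by {spec.verifier}"


# Hypothetical loops, for illustration only.
published = gate_for(LoopSpec("publisher", generator="draft-agent", verifier="site-health-check"))
self_checked = gate_for(LoopSpec("inbox", generator="triage-agent", verifier="triage-agent"))
unnamed = gate_for(LoopSpec("briefing", generator="briefing-agent"))
```

Here the self-checking loop and the loop with no named verifier both fall back to the human gate, and only the loop with a separate checker is allowed to run unattended.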

The named tension

This leaves me with an honest tension, and I would rather state it than dress it up.

Most of my fifty-odd loops are generators with logs. Only a few are generators with real, independent checkers. Every day they run, they widen the gap between work produced and work verified. The system looks productive. The logs are full. And I cannot, today, tell you that all of it is correct, because most of it was never checked by anything other than the thing that made it.

That is not a comfortable thing for an accountant to admit. It is the audit finding I would write about my own setup. The generators outran the verifiers, because generators are cheap to build and verifiers are not.

The consequence is real. An unverified loop that publishes content, classifies an inbox, or drafts a number is not a time-saver. It is a liability that compounds quietly, because the cost of a wrong output multiplied by hundreds of unattended runs is a number you only discover later, usually from someone else.
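The arithmetic behind that compounding is simple enough to sketch. The function below is an illustration in Python with made-up inputs, not measurements from any real loop: `expected_cost`, the error rate, the cost per error, and the verifier's catch rate are all assumptions.

```python
def expected_cost(runs: int, error_rate: float, cost_per_error: float,
                  catch_rate: float = 0.0) -> float:
    """Expected cost of wrong outputs that slip through unattended.

    catch_rate is the share of wrong outputs an independent verifier stops.
    A log alone stops nothing, so it defaults to 0.0.
    """
    if not (0.0 <= error_rate <= 1.0 and 0.0 <= catch_rate <= 1.0):
        raise ValueError("rates must be between 0 and 1")
    return runs * error_rate * (1.0 - catch_rate) * cost_per_error


# Illustrative numbers only: 300 runs, 2% wrong, 50 units of cost per wrong output.
log_only = expected_cost(runs=300, error_rate=0.02, cost_per_error=50.0)
with_verifier = expected_cost(runs=300, error_rate=0.02, cost_per_error=50.0,
                              catch_rate=0.9)
```

Under these assumed inputs the log-only loop carries roughly ten times the expected cost of the verified one. The real figures will differ, but the shape holds: the cost scales with the number of unattended runs, and only the catch rate pulls it down.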

This runtime check is the cousin of a discipline I keep at build time, where I run evals against a skill before I trust it to ship: the verification loop for AI skills. Build-time evals decide whether you trust the tool before it runs. The runtime verifier decides whether you trust each output after it runs. You need both. I had been doing the first one and quietly skipping the second.

Moves for this week

If you are starting to wire your own loops, do not start with the work. Start with the checker. Three moves, in order.

First, list your loops and mark which ones verify. Go through every automated thing you run and ask one question of each: is its output checked by something independent of the thing that produced it, or does it only log? Be strict. A log is not a check. You will likely find, as I did, that most of your loops are generators wearing a logbook.

Second, name the verifier before you build the next loop. Borrow the controller's habit. Before you design the work, write down what the separate checker is and why it does not share the generator's blind spot. If you cannot name it, the loop is not ready. Put a human gate where the verifier should be until you can.

Third, protect the two human nodes. In any loop that touches money, a customer, or a public output, keep the sign-off human and separate, exactly as the close keeps the auditor and the CFO outside the preparation. Automate the mechanical middle freely. Do not automate the attestation.
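The first move, the inventory, is easy to mechanize once the list exists. A minimal sketch, assuming Python; `Loop`, `audit`, and the loop names below are hypothetical placeholders rather than a real inventory.

```python
from typing import Dict, List, NamedTuple, Optional


class Loop(NamedTuple):
    name: str
    checked_by: Optional[str]  # name of a separate checker, or None if it only logs


def audit(loops: List[Loop]) -> Dict[str, List[str]]:
    """Split loops into those with an independent checker and those that only log."""
    report: Dict[str, List[str]] = {"verified": [], "log_only": []}
    for loop in loops:
        # A loop that "checks" itself does not count as verified.
        independent = loop.checked_by is not None and loop.checked_by != loop.name
        report["verified" if independent else "log_only"].append(loop.name)
    return report


# Hypothetical inventory, for illustration only.
inventory = [
    Loop("morning-briefing", checked_by=None),
    Loop("knowledge-curator", checked_by="knowledge-curator"),
    Loop("auto-publisher", checked_by="site-health-check"),
]
report = audit(inventory)
```

The `log_only` list is the backlog: each entry needs either a named independent checker or a human gate before it is trusted to keep running unattended.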

The generator was never the hard part. The checker is the work. Finance has known that for a hundred years, and it took me fifty-odd loops to relearn it.

Part of the Operating Principles series from KG Consultancy.

About the Author
Teh Kim Guan
Product Consultant · General Manager, PEPS Ventures

Strategy and technology are the same decision. Over 15 years in fintech (CTOS, D&B), prop-tech (PropertyGuru DataSense), and digital startups, I have built frameworks that help founders and executives make both moves at once. Based in Kuala Lumpur.
