Deep Dive: Loop Writing Wants to Replace Prompting. AI Just Got a Manager.

Loop writing and loop engineering are the new AI-agent buzzwords. Here is what they mean, what reports are saying, and where the hype breaks.

There comes a moment in every AI hype cycle when the old magic word becomes embarrassing and everyone quietly moves one abstraction higher.

In 2023, the word was prompt. Prompt engineering was the hot new craft, the keyboard priesthood, the delicate art of asking a stochastic parrot to stop sounding like a vice president of enablement. People traded prompt templates like sourdough starters. They wrote "act as" with the solemnity of treaty law. They learned, painfully, that "be concise" and "provide a comprehensive overview" were two wolves fighting inside the same chat window.

Now the fashionable sentence is different: stop prompting, start writing loops.

The phrase has been ricocheting around AI circles in June 2026 because the people saying it are not random engagement farmers selling a Notion template called Agent Income Volcano. They are people sitting close to the products. A June 20 Business Insider report framed "loop engineering" as the new rage in AI development, citing Claude Code creator Boris Cherny, OpenAI engineer Peter Steinberger, ChatPRD founder Claire Vo, and Google Cloud veteran Addy Osmani. The core claim is simple enough to fit on a sticker and vague enough to finance three conference tracks: instead of manually telling an AI agent what to do next, you design a system that keeps telling the agent what to do until the work is done.

That sounds like prompting with a Roomba attachment. It is more interesting than that. It is also more dangerous than the slogan admits.

The useful version of loop writing is a shift from one-shot instruction to repeatable workflow. A loop can discover work, assign it, run checks, preserve state, spawn reviewers, and decide whether to continue. The silly version is just a prompt wearing a tiny operations vest. The expensive version is an autonomous agent pinging itself every five minutes to produce six branches, four summaries, two false positives, and one cloud bill that looks like it went to private school.

So yes, loop writing may be the new prompting. But that does not mean prompting is dead. It means the prompt moved into the plumbing, grew a scheduler, picked up a clipboard, and started calling itself an operating model.

The News Hook Is Real, Even If the Meme Is Sweating

The current discourse centers on a few quotes and posts that have become load-bearing. Business Insider reported that Cherny said he does not write his own prompts much anymore because he has loops running that prompt Claude and coordinate the work. The same story cited Steinberger telling users they should be designing loops that prompt their agents rather than prompting coding agents directly. Osmani, who published a useful public explainer on June 7, put the thesis even more cleanly: loop engineering means replacing yourself as the person who prompts the agent by designing the system that does it instead.

There are already secondary writeups turning this into a clean before-and-after story. Prompt engineering was yesterday. Loop engineering is today. Human fingers are out. Recursive agent swarms are in. Your job is no longer to talk to the machine. Your job is to build the thing that talks to the machine, then watch the machine talk to more machines, then pretend this has made your life simpler.

The reason the story travels is that it names something many power users already feel. Agentic tools like Claude Code and OpenAI Codex have become good enough that the bottleneck is less often "can the model write the next line?" and more often "can the system keep moving without the human babysitting every turn?" That is exactly where SiliconSnark's earlier love letter to OpenAI Codex as supervised parallel magic was pointing. Once an agent can read files, make edits, run commands, and summarize evidence, the interesting product is no longer a better text box. It is the coordination layer around the work.

This is also why the loop conversation feels adjacent to vibe coding but not identical to it. Vibe coding is about lowering the friction between intent and software. Loop writing is about making that intent persistent, repeatable, and testable enough that the agent can keep acting after the initial vibe has left the building to buy coffee.

The hype, naturally, is trying to flatten all of this into a job title. Prompt engineer is dead. Loop engineer has entered the chat. Please update your LinkedIn banner and choose one of three tasteful gradient headshots. But the real shift is not occupational branding. It is architectural. The human is moving from turn-by-turn instruction toward workflow design, verification design, context design, and stopping-condition design. That is not less work. It is different work with more leverage and more ways to fool yourself.

What a Loop Actually Is, Without the Fog Machine

A loop is a recurring system that prompts an AI agent, observes what happened, and decides what should happen next.

That is the broad version. In practice, useful loops tend to have six pieces.

First, there is a trigger. Something starts the loop: a timer, a command, a GitHub issue, a failed build, a user request, a Slack message, a daily automation, or a goal condition. This is the heartbeat. Without a trigger, you do not have a loop. You have a prompt you might remember to run again.

Second, there is context. The loop needs instructions, project rules, examples, data, tickets, files, docs, or some other durable representation of what good work looks like. This is where skills, repo instructions, style guides, memory files, and connectors matter. The loop cannot inherit your taste by osmosis, although the industry will absolutely try to sell osmosis as a premium tier.

Third, there is an actor. One agent does the work: reads files, edits code, writes copy, triages issues, drafts a report, checks a dashboard, files a ticket, or calls an API. This is the part everyone photographs because it looks alive.

Fourth, there is verification. The loop needs a check it can read. Tests pass or fail. A linter exits cleanly or complains. A screenshot matches the target or does not. A search finds sources or fails. A reviewer subagent finds a bug or says the diff matches the brief. This is the unglamorous line between automation and a haunted productivity machine.

Fifth, there is state. The loop has to remember what it already tried, what worked, what failed, what remains open, and what should be resumed tomorrow. State can be a markdown file, a database row, a Linear ticket, a GitHub comment, a queue, or a plain old checklist. The format matters less than the fact that it lives outside one model context window, because model context is a hotel room, not a filing cabinet.

Sixth, there is a stop condition. A good loop knows when to quit, escalate, pause, ask for review, or declare itself blocked. A bad loop keeps going because movement feels like progress. Many expensive organizational disasters have begun with that exact emotional mistake, just with fewer GPUs.

This is why "loop writing" is not merely a fancy prompt. A prompt says "do this." A loop says "keep checking for this category of work, use this context, act through these tools, verify against these signals, write down state here, stop under these conditions, and escalate when the situation leaves the rails." The prompt is one instruction. The loop is the small bureaucracy that makes the instruction repeatable.
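For the concretely minded, the six pieces fit in a few dozen lines of Python. This is a minimal sketch, not any vendor's API: `run_loop`, `LoopState`, the `actor` and `verify` callables, and the state file are all hypothetical names, and the actor is a stub standing in for a real agent call.

```python
import json
import tempfile
from dataclasses import dataclass, field, asdict
from pathlib import Path
from typing import Callable


@dataclass
class LoopState:
    """State that lives outside any single model context window."""
    attempts: int = 0
    history: list = field(default_factory=list)
    status: str = "open"  # "open", "done", or "escalated"


def run_loop(
    task: str,                               # trigger payload: what woke the loop
    context: str,                            # durable instructions and project rules
    actor: Callable[[str, str, list], str],  # does the work (stub for an agent call)
    verify: Callable[[str], bool],           # external check the loop can read
    state_path: Path,                        # where state is written down
    max_attempts: int = 3,                   # stop condition: a retry budget
) -> LoopState:
    # Resume from saved state if an earlier run left some behind.
    if state_path.exists():
        state = LoopState(**json.loads(state_path.read_text()))
    else:
        state = LoopState()

    while state.status == "open":
        if state.attempts >= max_attempts:
            # Budget spent: stop and hand the problem to a human.
            state.status = "escalated"
        else:
            state.attempts += 1
            output = actor(task, context, state.history)
            passed = verify(output)
            state.history.append({"output": output, "passed": passed})
            if passed:
                state.status = "done"
        # Persist after every turn so the loop can be resumed tomorrow.
        state_path.write_text(json.dumps(asdict(state)))
    return state


if __name__ == "__main__":
    def stub_actor(task, context, history):
        return f"draft {len(history) + 1}"

    def stub_verify(output):
        return output == "draft 2"

    with tempfile.TemporaryDirectory() as tmp:
        final = run_loop("fix failing test", "repo rules",
                         stub_actor, stub_verify, Path(tmp) / "state.json")
        print(final.status, final.attempts)  # prints: done 2
```

The design choice worth noticing is that the retry budget and the verifier sit in plain code outside the actor. The agent can be as flexible as it likes inside `actor`; the decision to stop or escalate is not left to its mood.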

I mean bureaucracy as a compliment, which is how you know AI has done something weird to my personality.

The Coding World Got There First Because Code Has Teeth

Loop writing is showing up first in coding because coding gives agents something rare: objective feedback that hurts.

A model can produce a marketing paragraph and nobody knows whether it is wrong until a customer quietly loses the will to live. Code is less polite. The test fails. The build breaks. The type checker produces a scroll of disappointment. The runtime error steps out from behind the curtain with a specific line number and a grudge. That makes coding a natural habitat for loops because the agent can act, observe, and retry against real signals.

Anthropic's own Claude Code documentation leans into this. Its best-practices guide says Claude Code is an agentic coding environment that can read files, run commands, make changes, and work through problems while the user watches, redirects, or steps away. More importantly, the guide emphasizes that you should give Claude a way to verify its work, such as tests, builds, screenshots, or other checks, because otherwise "looks done" becomes the only signal. That is the whole loop argument in miniature. The magic is not that the model writes code. The magic is that the environment can say no.

OpenAI is moving in the same direction from the product side. The Codex product page describes Automations as a way for Codex to work unprompted on routine but important tasks like issue triage, alert monitoring, and CI/CD. It also frames Codex as built for parallel agentic coding, with background work and review. The official developer docs for Codex worktrees explain how independent tasks can run in the same project without colliding, while the subagents docs describe specialized agents that can work in parallel and then have their results collected into one response.

The early usage data is pointing in the same direction. In a June 16 report based on a privacy-preserving analysis of roughly 400,000 Claude Code sessions, Anthropic found that people usually make the planning decisions while Claude makes more of the execution decisions, and that usage shifted over time toward more end-to-end agentic work. That is not proof that loops are magic. It is evidence that the work boundary is moving. Humans are increasingly specifying outcomes, constraints, and review signals while agents chew through the steps in between.

Put those pieces together and the shape becomes obvious. A coding loop can wake up every morning, inspect failed tests, open isolated worktrees, send one agent to investigate, send another to patch, send a third to review, run the suite, update the issue, and hand the human a diff with evidence. That is not a chat session. That is a tiny software factory with a suspiciously cheerful foreman.

SiliconSnark has been circling this shift for months. Our guide to AI coding agents moving into the repo argued that the serious story begins when agents touch files, tests, permissions, and deployment surfaces. Our guide to computer-use agents made the same point in a broader context: once software can operate interfaces, the questions become governance, state, review, and trust.

Loop writing is the next name for that governance problem. It is what happens when the agent stops being a clever assistant and starts becoming a recurring process.

Prompting Did Not Die. It Got Embedded.

The most annoying part of any "X is dead" story is that X is usually alive, employed, and mildly offended.

Prompting is not dead. Prompting is everywhere inside loop writing. The loop contains prompts for the actor agent, prompts for the reviewer agent, prompts for the triage step, prompts for the summarizer, prompts for the stop-condition evaluator, prompts for the handoff, prompts for the escalation note, and prompts for the prompt that decides whether the previous prompt was sufficiently prompt-like. We have not eliminated prompting. We have industrialized it.

This is the same pattern software follows whenever a manual technique becomes important enough to systematize. A one-off shell command becomes a script. A script becomes a CI job. A CI job becomes a platform workflow. A platform workflow becomes a compliance requirement. Eventually someone creates a dashboard and your small act of convenience has acquired an org chart.

Loop writing is prompt engineering after it discovers operations.

That has real benefits. Embedding prompts inside loops makes them reusable, reviewable, versionable, testable, and shareable. It lets teams write down their standards instead of re-teaching them by ritual correction. It turns "please be careful" into "run this check, compare against this fixture, and escalate if the score falls below this threshold." It lets the human spend less time typing the same instruction and more time designing the conditions under which the instruction should be trusted.
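Here is roughly what "run this check, compare against this fixture, and escalate if the score falls below this threshold" looks like once it stops being a mood. This is a sketch under assumptions: `check_against_fixture` is a hypothetical helper, the 0.9 threshold is arbitrary, and plain string similarity from Python's standard `difflib` is a stand-in for whatever scoring a real team would trust.

```python
from difflib import SequenceMatcher


def check_against_fixture(output: str, fixture: str, threshold: float = 0.9) -> dict:
    """Score an output against a known-good fixture and decide what to do next.

    The similarity ratio is only a placeholder metric; swap in tests,
    a rubric, or a domain-specific comparison for real work.
    """
    score = SequenceMatcher(None, output, fixture).ratio()
    action = "accept" if score >= threshold else "escalate"
    return {"score": round(score, 3), "action": action}


if __name__ == "__main__":
    print(check_against_fixture("abc", "abc"))  # prints: {'score': 1.0, 'action': 'accept'}
    print(check_against_fixture("abc", "xyz"))  # prints: {'score': 0.0, 'action': 'escalate'}
```

The point is less the metric than the shape: the standard is written down, versioned, and produces an action a human can audit.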

But it also hides the prompt where fewer people inspect it.

That matters. A bad prompt in a chat window produces one bad answer. A bad prompt inside a scheduled loop produces recurring bad behavior with a calendar invite. If your loop triages customer complaints with the wrong priority rule, it can quietly downgrade the same category of unhappy user every morning. If your loop reviews code with a shallow rubric, it can bless sloppy diffs at scale. If your loop summarizes market news without source discipline, it can launder rumor into institutional memory. Automation does not merely make good work faster. It makes bad judgment more repeatable.

This is where the phrase "loop writing" is accidentally useful. The writing is still there. It is just writing rules for a system instead of writing one request for one answer. That makes the craft more like product management, QA design, software architecture, and editorial standards work than like whispering the perfect incantation into a chatbot. Less wizard. More city planner. Fewer robes, more zoning disputes.

The Hype Is Selling Autonomy. The Value Is Supervision.

Every agent boom has the same temptation: sell autonomy as if humans were the bug.

The marketing version says loops free you from prompting. The operational version says loops reduce the amount of low-value prompting you have to do while increasing the importance of review, setup, constraints, and judgment. These are very different claims. One is a fantasy of disappearing labor. The other is a workflow improvement with paperwork. Naturally, the internet prefers the first one because paperwork does not convert.

The serious agent builders are more cautious. Anthropic's widely cited engineering essay on building effective agents recommends simple, composable patterns and warns that agentic systems trade latency and cost for better task performance. It also draws a useful distinction between workflows, where models and tools follow predefined code paths, and agents, where models dynamically direct their process and tool use. Loop writing often blends the two. The outer loop is a workflow. The inner actor may behave like an agent. The trick is knowing which part should be deterministic and which part should be flexible.

That distinction is where hype goes to be gently escorted out.

If a task is stable, repeatable, and well understood, you may not need an agentic loop. You may need a script, a query, a cron job, a form, a rules engine, or an intern with clear instructions and a merciful manager. Dropping an LLM loop onto every recurring task is how companies end up asking a frontier model whether the office printer is out of toner. SiliconSnark has already argued that corporate AI spending needs a do-not-use list, and loop writing makes that list more urgent, not less.

Where loops shine is in messy but bounded work: triage, review, code maintenance, research monitoring, support categorization, report drafting, migration assistance, compliance prechecks, and other tasks where context changes, tool use matters, and a pass-fail or reviewer signal can be built. The loop should do the boring recurrence. The human should design the boundary and inspect the exceptions.

In other words, the valuable thing is not "the agent runs forever." The valuable thing is "the agent runs until a meaningful condition is met, then produces evidence, remembers what happened, and stops before it starts improvising policy."

Autonomy without supervision is just confidence with a refresh interval.

Loops make AI feel more like infrastructure, and infrastructure has a beautiful old tradition called "surprise costs."

Osmani's explainer is refreshingly blunt about this. He notes that loop engineering is still early and that token costs can vary wildly. That caveat is not a footnote. It is the business model wearing a hat.

A direct prompt uses tokens once. A loop uses tokens repeatedly. A loop with subagents uses tokens in parallel. A loop with a reviewer uses tokens to check the work. A loop with memory uses tokens to read state. A loop with connectors may retrieve documents, inspect issues, summarize logs, call tools, run tests, retry failures, and produce reports. A loop that runs every five minutes is not "saving time" in the abstract. It is converting attention into compute spend at a cadence.
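The arithmetic is worth doing before the invoice does it for you. The sketch below is a back-of-envelope estimate only: `daily_tokens` and `daily_cost` are hypothetical helpers, and every number in the example (tokens per run, subagent count, price per million tokens) is an invented assumption, not a real rate from any provider.

```python
def daily_tokens(
    interval_minutes: int,
    tokens_per_run: int,
    subagents: int = 0,
    tokens_per_subagent: int = 0,
    reviewer_tokens: int = 0,
) -> int:
    """Estimate tokens per day for a loop that fires on a fixed interval."""
    runs_per_day = (24 * 60) // interval_minutes
    per_run = tokens_per_run + subagents * tokens_per_subagent + reviewer_tokens
    return runs_per_day * per_run


def daily_cost(tokens: int, price_per_million: float) -> float:
    """Convert a token count to a cost at an assumed flat price."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Assumed workload: every 5 minutes, 20k tokens for the main agent,
    # three subagents at 10k each, and a 5k-token reviewer pass.
    tokens = daily_tokens(5, 20_000, subagents=3,
                          tokens_per_subagent=10_000, reviewer_tokens=5_000)
    print(f"{tokens:,}")  # prints: 15,840,000
    # The 5.0 per million tokens below is an illustrative price, not a quote.
    print(f"{daily_cost(tokens, 5.0):.2f}")
```

A five-minute cadence means 288 runs a day. Whatever a single run costs, the loop multiplies it by 288 before anyone has decided whether the output was useful.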

That can be worth it. If the loop catches a production issue, saves an engineer an hour, prevents a customer escalation, or clears a queue that would otherwise rot, fine. Pay the machine. Send it a little thank-you note on the invoice. But if the loop is mostly generating summaries of summaries, reviewing its own reviews, or discovering the same non-problem on schedule, congratulations. You have built a treadmill for tokens.

This is one reason the most credible loop use cases will be attached to measurable friction. The AI-agent money story has already been moving in that direction. As SiliconSnark argued in our piece on whether AI agents actually make money, the real economics show up where agents reduce cost, unlock paid work, or participate in governed commercial workflows, not where they merely decorate dashboards with ambition. Loop writing follows the same rule. If you cannot name the value of the recurrence, you probably cannot justify the recurrence.

The other cost is review bandwidth. Worktrees let agents run in parallel. Subagents let them split tasks. Automations let them keep going while you are elsewhere. Lovely. Also, every useful output eventually lands somewhere a human may need to inspect. The ceiling moves from typing speed to judgment bandwidth. This is the kind of productivity gain that can make a capable person feel like a conductor and an unprepared person feel like they have been hired to manage a factory that manufactures homework.

The loop does not remove management. It creates more things to manage.

The Failure Modes Are Boring, Which Is Why They Matter

The scariest loop failures are not cinematic. They are administrative.

A loop can drift from its purpose. A loop can optimize for the check instead of the real outcome. A loop can keep retrying a bad strategy because the stop condition is vague. A loop can overfit to stale instructions. A loop can treat missing context as permission to guess. A loop can escalate too much and become noise. A loop can escalate too little and bury risk. A loop can generate so much intermediate work that the human loses the thread and accepts the final answer because reconstructing the path feels exhausting.

This is not new. It is the ancient software problem of automation bias wearing a model badge.

The difference is that language makes bad automation feel more reasonable. A failed script throws an error. A failed loop writes a polite explanation. It says it has reviewed the context. It says the implementation appears sound. It says the sources indicate. It says the issue is likely resolved. It says many comforting things because saying comforting things is one of the machine's core professional skills.

That is why verification has to be external whenever possible. Tests are better than vibes. Logs are better than assurances. Screenshots are better than "the UI has been improved." Source links are better than synthetic confidence. Independent review is better than letting the same model grade its own homework. This is also why SiliconSnark's AI slop detector applies just as much to loops as to one-off answers. Can the system show its evidence? Can it distinguish what it knows from what it inferred? Can it survive a second pass?

For writing, reporting, and research, the stakes are less compiler-shaped but just as real. A loop that monitors news can become a rumor amplifier if it does not prioritize primary sources and dates. A loop that drafts articles can turn one misread report into twelve polished paragraphs. A loop that summarizes competitors can flatten nuance into strategy mush. The model may not hallucinate dramatically. It may simply sand the truth into something more convenient.

And because loops feel advanced, users may scrutinize them less. That is backwards. The more autonomous the system, the more explicit the audit trail should be. "It ran automatically" is not an excuse. It is the reason you needed logs.
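An audit trail does not need to be elaborate to be useful. The following is a minimal sketch of an append-only log in JSON Lines format; `log_step`, `read_log`, the field names, and the `inferred` flag are all hypothetical choices made for illustration, not a standard.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_step(log_path: Path, step: str, evidence: dict, inferred: bool = False) -> dict:
    """Append one record describing what the loop did and what backs it up.

    `inferred` marks claims the loop guessed at rather than verified,
    so a reviewer can tell knowledge from inference.
    """
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "step": step,
        "evidence": evidence,
        "inferred": inferred,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def read_log(log_path: Path) -> list:
    """Load every record back, one JSON object per line."""
    lines = log_path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "audit.jsonl"
        log_step(path, "run_tests", {"exit_code": 0})
        log_step(path, "summarize", {"source": "test output"}, inferred=True)
        print(len(read_log(path)))  # prints: 2
```

Appending rather than overwriting is the design choice that matters here: the record of what the loop did yesterday should not be editable by what the loop does today.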

Why Managers Suddenly Understand the AI Story

One of the more interesting details in the Business Insider report is Claire Vo's framing: loops are like designing jobs. That is the part non-engineers should pay attention to.

Prompting was easy to mistake for clever wording. Loop writing is harder to separate from management. You define the role. You describe the expected output. You provide resources. You set success criteria. You decide what the worker can access. You establish escalation rules. You review performance. You refine the process. This is not mystical. It is delegation, except the delegate is a model that can read a repo, call tools, and occasionally behave like it has been raised entirely on Stack Overflow and espresso vapor.

That means loop writing may spread beyond code, but not because everyone wants to become an AI engineer. It will spread because every knowledge-work team has recurring semi-structured tasks that are annoying enough to automate but messy enough to resist normal automation. Sales teams review accounts. Support teams triage tickets. Analysts monitor filings. Editors check drafts. Product teams inspect feedback. Lawyers classify requests. Finance teams reconcile exceptions. Marketers watch competitors. Recruiters screen inbound. Everyone has a loop already. Most of them are human loops held together by calendar invites and quiet resentment.

The AI version can help when the task has a stable shape and a reviewable output. It can hurt when the organization confuses "the agent can do a first pass" with "the process no longer needs accountable owners." This is the management trap. A loop can make delegation cheaper. It cannot make responsibility evaporate.

That is also why the cultural meaning is bigger than coding. Prompt engineering made AI feel like an individual superpower. Loop writing makes AI feel like organizational design. The core question changes from "what do I type?" to "what system should exist around this work?" That is a more mature question. It is also a more uncomfortable one, because systems reveal incentives. If your loop keeps producing junk, maybe your prompt is bad. Or maybe your definition of good was never clear, your data is a swamp, your review process is ceremonial, and the agent has merely become the fastest intern to discover the mess.

The machine is not always the problem. Sometimes it is just the first thing naive enough to follow the process exactly.

Loop Writing Versus AI Search, Agents, and the Everything Interface

Loop writing also belongs inside the broader platform war over who becomes the interface to work.

AI search wants to answer before you click. Agents want to act before you manually navigate. Coding tools want to change the repo before you open the file. Browsers want to become assistants. Operating systems want to remember your context. Enterprise platforms want to govern tiny software workers before procurement realizes how many have already moved in.

As SiliconSnark argued in our guide to AI search answering before you click, the strategic prize is the interpretation layer. Whoever controls the layer where intent becomes action gets leverage over everything downstream. Loop writing is what happens when that layer stops waiting for individual requests and starts operating continuously.

This is why the tool companies care. Codex, Claude Code, Cursor, Gemini-powered developer workflows, GitHub Copilot, and the wider field of agent platforms are not merely competing on code generation quality. They are competing to become the place where work is decomposed, delegated, verified, remembered, and reviewed. The editor was the old battlefield. The workflow is the new one.

That helps explain why worktrees, subagents, skills, connectors, memory, permissions, and automations suddenly matter so much. They are not random feature confetti. They are the control surfaces for persistent AI work. A plain chatbot can answer. A loop-enabled agent environment can run a process. That process can become habit. Habit can become dependency. Dependency can become a platform business. Please look surprised when the pricing page appears.

The OpenClaw-adjacent discourse matters here too. The agent community has spent the past year building orchestration patterns, agent teams, review loops, and computer-use flows in public. Some of it is genuinely inventive. Some of it is screenshot theater. SiliconSnark's look at the OpenClaw clone wars noted the same tension: the category is full of useful experiments and equally full of tools that seem to exist so a terminal can look busy on social media.

Loop writing will inherit both sides. It will produce real productivity in teams that know their workflows. It will also produce a lot of performative automation, because nothing says "future of work" like a dashboard proving the machine is diligently doing something nobody needed.

So Is Loop Writing the New Prompting?

Yes, in the same way DevOps was the new shell scripting.

That is: yes, but only if you understand that the old thing did not vanish. It became one component inside a larger discipline.

Prompting taught people how to express intent to a model. Loop writing asks them to express intent to a system that may run repeatedly, use tools, spawn helpers, preserve memory, and act while the human is not staring at the screen. That is a real shift. It deserves a name. It also deserves fewer victory laps.

The best case is excellent. A well-designed loop turns recurring expert work into a supervised system. It notices things earlier. It handles dull first passes. It tests its own output. It separates maker from checker. It writes down state. It escalates the weird cases. It lets people spend more attention on judgment and less on ritual. In software, that can mean cleaner maintenance, faster bug fixes, better test discipline, and more parallel exploration. In other fields, it can mean better monitoring, faster drafts, more consistent review, and fewer tasks dying quietly in inbox sediment.

The worst case is also obvious. A badly designed loop automates ambiguity, hides bad prompts, burns tokens, creates review debt, and gives everyone the seductive feeling that work is happening because the system keeps moving. It can create more output than understanding. It can accelerate mistakes. It can make people worse at the work they are supposedly supervising because they stop reading closely. The loop does not know whether you are using it to extend expertise or avoid expertise. You do.

That is the real answer. Loop writing is not the end of prompting. It is prompting with memory, tools, cadence, verification, and consequences. It is less about clever phrases and more about designing a repeatable relationship between intent, action, evidence, and review.

So write loops, sure. Build the little systems. Let the agents do the tedious middle. Give them tests. Give them reviewers. Give them narrow permissions. Give them state. Give them stopping conditions that mean something. Then read what they did with the suspicious affection of someone supervising a very fast junior colleague who has memorized the docs but still needs an adult near the merge button.

The prompt is not dead. It has been promoted.

And like many promotions in tech, it now has meetings, responsibilities, and a much more complicated compensation structure.