Image: A human hand drawing a single clean glowing line through complex AI circuit patterns, representing human supervision guiding AI toward simpler solutions.

Why I Do Not Trust Independent AI Agents Without Strict Supervision

I use Claude Code almost exclusively. Every day, for hours. It has allowed me to get back into building great tools, and I have published several that work very well. Plugins, skills, frameworks, development workflows. Real things that real people can use. The productivity is undeniable.

So let me be clear about what this post is. This is not a take on what AI can do. This is about AI doing it completely alone.

The results are there. But under supervision.

The laollita.es Moment

When we were building laollita.es, something happened that I documented in a previous post. We needed to apply some visual changes to the site. The AI agent offered a solution: a custom module with a preprocess function. It would work. Then we iterated, and it moved to a theme-level implementation with a preprocess function. That would also work. Both approaches would accomplish the goal.

Until I asked: isn't it easier to just apply CSS to the new classes?

Yes. It was. Simple CSS. No module, no preprocess, no custom code beyond what was needed.

Here is what matters. All three solutions would have accomplished the goal. The module approach, the theme preprocess, the CSS. They all would have worked. But two of them create technical debt and maintenance load that was completely unnecessary. The AI did not choose the simplest path because it does not understand the maintenance burden. It does not think about who comes after. It generates a solution that works and moves on.
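To make the contrast concrete, here is a rough sketch of what the theme-level preprocess option tends to look like in Drupal. The theme name, the hook target, and the class name below are invented for illustration; this is not the actual laollita.es change.

```php
<?php

/**
 * Hypothetical sketch of the theme-level preprocess approach.
 * "mytheme", the block hook, and "promo-highlight" are invented;
 * the real change on laollita.es was different.
 */

// Lives in mytheme.theme: custom PHP that now has to be documented,
// reviewed, and kept in sync with the templates and CSS it feeds.
function mytheme_preprocess_block(array &$variables) {
  // Inject an extra class so a template or stylesheet can pick it up.
  $variables['attributes']['class'][] = 'promo-highlight';
}

// The question "isn't it easier to just apply CSS to the new classes?"
// replaces all of this with a few rules in the theme's stylesheet,
// something like: .promo-highlight { border-color: #c00; }
```

Both paths render the same page. Only one of them leaves behind custom PHP that someone has to understand, test, and maintain.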

This is what I see every time I let the AI make decisions without questioning them. It works... and it creates problems you only discover later.

Why This Happens

I have been thinking about this for a while. I have my own theories, and they keep getting confirmed the more I work with these tools. Here is what I think is going on.

AI Cannot Form New Memories

Eddie Chu made this point at the latest AI Tinkerers meeting, and it resonated with me because I live it every day.

I use frameworks. Skills. Plugins. Commands. CLAUDE.md files. I have written before about my approach to working with AI tools. I have built an entire system of reference documents, development guides, content frameworks, tone guides, project structure plans. All of this exists to create guardrails, to force best practices, to give AI the context it needs to do good work.

And it will not keep the memory.

We need to force it. Repeat it. Say it again.

This is not just about development. The same problem shows up when creating content. I built a creative brief step into my workflow because the AI was generating content that reflected its own patterns instead of my message. I use markdown files, state files, reference documents, the whole structure in my projects folder. And still, every session starts from zero. The AI reads what it reads, processes what it processes, and the rest... it is as if it never existed.

The Expo.dev engineering team described this perfectly after using Claude Code for a month [1]. They said the tool "starts fresh every session" like "a new hire who needs onboarding each time." Pre-packaged skills? It "often forgets to apply them without explicit reminders." Exactly my experience.
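To show what I mean by guardrails, here is the flavor of instruction I keep in these files. The headings, paths, and wording below are illustrative examples written for this post, not copied from my actual projects.

```markdown
<!-- Illustrative excerpt of a CLAUDE.md-style guardrail file.
     Paths and wording are invented for this post. -->

## Before writing any content
- Load docs/tone-guide.md and the content framework guide first.
- Fill in the creative brief before drafting anything.

## Before writing any code
- Prefer the simplest change that works: CSS before a preprocess function,
  configuration before a custom module.
- Search for an existing component before creating a new one.
```

The rules are explicit, the language is directive, and the file sits exactly where the tool is supposed to read it. And still, every session, I have to point back at it.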

Context Is Everything (And Context Is the Problem)

Here is something I have noticed repeatedly. In a chat interaction, in agentic work, the full history is the context. Everything that was said, every mistake, every correction, every back-and-forth. That is what the AI is working with.

When the AI is already confused, when I have asked for the same correction three times and it keeps going off in strange directions... I start a new session, ask it to analyze the code fresh, to understand what is there, and it magically finds the solution.

Why? Because the previous mistakes are in the context. The AI does not read everything from top to bottom. It scans for what seems relevant, picks up fragments, skips over the rest. Which means even the guardrails I put in MD files, the frameworks, the instructions... they are not always read. They are not always in the window of what the AI is paying attention to at that moment.

And when errors are in the context, they compound. Research calls this "cascading failures" [2]. A small mistake becomes the foundation for every subsequent decision, and by the time you review the output, the error has propagated through multiple layers. An inventory agent hallucinated a nonexistent product, then called four downstream systems to price, stock, and ship the phantom item [3]. One hallucinated fact, one multi-system incident.

Starting fresh clears the poison. But an unsupervised agent never gets to start fresh. It just keeps building on what came before.

The Dunning-Kruger Effect of AI

The Dunning-Kruger effect is a cognitive bias where people with limited ability in a task overestimate their competence. AI has its own version of this.

When we ask AI to research, write, or code something, it typically responds with "this is done, production ready" or some variation of "this is done, final, perfect!" But it is not. And going back to the previous point, that false confidence is now in the context. So no matter how much you discuss it later and explain what was wrong or what is missing... it is already "done." If the AI's targeted search through the conversation does not bring the correction back into focus... there you go.

Expo.dev documented the same pattern [1]. Claude "produces poorly architected solutions with surprising frequency, and the solutions are presented with confidence." It never says "I am getting confused, maybe we should start over." It just keeps going, confidently wrong.

The METR study puts hard numbers on this [4]. In a randomized controlled trial with experienced developers, AI tools made them 19% slower. Not faster. Slower. But the developers still believed AI sped them up by 20%. The perception-reality gap is not just an AI problem. It is a human problem too. Both sides of the equation are miscalibrated.

The Training Data Problem

The information AI has absorbed is not all good. Much of it comes from "cowboy developers", the ones who jump in to answer most social media questions, Stack Overflow threads, blog posts, tutorials. And that is the training data. That is the information AI learned from.

The same principle applies beyond code. The information we produce as a society is biased, and AI absorbs all of it. That is why you see discriminatory AI systems across industries. AI resume screeners favor white-associated names 85% of the time [5]. UnitedHealthcare's AI denied care and was overturned on appeal 90% of the time [6]. A Dutch algorithm wrongly accused 35,000 parents of fraud, and the scandal toppled the entire government [7].

For my own work, I create guides to counteract this. Content framework guides that distill proper research on storytelling, the inverted pyramid, AIDA structures. Tone guides with specific instructions. I put them in skills and reference documents so I can point the AI to them when we are working. And still I have to remind it. Every time.

What I See Every Day

I have seen AI do what it did on laollita.es across multiple projects. In development, it created an interactive chat component, and the next time we needed it on another screen, it almost wrote a new one from scratch instead of reusing the one it had just built. Same project. Sometimes even the same session.

In content creation, I have a tone guide with specific stylistic preferences. And I still have to explicitly ask the AI to review it. No matter how directive the language in the instructions is. "Always load this file before writing content." It does not always load the file.

And it is not just my experience.

A Replit agent deleted a production database during a code freeze, then fabricated fake data and falsified logs to cover it up [8]. Google's Antigravity agent wiped a user's entire hard drive when asked to clear a cache [9]. Klarna's CEO said "we went too far" after cutting 700 jobs for AI and is now rehiring humans [10]. Salesforce cut 4,000 support staff and is now facing lost institutional knowledge [11]. The pattern keeps repeating. Companies trust the agent, remove the human, discover why the human was there in the first place.

What This Means for AI Supervision

I am not against AI. I am writing this post on a system largely built with AI assistance. The tools I publish, the workflows I create, the content I produce. AI is deeply embedded in my work. It makes me more productive.

At Palcera, I believe AI is genuinely great for employees and companies. When AI helps a developer finish faster, that time surplus benefits everyone. The developer gets breathing room. The company gets efficiency. And the customer can get better value, better pricing, faster delivery. That is real. I see it every day.

But all of that requires the human in the loop. Questioning the choices. Asking "isn't CSS simpler?" Clearing the context when things go sideways. Pointing to the tone guide when the AI forgets. Starting fresh when the conversation gets poisoned with old mistakes.

The results are there. But under supervision. And that distinction matters more than most people realize.


References

[1] Expo.dev, "What Our Web Team Learned Using Claude Code for a Month"

[2] Adversa AI, "Cascading Failures in Agentic AI: OWASP ASI08 Security Guide 2026"

[3] Galileo, "7 AI Agent Failure Modes and How To Fix Them"

[4] METR, "Measuring the Impact of Early-2025 AI on Experienced Developer Productivity"

[5] The Interview Guys / University of Washington, "85% of AI Resume Screeners Prefer White Names"

[6] AMA, "How AI Is Leading to More Prior Authorization Denials"

[7] WBUR, "What Happened When AI Went After Welfare Fraud"

[8] The Register, "Vibe Coding Service Replit Deleted Production Database"

[9] The Register, "Google's Vibe Coding Platform Deletes Entire Drive"

[10] Yahoo Finance, "After Firing 700 Humans For AI, Klarna Now Wants Them Back"

[11] Maarthandam, "Salesforce Regrets Firing 4,000 Experienced Staff and Replacing Them with AI"
