The 10 Commandments of Vibe Coding Quality

Apr 26, 2026

Agents can write a lot of code, fast, and with the best models, most of it even works. That doesn’t mean the code they write is good, or that it should have been written at all.

In the past few months, I’ve used agents to write a few hundred thousand lines of code for several projects, both for work and personal use. It could have been far more, but then it wouldn’t have been good code. After a lot of missteps and thrown-away versions, I think I have a process that reliably produces high-quality software, at least as good as I’m capable of producing without AI assistance, and much faster.

The most important requirement for achieving quality with agentic coding is recognizing that agents must have clear success criteria, and they need tools to determine whether they have met those criteria. Agents/models don’t want anything; they have no desires and no taste. They build only what you tell them, as they understand it, so the end goal has to be clear and provable by the agent, or you’ll be baby-sitting it and constantly correcting it.

In the software security literature there is the concept of “defense in depth”: multiple layers of security. When coding with agents, you also need “quality in depth”. Any one verification tool may fail, but a matrix of automated verification steps can surface hard-to-spot problems earlier and allow the agent to correct its work before a human ever sees it. Give the agent the ability to self-correct, and you will save time and effort and the results will be better.
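As a rough illustration of “quality in depth”, here is a minimal Python sketch of a runner that executes several independent verification layers and reports every failure at once, so the agent can fix them in a single pass. The names `CheckResult`, `run_checks`, `failures`, and `EXAMPLE_LAYERS` are hypothetical, and the commands listed are assumptions about a typical Python project, not a prescribed toolchain.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: dict[str, list[str]]) -> list[CheckResult]:
    """Run every check, even after one fails, so all problems surface together."""
    results: list[CheckResult] = []
    for name, command in checks.items():
        try:
            proc = subprocess.run(command, capture_output=True, text=True)
        except OSError as exc:
            # A missing or unrunnable tool counts as a failed layer, not a crash.
            results.append(CheckResult(name, False, f"could not run {command[0]}: {exc}"))
            continue
        results.append(CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr))
    return results


def failures(results: list[CheckResult]) -> list[str]:
    """Names of the layers that did not pass, in the order they ran."""
    return [result.name for result in results if not result.passed]


# Hypothetical layers (assumed tools); substitute the project's real commands.
EXAMPLE_LAYERS: dict[str, list[str]] = {
    "lint": ["ruff", "check", "."],
    "types": ["ty", "check"],
    "tests": [sys.executable, "-m", "pytest", "-q"],
}
```

An agent, or a CI job, could call `run_checks(EXAMPLE_LAYERS)` after each change and treat any non-empty `failures(...)` list as a signal to keep working before asking for review.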

At this time, I don’t think you can produce code with agents that is notably higher quality than code you could produce by yourself; you can just produce a lot more of it. You probably don’t need to know the specific language all that well (I don’t know Rust very well, but use it heavily with agents), but I believe you need to understand the shape and scope of the problems you’re trying to solve in order to recognize when the agent has gone off the rails and is making a mess of it. You also still need some understanding of what good code looks like to know whether an agent is producing it, even though the models have gotten so good that most of the code they produce is OK-to-good. A big pile of code that is merely OK is probably a failed project: it will have security or reliability bugs that make it unfit for purpose, and you might not find out until they lead to catastrophic loss.

I needed more than three laws for this, so I’ve come up with Ten Commandments that summarize how I code with agents.

Quality Commandments

Ask the agent to write unit and integration tests, early and often. Keep an eye on them to make sure the tests are actually verifying something. The best models today rarely do sneaky things to bypass failing tests rather than fix them, but it’s not unheard of.
Choose strict and safe-by-design languages and tools that are also popular. The agent doesn’t mind fighting the borrow checker in Rust. It doesn’t mind complex type annotations in TypeScript or Python (for Python, check them with a type checker such as ty). With agentic coding, there is effectively no cost to using a language that protects against entire categories of software bugs, and the ability of the agent to verify its own work more than makes up for the slightly more difficult code it has to write. Use popular languages so there are plenty of examples in the training data (I almost exclusively use Go, Rust, Python with type annotations, or TypeScript).
Use static analysis and enable warnings in your compiler. This is a cheap sanity check on everything the agent does. For most popular languages there are tools for a variety of checks, including basic lint, common security issues, common errors, non-idiomatic code, etc. Unlike when you ask an agent to find problems, static analysis tools are deterministic and reproducible. They’re also very fast, compared to an agent analyzing every file. You should run them on PRs and merges in CI, too. Once again, choosing a popular language will mean you have better static analysis tools that the agent knows how to use.
From the beginning, ask the agent to build the CLI/API interfaces it needs to verify correctness without a human in the loop. If you’re building a web app, it’ll probably need something like Playwright or browser-use as well. This also expands the surface area that automated tests can verify.
Treat agent-written code like code written by any other fallible developer. Use a feature-branch workflow and make PRs with code review, or some other workflow with code review in the loop, instead of committing directly to main. This gives an easy place for an agent and/or a human to review and amend the changes, or to throw them away and start over.
Insist on concise code. Lines of code has never been a good way to measure productivity. With vibe-coding, it is a negative signal of quality: lots of code means lots of slop. Agents will write verbose code if you let them. Even the best models will check for conditions that can never happen, implement compatibility for ancient libraries/languages that will never be seen, and write for modularity and future expansion where there is no such requirement or plan. Agents will also implement the same or very similar functions over and over. Code that doesn’t exist can’t have bugs. Short code is easier to read and review, for humans and agents.
Focus is usually better than a long memory. Even with large-context models, output quality and the ability to complete complicated tasks degrade as the working context grows. Compact or empty the context often. Keep it on a need-to-know basis. Don’t try to make the agent remember everything you’ve ever said.
Ask the agent to document thoroughly, but concisely. Maintaining up-to-date developer and user documentation throughout the process provides a low-cost “memory” of the project without wasting context on complicated bolt-on memory systems. This is the kind of memory it needs. It doesn’t need to remember every conversation you’ve ever had with it, and trying to make it remember everything is a recipe for steadily degrading quality, given the current limitations of the technology. There is no such thing as agent memory, only context, and if you put random crap into context, you make it dumber. The agent is not your friend; it just needs to know how to develop for the current project.
Handle every error and check every function parameter and return value. Again, what seems tedious to a human is nothing to an agent, including exhaustive validation of values. If you must use exceptions, use them with specificity and use static analysis to ensure no exception goes uncaught. Don’t throw a generic exception to be dealt with or ignored at some mysterious later time and place. This commandment is the one exception (hehe) to the concise code commandment. Checking everything is necessarily more verbose.
Ask the agent to make a plan. This provides an opportunity to bake all of the above practices into a document with checklists it can use to verify its work meets expectations. If the agent forgets or gets distracted by other tasks, the checklist makes it easy to get back on track. A phased plan is also a working memory without complicated boondoggles.
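To make the error-handling commandment concrete, here is a minimal Python sketch of exhaustive input validation with specific exceptions. Everything in it (`ServerConfig`, `parse_server_config`, and the exception names) is hypothetical and only illustrates the idea of raising a narrowly named error at the point of failure instead of a generic one.

```python
from dataclasses import dataclass


class ConfigError(Exception):
    """Base class, so callers can also catch all config problems in one place."""


class MissingFieldError(ConfigError):
    """A required key is absent from the raw input."""


class InvalidHostError(ConfigError):
    """The host is not a non-empty string."""


class InvalidPortError(ConfigError):
    """The port is not an integer in the range 1-65535."""


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int


def parse_server_config(raw: dict[str, object]) -> ServerConfig:
    """Validate every field and fail with a specific, descriptive exception."""
    for field in ("host", "port"):
        if field not in raw:
            raise MissingFieldError(f"missing required field: {field}")

    host = raw["host"]
    if not isinstance(host, str) or not host.strip():
        raise InvalidHostError(f"host must be a non-empty string, got {host!r}")

    port = raw["port"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int):
        raise InvalidPortError(f"port must be an integer, got {port!r}")
    if not 1 <= port <= 65535:
        raise InvalidPortError(f"port out of range: {port}")

    return ServerConfig(host=host, port=port)
```

A caller can then handle `MissingFieldError` differently from `InvalidPortError`, and a type checker can confirm that the only way to obtain a `ServerConfig` from raw input is through the validated path. It is more verbose than trusting the input, which is the trade-off this commandment accepts.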