Vibe coding is a phrase coined by Andrej Karpathy to describe an AI-assisted software development approach that focuses on the “vibe” of what you want rather than the technical details. With vibe coding, developers describe what they want in natural language and let the AI generate the code, rather than writing it themselves. I recently experimented with a popular vibe coding tool, Warp, and thought I’d share my findings.

Although I have been coding for over two decades now, I still stay open to trying new tools for my workflow. I’m a heavy user of AI (primarily ChatGPT), but mostly for quick snippets and boilerplate rather than full features. Warp describes itself as an agentic development environment at the terminal layer and offers a whole lot more than just writing code. It can scaffold projects, create environments, install dependencies, run commands, react to system events, and orchestrate multiple agents in parallel. It also supports a mixed-model approach, meaning you can use different models for different jobs. For example, you might use ChatGPT for scaffolding or planning, but then use Claude for coding. I looked at other popular vibe coding tools such as Cursor, Replit Agents, Windsurf/Codeium, and Copilot, but chose Warp over the competition for its terminal integration, multi-agent execution, and project-level context.

Initially, I was extremely impressed with Warp, and it very much felt like the future was here. I asked it to create a web app that would easily take me a day or more to build manually. I watched it plan the app, structure it, set up the environment, and write the code, and within a matter of minutes I had a fully functioning app. I immediately understood the hype around ‘vibe coding’ - the average person can now get a basic web app running at the click of a button. Warp really is an impressive tool for building out simple web apps quickly and iterating on them to meet your needs.

However, the more I used Warp and dug into the code, the more I realised that these creations were not so neat under the bonnet. A simple product can end up with hundreds of modules, full of duplicate logic across the codebase. The bottleneck shifts to reviewing, testing, and securing code. If you’re using tools like Warp, expect to spend much more time reviewing code, and remember that reading and fixing AI-written code can often be harder than writing your own. Another downside is that you start to notice how similar everything looks and feels when it is created by an LLM.

There are, however, ways to improve the output you get: for example, specifying all requirements and constraints clearly and describing the interface and acceptance criteria. You can then have the model propose a plan and approve each change iteratively. Also, make sure that the model documents everything it is doing and that it has local project context (files, patterns, constraints) to keep it grounded. This helps when you need to go back and update parts of your project; without this context, each update will likely introduce different styles or UI inconsistencies. For Warp, any context information can be kept in the WARP.md file that sits in the project root.
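
To make that concrete, here is a minimal sketch of what a WARP.md might contain. The stack, file names, and conventions below are purely illustrative assumptions, not a prescribed format - the point is simply to give the agent stable, project-specific ground rules to work from.

```markdown
# WARP.md - project context for the agent

## Stack (illustrative example)
- Python 3.12 backend with FastAPI, vanilla JS frontend
- SQLite in development, Postgres in production

## Conventions
- Keep modules small and reuse shared helpers in utils/ rather than duplicating logic
- Every new endpoint needs a matching test under tests/
- Follow the existing error-handling pattern in app/errors.py

## Constraints
- Do not add new dependencies without asking first
- Document each change in CHANGELOG.md as you go
```

Keeping rules like these in the repository, rather than repeating them in every prompt, is what helps later updates stay consistent with the code that is already there.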

There is no doubt these tools have immense value and they are brilliant at certain tasks - in particular, creating MVPs to bring ideas to life quickly, or one-off products and demos that you won’t need to maintain in future. There is also no harm in letting Warp dive into an existing project to look for issues or improvements, especially if it’s an older or large project whose code you want to sanity check. The key is to use these tools in small, verifiable increments with strong supervision, especially if you are working outside of well-known frameworks and languages. You have to assume generated code may have holes, so don’t expose unreviewed surfaces - always review the code!

Zooming out to the bigger picture, I think the vibe coding trend will lead to a dramatic rise in the amount of code in the world, but when code is cheap to write, technical debt grows fast. While the number of engineers per line of code may fall, the review and maintenance load will undoubtedly increase. I think human input and oversight are still necessary for anything beyond an MVP: LLMs still struggle with architecture, data flow, non-standard edge cases, and design choices that demand ‘taste’. Despite being very impressed with Warp, I feel the future will move towards computer-aided software engineering rather than leaving AI to do 100% of the work.