Supervised Vibe Coding
I took a while to warm up to vibe coding, and have now realised that the trick is in effective "supervision". In a sense, it is similar to managing humans, without the emotions of course!
Vibe coding
Two weeks ago I switched from Cursor to Claude Code, and suddenly I find vibe coding far more enjoyable.
This is the second time I’m trying Claude Code. The first time was in early August, when I tried it within the Cursor IDE itself. Somehow it seemed clunky and less useful than Cursor, so I gave up and cancelled my paid Claude subscription.
This time round, I decided to try Claude Code again because I had decided to switch to the Positron IDE (a VSCode fork by Posit, the RStudio guys). It lets you see graphs in a side pane as you produce them, and it keeps the “console” separate from the “terminal”; both of these are extremely important to me. And for whatever reason, I started getting better results this time. As things stand, I have a Claude Pro subscription again, and have cancelled my Cursor subscription.
In the last three weeks, I’ve also switched to coding a lot, a big change from the previous 2-3 months, when I spent most of my time selling. We have a few POCs going on now, and I realise that the biggest impact I can have is in giving these POC customers a good product, which should lead to sales (both longer-term contracts with them, and their referrals and quotes to use in selling to others).
Centaurs
Back in 2011, I remember a couple of ex-colleagues gifting me Garry Kasparov’s book “How Life Imitates Chess”. In it, he talks about “advanced chess”, where humans, supported by chess-playing computers, played against each other. He found that the winners in that tournament were neither the best humans nor the best machines, but the humans who were able to make the best use of machines. He called them centaurs.
I’ve been fascinated by this concept ever since. I remember using it to talk about how more traditional statistical methods of data analysis, or data analysis with a strong human touch, were superior to simple stir-the-pile-of-data machine learning.
Over time, when it comes to data analysis, I’ve possibly changed my mind. There are some situations where I’m (now) convinced that stirring the pile gives you the best outcomes.
And now with generative AI, we have a new machine to partner with and create new “centaurs”.
Back to vibe coding
At the AI company that I run, we mandate that everyone use AI as much as possible in their everyday work. That said, I myself hadn’t done much vibe coding until the end of August - in fact, on most days, I would end up swearing at Cursor because it just wouldn’t “get me”. That led me to write about whether AI leads to conformity, since conformists (whose ideas are closer to “average”, and hence to “AI”) can gel better with AI.
AI and Conformity
I’ve been using Cursor for coding at work for the last three months or so. I find it useful, but also insanely frustrating. Not a day goes by without my swearing at it. I find that it just doesn’t understand me.
Also, I’ve not been a fan of code written by LLMs. In one case, mishandling of vectors by an LLM (it used map2 instead of pmap when operating on more than three columns in the tibble) led us to lose a few thousand dollars (in AWS Bedrock credits).
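To be clear about why that matters: map2() iterates over exactly two inputs, so once a row-wise computation depends on more columns than that, pmap() is the right tool. Here is a toy reconstruction of the pattern - the column names and numbers are made up, not the actual pipeline:

```r
library(purrr)
library(tibble)

# Hypothetical per-call billing table
calls <- tibble(
  prompt_tokens     = c(1200, 800, 950),
  completion_tokens = c(300, 150, 220),
  price_per_token   = c(0.00003, 0.00003, 0.00006)
)

# Wrong: map2() only takes two inputs, so the third column gets
# hard-coded (or silently ignored) inside the function
cost_wrong <- map2_dbl(
  calls$prompt_tokens, calls$completion_tokens,
  \(p, k) (p + k) * 0.00003  # price baked in - wrong for the third row
)

# Right: pmap() iterates over all the columns it is given, matched by name
cost_right <- pmap_dbl(
  calls,
  \(prompt_tokens, completion_tokens, price_per_token) {
    (prompt_tokens + completion_tokens) * price_per_token
  }
)
```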
That apart, I find LLM-written code way too verbose and “defensive”. Claims like “X% of code in our company is written by AI” are believable solely because of how verbose AI-written code is.
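To make “defensive” concrete, here is an illustrative (made-up, not from our codebase) example of the style I mean - checks for conditions that cannot realistically occur, and a named intermediate for every single step - next to what a fluent tidyverse user would write:

```r
library(dplyr)

# The verbose, defensive style
summarise_sales_verbose <- function(sales_df) {
  if (!is.data.frame(sales_df)) {
    stop("sales_df must be a data frame")
  }
  if (!all(c("region", "amount") %in% names(sales_df))) {
    stop("sales_df must contain 'region' and 'amount' columns")
  }
  grouped_df <- group_by(sales_df, region)
  summarised_df <- summarise(grouped_df, total_amount = sum(amount, na.rm = TRUE))
  sorted_df <- arrange(summarised_df, desc(total_amount))
  return(sorted_df)
}

# The concise version, which does exactly the same thing
summarise_sales <- function(sales_df) {
  sales_df |>
    group_by(region) |>
    summarise(total_amount = sum(amount, na.rm = TRUE)) |>
    arrange(desc(total_amount))
}
```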
And then, I find that AI-written code is largely unreadable (and thus uneditable) by humans. Unless you are willing to put in considerable effort, vibe coding is a one-way street - once you use LLMs to write a file, only other LLMs can work on that file. It’s like what my grandfather used to say: “only Brahma can undo Brahma’s knots” (ಬ್ರಹ್ಮಗಂಟನ್ನು ಬ್ರಹ್ಮನೇ ಬಿಡಿಸೊಕ್ಕಾಗೋದು).
The other reason I’ve not been a big fan of vibe coding is that it is fundamentally imprecise. I largely code in R, which I’ve been using since 2008. That means I know the syntax rather well, and it takes me far less time and effort to express something in code than it would take me to express it in English.
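A made-up illustration of what I mean: the pipeline below is quick to type if you know dplyr, but describing it unambiguously in English (which columns, how ties are broken, what the sum is computed over) takes considerably more words than the code does.

```r
library(dplyr)
library(tibble)

# Hypothetical transactions table
transactions <- tibble(
  customer = c("A", "A", "A", "A", "B", "B"),
  txn_date = as.Date("2025-09-01") + c(0, 3, 7, 10, 1, 5),
  amount   = c(100, 250, 50, 400, 80, 120)
)

# "For each customer, keep the three most recent transactions and work out
# each one's share of that customer's total across those three." Even that
# sentence is silent about ties and about the scope of the sum; the code is not.
recent_share <- transactions |>
  group_by(customer) |>
  slice_max(txn_date, n = 3, with_ties = FALSE) |>
  mutate(share = amount / sum(amount)) |>
  ungroup()
```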
How I learned to stop worrying and love to vibe code
I don’t know if it’s just in my head, but I find Claude Code more pliable than Cursor. I find it writes better code, and the way it is trained seems more aligned with how I think (even though Cursor also largely uses Claude under the hood).
Once I realised I had to code a lot, I started using Claude Code for little tasks, such as:
“explain this piece of code and tell me what it does”
“this code is way too verbose. Make it simpler using modern tidyverse principles”
“do we really need this try catch?”
etc.
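The second and third of those prompts usually target the same defensive style I complained about earlier. A made-up example of the “try catch” case - a wrapper around a computation that can’t realistically fail, which only converts a loud error into a silent NULL:

```r
# The wrapped version an LLM tends to produce
mean_spend_wrapped <- function(amounts) {
  tryCatch({
    mean(amounts, na.rm = TRUE)
  }, error = function(e) {
    message("Failed to compute mean: ", e$message)
    NULL
  })
}

# The honest version: if the input is wrong, I want it to fail loudly
mean_spend <- function(amounts) {
  mean(amounts, na.rm = TRUE)
}
```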
It grew from there, and then I started using it for automating boring tasks: “take these lines of code and rewrite them to manipulate that kind of data frame”.
The key, I realised, is to “supervise” the LLM. Basically, you ask it to generate code, and then you verify that the code works. This is like “management”, in some ways. As I had written in 2022 after taking over a new team in my then job:
But then – if you think about it, at some level, management is basically about “verification”. To see whether you have done your work properly, I don’t need to precisely know how you have done it. All I need to know is whether you have done bullshit – which means, I don’t need to “replicate your algorithm”. I only need to “verify your algorithm”, which computer science tells us can be an order of magnitude simpler than actually building the algorithm.
I had written this from the point of view of managing humans. It applies when you have to manage LLMs as well. If you know nothing about the domain you are asking the LLM to work in (like git, for example - which, despite heavy usage in the last year, I still fail to fully get), you don’t know what it is doing, and sometimes it can do rubbish (just yesterday I lost code because it stashed some files and they then disappeared - they are not in my stash).
On the other hand, if you are asking the LLM to do something which you have some idea about, you can very quickly verify its output and make it much better. You know when it is making some big mistakes. You know when the output can be better.
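The verification itself is usually mundane. Here is a sketch of the kind of cheap check I have in mind - the function and column names are hypothetical - for when an LLM rewrites a data transformation: a handful of assertions on the output, rather than re-reading every line it wrote.

```r
# Cheap sanity checks on an LLM-written transformation (hypothetical names):
# verifying the output is far less work than writing the transformation
check_transform <- function(input, output, key = "customer_id") {
  stopifnot(
    # nothing silently dropped or duplicated
    nrow(output) == nrow(input),
    # the key column is still unique
    !anyDuplicated(output[[key]]),
    # aggregates survived the rewrite
    isTRUE(all.equal(sum(input$amount), sum(output$amount)))
  )
  invisible(TRUE)
}
```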
Stud Anand, who calls himself an “LLM psychologist”, gives talks on vibe coding for data analysis (I sat through one of these a few months ago). If you watch him work with LLMs, everything seems so simple, and you start wondering if AGI is close by.
But if you look closely, what you see is that it is the combination of Anand and the LLM that is producing the results. He is good at both data science and understanding LLMs, so he knows exactly how to prompt. More importantly, he knows exactly how to verify the output of the LLMs and re-prompt them for better output, and this little “handholding” leads to significantly better results.
And that is exactly what I’ve been finding with my own work - because I’m getting LLMs to write code in my own domain, I’m able to very quickly verify the outputs and ask for tweaks. I’ve been learning how to ask for these tweaks effectively. And now all the boring bits of coding are outsourced to the LLMs. So, together with the LLMs, I’m finding myself to be a much more effective coder.
If you look at my code now, you will see that it is well written and concise, even though it has been largely written by LLMs. That is all down to my “supervision”!