Vendors focussing on the wrong things
You hire someone to do one thing, they spend all their time doing something else, and then expect that to make up for not doing the main thing properly
Someone on a cricket WhatsApp group I’m on sent me this yesterday, about how Indian medium pacer Akash Deep rates his batting performance in the recent Test at The Oval ahead of his 10-wicket haul earlier in the series at Edgbaston.
The story touched a raw nerve for me, and not for any cricketing reason. For it brought to mind a kind of experience I’ve had consistently with people I’ve hired (in various capacities) over the years. It is quite possible, even likely, that I’ve been guilty of this from the other direction as well. It is about “focussing on the wrong thing”.
The wrong things
Rather than writing stuff that might expose people I have hired (and thus land myself in trouble), let me put myself on the other side for the illustrative examples in this post.
I remember this one time (now I might be hallucinating) when a company hired me to build a fraud model. I remember doing everything but that in that period. The data was exciting. The stories behind it were equally exciting. I spent a lot of my time on that “assignment” on data exploration, maybe building lots of beautiful plots (I’m good at that), rather than actually building the fraud model.
With the benefit of hindsight, the model I turned in was substandard, and it was no surprise that my contract wasn’t renewed. But I was annoyed (then) that the company had overlooked all the great work I had done in analyzing the data and all the beautiful graphs I had made.
This kind of thing is more common than you might expect. In fact, think about all the bad hires you have made, in various capacities, and you’ll see that (apart from the occasional obviously and completely useless ones, where the blame falls on you for hiring them in the first place) they are likely to have spent their time doing something other than the main thing you hired them for.
Why does this happen?
The most obvious answer is misaligned expectations. Maybe you weren’t clear, or they weren’t clear, at the time of signing on. You hired me to be a lifeguard and I interpreted that job description as “sit at the beach and stare at the sea all day”. I saw something different in the job from what you did, and we clearly didn’t communicate, which means I did my own thing rather than what you expected.
In other cases, you are simply out of your depth on what you were actually hired for (OK, this is also a hiring problem), and you assume that what you are doing “on the side” will more than make up for your incompetence at your main job. When confronted, you can always claim that you were useful, though not in the intended manner, and that all the time you spent on this “secondary thing” didn’t leave you enough time for the main thing you were hired for (even though nobody really asked you to do this secondary thing).
The third is that this secondary thing you’ve been performing well at gives you the satisfaction of “having added value”, which means you feel less bad about slacking on your main job. So you think it is (more) okay to slack, and you slack.
Okay, my explanations are overlapping with each other so I’ll stop adding to this list. You get the drift.
Deception?
My wife is into AI alignment, and she keeps talking about this thing called “deception” that seems to be popular in those circles. Basically, it is an anthropomorphizing of AI, of cases where a model “performs well in training, but poorly in inference”, where it is alleged that the AI is actually trying to deceive you into believing that it is “good” while it is actually “evil”.
I don’t read much science fiction, and given that I’ve done a fair amount of work in data science, I’ve seen more than my share of models that perform rather well in-sample but poorly out of sample. It’s basically “overfitting”, not the model (a “pile of linear algebra”) trying to cheat you.
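To make that concrete, here is a minimal sketch (in Python, with made-up data and numbers) of a model that “performs well in training but poorly in inference” without any intent to deceive: it simply memorises its training points, so the in-sample error looks wonderful while the out-of-sample error is much worse.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical data: a simple linear signal plus noise.
    x = rng.uniform(0, 1, size=n)
    y = 2 * x + rng.normal(scale=0.5, size=n)
    return x, y

x_train, y_train = make_data(30)
x_test, y_test = make_data(1000)

def predict(x):
    # The "model": for each point, copy the label of the nearest training point.
    # This is pure memorisation of the training data.
    nearest = np.abs(x[:, None] - x_train[None, :]).argmin(axis=1)
    return y_train[nearest]

print("in-sample MSE:     ", np.mean((predict(x_train) - y_train) ** 2))  # exactly 0
print("out-of-sample MSE: ", np.mean((predict(x_test) - y_test) ** 2))    # clearly worse
```

No malice, no deception. Just a model that was allowed to fit the noise.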
In any case, my wife wrote this rather nice post recently (well, I was on the receiving end of it) where she pretty much compared me to a “deceptive LLM”. Remember that she spent over a decade working on human (romantic) relationships before getting into (essentially) human-AI relationships. She wrote:
Deception stems from human inability to evaluate accurately, and as a result we often reward the appearance of alignment, and in doing so, we train systems to fake being good instead of becoming good.
And once that landed, it hit me like a thunderclap. The same pattern I was hearing in these late-stage divorces. This slow erosion and the collapse of a commitment relationship after years of false signals was the same damn problem.
[…]
So, these women praised the husband for cooking dinner once, thinking it meant he was finally becoming emotionally available. They said thank you for him taking the kids out for an hour, hoping it meant he understood her exhaustion. They interpreted help as understanding. They mistook compliance for compatibility.
They clung to incremental acts of decency, hoping it signalled deep realignment. They mistook help for healing, like confusing a band-aid for surgery. And every time they forgave, adjusted or accommodated, they unknowingly incentivised deception or performative alignment. This was not done out of malice, but out of social programming.
At some level of abstraction, this is eerily similar to what I’m talking about in this post - like the “husbands” in my wife’s blogpost, people you hire sometimes do “small acts” (like the secondary thing they have not been hired to do) and hope that these can cover for the main act they have not been delivering on.
And in most cases, like I’ve written above, they are not looking to deceive you. It is just that they are optimizing for different things than you are, and so you have the classic principal-agent problem. (What is the AI optimizing for? Whatever objective function its gradient descent was set up with. The fact that the AI has far more parameters than the human mind can comprehend means that this objective function can’t be specified properly, and that results in misalignment and seeming deception.)
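The principal-agent point can be put in (toy) code. In this sketch, the action names and all the numbers are made up: the agent faithfully optimises the reward it is actually given (a proxy that favours visible output), and ends up doing the thing that is worth the least to the principal. No deception required.

```python
# All action names and numbers below are made up for illustration.

# What the principal actually cares about (never written down precisely):
true_value = {
    "build fraud model": 10.0,
    "explore data": 3.0,
    "make pretty plots": 2.0,
}

# What the agent is actually evaluated and rewarded on: visible output.
proxy_reward = {
    "build fraud model": 2.0,   # hard to judge until much later
    "explore data": 6.0,        # produces lots of visible artefacts
    "make pretty plots": 9.0,   # immediately impressive
}

# The agent does exactly what it is rewarded for: it maximises the proxy.
chosen = max(proxy_reward, key=proxy_reward.get)

print("agent chooses:      ", chosen)
print("proxy reward earned:", proxy_reward[chosen])
print("value to principal: ", true_value[chosen])
```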
How do we solve for this?
This has happened to me so many times, on both sides (and also as a husband, and as a user of AI), that I don’t really know! Maybe better hiring, or better setting of expectations, can help? Then again, like the number of parameters used to define modern AI models, the number of degrees of freedom in any agreement between humans is so large that it is not possible to specify it accurately. Legalese / code is of course precise, but also restrictive in terms of the number of possible actions one can take.
The other, I guess, is through reinforcement. Maybe over a period of time I will learn that someone doing something they were not hired for is not simply being nice - they are “trying to deceive” (like the AI alignment folks might say), and so I will react appropriately to “bring them back on track”.
Have you seen this kind of behaviour from people you have hired (or married, or “trained”)? How have you looked to solve this, and what have been your insights? Do leave a comment and let me know!
The issue is with evaluation/feedback. You could try improving it, but it's likely never going to be 100% foolproof. So, while you continue improving evaluation, it might be useful to start building out repair mechanisms. Hopefully, that will help avoid the expense of completely dispensing with the model or the hire.