>Human programmers don't usually hallucinate things out of thin air
Oh, you wouldn't believe how much they do that too, or are unreliable in similar ways. Bullshitting, thinking they tested X when they didn't, misremembering things, confidently declaring that X is the bottleneck and spending weeks refactoring without measuring (only for it to turn out it wasn't), the list goes on.
>So no, they aren't working the exact same way.
However they work internally, most of the time, current agents (from, say, the last year onward) "describe the issue exactly in the way a human programmer would".
LLM hallucination is not an edge case. It is how they generate output 100% of the time. Mainstream media only calls it "hallucination" when the output is wrong, but from the point of view of the LLM, it is working exactly as it is supposed to.
>LLM hallucination is not an edge case. It is how they generate output 100% of the time
If enough of the time it matches reality (which it does), it doesn't matter. Especially in a coding setup, where you can verify the results, have tests you wrote yourself, and the end goal is well defined.
And conversely, if a human is a bullshitter, ignorant, a liar, or stupid, it doesn't matter that they end up with useless stuff "in a different way" than an LLM hallucinating. The end result, the low utility of their output, is the same.
Besides, one theory of cognition (predating LLMs, even) casts the human brain as a prediction machine. In which case, it's not that different from an LLM in principle, even if the scope and design are better.
Does it have to be a specific number? Whatever you feel makes it worth using over not.
If I write code for medical devices I might not tolerate even one AI-induced issue. If I write glorified web apps, I could tolerate dozens of them as long as it still helps get stuff done faster when it works.
Your car fails occasionally and needs service. If most of the time it gets you there, enough that you find it worth it over NOT having a car or buying a new one, then it's useful.
And unlike the car, you can do whatever review/verification/testing of the resulting AI code you like before you deploy it. And the code failing won't kill you or others (and if you write trading software, medical device firmware, or airplane code, you can always not use it).
You don't even need to let it rip on your system: you can require user confirmation for any action, or have it run in a sandbox.
> Whatever you feel makes it worth using over not.
"feel" is a bad way to do statistics. The whole problem with LLMs hallucinating the output 100% of time is that it makes people "feel" it is a lot more capable than it actually is.
Human programmers don't usually hallucinate things out of thin air; AIs like to do that a whole lot. So no, they aren't working the exact same way.