It probably wouldn't hurt to set up Microsoft Clarity - it's a free solution that lets you see how people actually use your product and identify bottlenecks and pain points. It also has a “copilot ai” feature that can offer some suggestions (though it's just a copilot, so don't expect too much). But I’m sure that as soon as you see exactly what users are doing, you’ll be able to spot the problem areas.
So you end up with two parallel permission systems that contradict each other, and the Settings UI only controls one of them. It's not a bug, it's architectural debt that they've decided is cheaper to leave than to fix.
The policy makes sense as a liability shield, but it doesn't address the actual problem, which is review bandwidth. A human signs off on AI-generated code they don't fully understand, the patch looks fine, it gets merged. Six months later someone finds a subtle bug in an edge case no reviewer would've caught because the code was "too clean."
> they don't fully understand, the patch looks fine
I don't get this part. Why is the reviewer signing off on it? AI code should be fully documented (probably more thoroughly than a human would manage) and require new tests. Code review gates should not change.
But "teammate" is a stretch. The failure mode is different from a human -- a person will tell you "I don't know how to do this," an agent will confidently do it wrong and you won't notice until something breaks in production. The supervision cost doesn't go away, it just changes shape.
The fact that OpenAI's pipeline had no minimumReleaseAge configured is surprising though. That's basically saying "run whatever npm published 5 minutes ago in a context that has access to my signing keys." For a company that size, with that attack surface, feels like this should've been caught in a security review.
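For anyone who hasn't used a setting like this, here's a rough sketch of the idea in Python: refuse any version that hasn't been public for some minimum time. The function names, the version numbers, and the 3-day threshold are all made up for illustration; this is not how any particular package manager implements it.

```python
from datetime import datetime, timedelta, timezone


def is_old_enough(published_at, now, minimum_release_age):
    """True if the release has been public for at least minimum_release_age."""
    return now - published_at >= minimum_release_age


def installable_versions(releases, now, minimum_release_age):
    """Keep only the versions that pass the age gate.

    `releases` maps a version string to its publish time (hypothetical data shape).
    """
    return sorted(
        version
        for version, published_at in releases.items()
        if is_old_enough(published_at, now, minimum_release_age)
    )


# Illustrative data: one long-standing release, one published 5 minutes ago.
now = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)
releases = {
    "1.2.3": now - timedelta(days=30),
    "1.2.4": now - timedelta(minutes=5),
}

allowed = installable_versions(releases, now, timedelta(days=3))
print(allowed)
```

With a gate like this, the 5-minute-old release never reaches the build, which gives the registry and the community time to flag a compromised version first.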
I've been on Max20 for quite a while now, and I remember the transition well. Now even Max20 isn't enough, and I'm thinking about buying a second account. I can't say the problem is with Anthropic, because I really am using the service more and more. On the Pro subscription, I couldn't afford to run two agents in separate terminals that restart each other for hours on end, or to run research with 10–15 agents simultaneously. That kind of setup makes me several times more efficient, so yes, a second account is the way to go for me.
Borrow checker in a functional concatenative language is a wild combination. I write Rust for real-time audio and Elixir for the orchestration layer in the same project, so I deal with both worlds daily. In Rust the borrow checker saves you from data races but fights you on anything concurrent. In Elixir you just don't have shared mutable state at all, problem solved differently. Curious where Slap lands -- does it feel more like Rust's "prove to the compiler you're safe" or more like "the language just doesn't let you do the unsafe thing"?
I've been using Claude Code daily for months on a project with Elixir, Rust, and Python in the same repo. It handles multi-language stuff surprisingly well most of the time. The worst failure mode for me is when it does a replace_all on a string that also appears inside a constant definition -- ended up with GROQ_URL = GROQ_URL instead of the actual URL. Took a second round of review agents to catch it. So yeah, you absolutely can't trust it to self-verify.
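To make the failure concrete, here's a minimal Python reproduction. It's a sketch, not what the tool does internally: the URL is a placeholder, and `replace_outside_definition` and `find_self_assignments` are hypothetical helpers I made up for illustration.

```python
import re

# Placeholder URL; the sample file below is only ever handled as text.
literal = '"https://api.example.invalid/v1"'
source = (
    "GROQ_URL = " + literal + "\n"
    "\n"
    "def fetch():\n"
    "    return post(" + literal + ", payload)\n"
)

# Naive replace-all: the literal inside the constant's own definition is
# rewritten too, so the first line becomes `GROQ_URL = GROQ_URL`.
naive = source.replace(literal, "GROQ_URL")


def replace_outside_definition(text, literal, name):
    """Replace `literal` with `name` everywhere except on the line defining `name`."""
    out = []
    for line in text.splitlines(keepends=True):
        if re.match(rf"\s*{re.escape(name)}\s*=", line):
            out.append(line)  # leave the definition untouched
        else:
            out.append(line.replace(literal, name))
    return "".join(out)


def find_self_assignments(text):
    """Return names assigned to themselves, e.g. `X = X`, as a cheap post-edit check."""
    pattern = r"^[ \t]*(\w+)[ \t]*=[ \t]*\1[ \t]*$"
    return re.findall(pattern, text, flags=re.MULTILINE)


safe = replace_outside_definition(source, literal, "GROQ_URL")

print(find_self_assignments(naive))
print(find_self_assignments(safe))
```

A mechanical check like `find_self_assignments` only catches this one pattern, but it doesn't depend on the agent grading its own work.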
You say you've used it for months. Was the example you gave recent? And have you noticed an overall degradation in quality, or has it been consistently bad for you?