
I don't see what you're seeing, in any dimension. But here's a fair take.

I wrote several very specialized benchmarks, which I've used over time, that surface "model personalities" and their effects on decision making (as well as measuring the outcomes).

Grok 4.1 Fast Reasoning is/was a solid model. It's also fundamentally different from the pack.

I call it a smart, aggressive Claude Haiku. That is, its "thinking" is quite chaotic and sometimes shorthand, and its output can be as well (relative to other models).

Its aggressiveness can allow it to punch above its weight in the competitive scenarios that I have in some of my benchmarks. Its write-ups and documentation are often replete with "dominate", "relentless" and a general high energy that skirts the limits of 'cringe bro'. That said, it has generally performed just beneath the SOTA (at the time: GPT-5.2, Gemini-3-Flash, Claude Opus 4.5). Angry Sonnet perhaps.

The latest release feels quite similar but also underperforms the same older crowd (so far), so it hasn't quite made the leap that Claude's 4.6 and GPT's 5.3/5.4 series made. It's also now priced the same as its peers but does not deliver SOTA capabilities (at least not consistently, in my opinion).


