I am using 4.7 with the default extra high thinking, and it is clearly very stup...

sothatsit · 2026-04-17T01:33:19 1776389599

You can’t make up your mind about a model by using it on one task, and calling it such a bad downgrade on that basis is ludicrous. I’ve had great experiences with it this morning.

raylad · 2026-04-17T04:10:05 1776399005

That was more than one task. It was 3.

I also had Opus 4.7 and Opus 4.6 audit a very long document using identical prompts, then had Codex 5.4 compare the audits. Codex found that 4.6 did a far better job, and that 4.7 had missed things and added spurious information.

I then asked a new session of Opus 4.7 if it agreed or disagreed with the Codex audit and it agreed with it.

I also agreed with it.

solenoid0937 · 2026-04-17T02:45:18 1776393918

It's been dramatically better than any model I have ever used before on my tasks.