Wow that is terrible. In my memory GPT 2 was more interesting than that. I remem...

daveguy · 2026-04-16T21:23:41 1776374621

Here is the XL model. About 4x the size of the medium model (355M parameters). Still just 1.5B parameters, but on the bright side it was trained pre-wordslop.

sillysaurusx · 2026-04-16T22:27:41 1776378461

There’s an art to GPT sampling. You have to use temperature 0.7. People never believe it makes such a massive difference, but it does.
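
For anyone wondering what the temperature knob actually does, here is a rough sketch in plain Python. It is not the code from any particular GPT-2 implementation; the function names and the toy logits are made up for illustration. The idea is that the logits get divided by the temperature before the softmax, so a value below 1.0 (like 0.7) sharpens the distribution toward the likelier tokens.

```python
import math
import random


def softmax_with_temperature(logits, temperature=0.7):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=0.7, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical logits for a 4-token vocabulary, just to compare settings.
toy_logits = [2.0, 1.0, 0.5, -1.0]
for t in (1.0, 0.7):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.7 the top token takes a bigger share of the probability mass than at 1.0, which is why the output tends to wander less.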

wat10000 · 2026-04-16T21:32:23 1776375143

Probably a much better prompt, too. I just literally pasted in the top part of my comment and let fly to see what would happen.