
If you want to experiment with hardcoding small programs into transformer weights, maybe try ALTA: https://arxiv.org/abs/2410.18077v2



I'm less interested in turning programs into transformers and more interested in turning programs into subnetworks within large language models.

The blog post brings that up as a research direction but never actually elaborates on it, and the interface between the two is a hard problem.

I'll check out the link though, thanks.



