
I recently saw this in my feed: an interesting analysis of how code training was added as a fine-tune (Codex) on top of a foundation model (GPT-3): https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-Tr...

I do wonder whether anyone is considering mixing larger and larger percentages of The Stack (https://huggingface.co/datasets/bigcode/the-stack) into this or the Pile to get more code and see what happens.

(Likely beyond mere mortals' budgets though.)


