
This is a very real concern. I've seen quantized LLMs output complete garbage, and in most cases it felt like a smaller unquantized model would have done better. Smaller unquantized models should therefore be included in every such comparison.

E.g. compare quantized LLaMA 70B to unquantized LLaMA 8B.

Even better if the model family has a smaller variant whose byte footprint is close to that of the quantized larger model, so the comparison is at roughly equal memory cost.
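A rough sketch of what such a comparison could look like, assuming Hugging Face transformers + bitsandbytes; the model ids, prompt, and footprint accounting are illustrative, not a definitive benchmark:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    prompt = "Explain the tradeoffs of weight quantization in one paragraph."

    def footprint_gb(model):
        # Approximate in-memory weight size; packed 4-bit tensors are stored
        # as uint8, so numel() * element_size() roughly reflects the packing.
        return sum(p.numel() * p.element_size() for p in model.parameters()) / 1e9

    def generate(model, tok):
        inputs = tok(prompt, return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
        return tok.decode(out[0], skip_special_tokens=True)

    big_name = "meta-llama/Meta-Llama-3-70B"   # assumed model id
    small_name = "meta-llama/Meta-Llama-3-8B"  # assumed model id

    # Quantized large model: 70B weights loaded in 4-bit.
    bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
    big = AutoModelForCausalLM.from_pretrained(big_name, quantization_config=bnb, device_map="auto")
    big_tok = AutoTokenizer.from_pretrained(big_name)

    # Unquantized small model: 8B weights in fp16, chosen so the byte
    # footprint is in the same ballpark as the quantized 70B.
    small = AutoModelForCausalLM.from_pretrained(small_name, torch_dtype=torch.float16, device_map="auto")
    small_tok = AutoTokenizer.from_pretrained(small_name)

    print(f"70B 4-bit: {footprint_gb(big):.1f} GB\n{generate(big, big_tok)}\n")
    print(f"8B fp16:   {footprint_gb(small):.1f} GB\n{generate(small, small_tok)}")

Same prompt, greedy decoding, and a printed weight footprint for each, so output quality can be judged at comparable memory budgets.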


