
The way things are going, we'll see faster and more efficient ways to run transformer architectures on edge devices, but I'm afraid we're approaching the limit: you can't just Rust your way out of the VRAM requirements, which are the main bottleneck in loading large-enough models. One might say "small models are getting better, look at Mistral vs. Llama 2", but small models are also approaching their capacity (there's only so much you can put in 7B parameters).

I don't know man, this approach to AI doesn't "feel" like it'll lead to AGI; it's too inefficient.
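
To put rough numbers on the VRAM point, here is a back-of-the-envelope sketch of how much memory the weights alone of a 7B-parameter model need at a few precisions. The helper name `weight_memory_gb` is made up for illustration, and the estimate deliberately ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in decimal gigabytes.

    Assumption: every parameter is stored at the same precision, and
    nothing else (KV cache, activations, framework overhead) is counted.
    """
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    # 7B parameters at fp16, int8, and 4-bit quantization.
    for bits in (16, 8, 4):
        print(f"7B @ {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

Quantization shrinks the footprint linearly with bit width, but a fixed parameter count still sets a floor, which is the limit the comment above is pointing at.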



I think we have plenty of headroom with MoE systems, dynamically loading LoRAs and such, even with the small models.



