
That's a rather odd comparison to make. First of all, OP, like llama.cpp, doesn't use the GPU – in contrast to most Python ML code. It's not hard to write Python code that "optimally exploits" the GPU. You might call the GPU a "specialized environment to build and run" but it's arguably much better suited to the problem.

Second, OP, like llama.cpp, produced efficient and highly specialized code after it was clear the model being specialized for (StableDiffusion / LLaMa / …) works well. Where Python shines, though, is the prototyping phase when you have yet to find an appropriate model. We have yet to see this sort of easy & convenient prototyping in C++.

Now, this is not to take anything away from the fantastic work that's being done by the llama.cpp people (among whom I also count OP) in the "ML on a CPU" space. But the problems being solved are entirely different.



> Where Python shines, though, is the prototyping phase when you have yet to find an appropriate model. We have yet to see this sort of easy & convenient prototyping in C++.

+1.

Producing a highly optimized C/C++ kernel that utilizes the CPU to the fullest extent requires a tremendous amount of talent and expertise. For example, not everyone can write a hand-vectorized kernel with AVX2 intrinsics (outside a few specialized applications like 3D graphics, media encoding, and the like), and even fewer people can exploit the underlying features of the algorithm for optimization, such as producing usable output at greatly reduced numerical precision. The power of LLMs gives countless programmers all over the world a strong motivation to do just that. New techniques are proposed and implemented on a monthly basis, with people thinking up and applying every possible trick to the LLM optimization problem. In this regard, moving from Python to C is totally reasonable.
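As a rough illustration of "usable output at greatly reduced numerical precision", here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is a generic textbook scheme, not the format llama.cpp or OP's project actually uses; the function names and the sample weights are made up for the example.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] plus one shared scale factor."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: one scale for the whole block; an all-zero block gets scale 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, quantized


def dequantize_int8(scale, quantized):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.12, -0.98, 0.33, 0.005, -0.41]
scale, q = quantize_int8(weights)
restored = dequantize_int8(scale, q)

# Rounding to the nearest step bounds the per-value error by half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error <= scale / 2 + 1e-12)
```

Each value is stored in one byte instead of four, at the cost of an error of at most half a quantization step per value. Whether that error is tolerable depends on the model, which is the part that takes expertise.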

In comparison, right now I'm working on optimizing a niche open-source scientific simulation kernel with a naive C codebase. Before me, there were hardly any contributors in the last decade.

Python has its place because not everyone has a level of resources and expertise comparable to ML. In particular, when the bulk of the data processing in a Python script is done in a function call to a C++ or Fortran kernel, as with scipy, the difference between naive C and naive Python code (or Julia code if you're following the trend) is not that large, especially when it's a one-off project for publishing a single paper.
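To make the "bulk of the work happens in a compiled kernel" point concrete, here is a small sketch that uses only the standard library. CPython's built-in `sum` and `map` stand in for a scipy call (that substitution is an assumption made to keep the example dependency-free), and the timings it prints will vary by machine.

```python
import operator
import timeit


def dot_naive(xs, ys):
    """Dot product with every multiply-add executed by the interpreter."""
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


def dot_builtin(xs, ys):
    """Same result, but the loop itself runs inside CPython's C code."""
    return sum(map(operator.mul, xs, ys))


xs = [i * 0.5 for i in range(10_000)]
ys = [i * 0.25 for i in range(10_000)]

naive_time = timeit.timeit(lambda: dot_naive(xs, ys), number=20)
builtin_time = timeit.timeit(lambda: dot_builtin(xs, ys), number=20)
print(f"interpreted loop: {naive_time:.4f}s, C-backed builtins: {builtin_time:.4f}s")
```

The Python code that calls the compiled routine stays short and easy to change; how much faster it runs depends on how much of the work the compiled routine actually covers.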


It’s going to be a tf or PyTorch feature rather than going directly to writing things in C. No point solving this problem only once.


Yeah, I make a living in the GPU space. I think my comment comes from colleagues having to hold my hand to set up their ML / Python environments with all of their peccadilloes. In fact it's bad enough that I have to use Docker to create an insular environment tailored to their specific setup. And Python is like 1000 times slower when it's not using other libs like numpy.


Are they not using venvs or something? It should be as simple as python -m venv venv; source venv/bin/activate; pip install -r requirements.txt


Everyone has their own way to do this. Every step is broken by some unfamiliar dependency that requires special arcane knowledge to fix. Part of me is a grumpy old man that doesn’t gravitate to the shiny new tools that come out every week that the younger devs keep up with :)


pip and venv are neither shiny nor new; they've been the standard way of doing things for a while. I am an outsider to Python and am incredibly thankful for this standardization, because I agree that getting a Python env set up correctly before venv was a huge pain.

If your guys aren't on this, I'd suggest you get them on it; it dramatically simplifies setup.


Here is a tiny excerpt from trying to get dvc to work just so I could get the training weights for deployment ... remember I don't develop much with Python...

    $ dvc pull
    Command 'dvc' not found, but can be installed with:
    sudo snap install dvc

    $ sudo snap install dvc
    error: This revision of snap "dvc" was published using classic confinement and thus may perform
    arbitrary system changes outside of the security sandbox that snaps are usually confined to,
    which may put your system at risk.

    If you understand and want to proceed repeat the command including --classic.

OK, I got dvc installed somehow, I don't remember how. Time to get the weights...

    $ python3 -m dvc pull
    ERROR: unexpected error - Forbidden: An error occurred (403) when calling the HeadObject operation: Forbidden            

    Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

Finally I just had my colleague manually copy the weights. This kind of thing went on for hours.


Hey, DVC maintainer here.

Thanks for giving DVC a try!

There are a few ways to install dvc, see https://dvc.org/doc/install/linux

With snap, you need to use the `--classic` flag, as noted in https://dvc.org/doc/install/linux#install-with-snap. Unfortunately that's just how snap works for us there :(

Regarding the pull error, it simply looks like you don't have some credentials set up. See https://dvc.org/doc/user-guide/data-management/remote-storag... Still, the error could be better, so that's on us.

Feel free to ping us in discord (see invite link in https://dvc.org/support). I'm @ruslan there. We'll be happy to help.


Thanks… i know my colleague uses it a lot. I generally use his models and don’t do much ML development yet. At some point I need to properly learn all of this. It seems ML tools are only for developers not for those who simply want to deploy and use the resulting NN.


Researchers are notorious for writing bad code

What even is dvc

edit: also- i'd avoid snap and just use your regular package manager.


I think dvc is like git for large binary files. You need some way to manage your NN weights. What are the other methods?


git lfs is what everyone is using, HF in particular


> Are they not using venvs or something? It should be as simple as python -m venv venv; source venv/bin/activate; pip install -r requirements.txt

In most cases it would be possible to do something close to that, but it is extremely common in the AI/ML space to run into projects whose install instructions don't include it. Instead they tell you to have a global install of a certain Python version and then to pip install the dependencies (and globally install non-Python package dependencies, if there are any). So even if they'd work in a venv, you have to (1) independently know you should be doing that, and (2) translate the instructions, which, where (1) applies, is usually trivial if all the dependencies are proper Python packages, but can be more involved otherwise.

So, yeah, I can see that a lot of the time the path of least resistance is just to create an isolated container environment for it.
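One way to make the "you have to know you should be in a venv" step explicit is a small guard around the install step. This is only a sketch: `in_virtualenv` and `pip_install_command` are hypothetical helper names, and the check relies on `sys.prefix` differing from `sys.base_prefix`, which holds for environments created by `venv` but does not detect conda environments.

```python
import sys


def in_virtualenv():
    """True when the interpreter is running inside a venv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def pip_install_command(requirements="requirements.txt"):
    """Build a pip command for this interpreter, refusing a global install."""
    if not in_virtualenv():
        raise RuntimeError(
            "not inside a virtual environment; run `python -m venv venv` "
            "and activate it first"
        )
    # Using sys.executable ties pip to the interpreter that ran this check.
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


print("inside a virtual environment:", in_virtualenv())
```

The returned list can be passed to `subprocess.run` when the guard passes. It does nothing for non-Python dependencies such as GPU drivers, which is where containers tend to come in.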


Unfortunately it's not that simple, especially for the NVIDIA driver and CUDA install. That's why we usually use conda, which can handle the CUDA install, but even with that, sometimes it works flawlessly and sometimes it doesn't.


>You might call the GPU a "specialized environment to build and run" but it's arguably much better suited to the problem.

I feel like the person you're replying to knows that the GPU is better suited than the CPU to do this task, and your argument doesn't really make sense. I think they were referring to the python venv environment with all the library dependencies as the "specialized environment"


The point is that, as awesome as this repo is, it doesn't do much to wean the "ML folks" off of Python, since it doesn't provide the flexibility and GPU support that people designing and training DL systems rely on.


I’m just encouraged when I see ML libraries not using Python w its environment kludges. Just a step in the right direction.


I don't disagree that Python environments are a mess. I'm actually a developer on quite a prominent large scale neural network training library and a DL researcher that uses said library. With my developer hat on I like to have minimal dependencies and keep Python scripting as decoupled as possible from the CUDA C++ implementation. With my researcher hat on I don't want to be slowed down by C++ development every time I want to change my model or training pipeline. At least for me, C++ development is slower and more error prone than modifying Python.

Obviously doing any heavy lifting in Python is a bad idea. But as a scripting language I think it's good, especially if you keep the environment simple. I don't think the answer for DL training is to dump Python entirely and start over in pure C/C++/Rust/Julia/whatever. Learning C/C++ is too big of an ask for everyone working on the model design and training side and it would slow down progress significantly - most of that work is actually data munging and targeted model tweaks. But I do think there's still a lot that can be done to decouple Python from the underlying engine and yield networks where inference can be run in a minimal dependency environment. There's lots of great people working on all these things.


That Python ML code is calling C++ code running in the GPU, one more reason to use C++ across the whole stack.

CERN already used prototyping in C++, with ROOT and CINT, 20 years ago.

https://root.cern/

Nowadays it is even usable from notebooks via Xeus.

It is more a matter of lack of exposure to C++ interpreters than anything else.


Add to that it's only inference code, not training.


>That's a rather odd comparison to make. First of all, OP, like llama.cpp, doesn't use the GPU

When was the last time you looked at llama.cpp? It has supported GPU, GPU+CPU, and distributed inference using OpenMPI for a while now. It also supports training, as well as negative prompting and grammars! The ease of getting llama.cpp running on just about anything has already kicked off innovation.


Not sure what "It's not hard to write Python code that "optimally exploits" the GPU" exactly means, but Python is so far from exploiting GPU resources, even with C/C++ bindings, that it's not even funny. I am sure that HPC folks would have migrated away from FORTRAN and C/C++ a long time ago if it were so easy.


I wasn't trying to claim that Python is great at fully exploiting GPU resources on generic GPU tasks. But in ML applications it often does, at least in my experience.



