
I'm not going to pretend to know everything about Ollama, llama.cpp, or llamafile, but my experiences with llama.cpp and llamafile (which is llama.cpp based) were both negative. Web UI frontends aren't relevant here; this is purely about whether I can load a model and get it to produce coherent results in the realm of what the people who created the model intended.

With llama.cpp or llamafile, I was constantly having to look up a model's paper, documentation, or other pages to find the recommended sampling parameters, prompt templates, and so on. My understanding is that GGUFs were supposed to solve that, yet I was still getting poor results.

You know, I don't know all the details, or whether there's any real difference between what Modelfiles are for and what GGUF metadata is for, but my experience with Ollama has been that it just works. It took me a while to even try Ollama, because my expectation was that it would simply be another interface on top of the same issues.
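
For what it's worth, a Modelfile is where Ollama bundles exactly the things I used to look up by hand: the base weights, the prompt template, and default parameters. A rough sketch (the model path, values, and template here are placeholders, not a recommendation):

    # Hypothetical Modelfile: base weights plus the template and defaults
    # that would otherwise have to be supplied by hand.
    FROM ./some-model.Q4_K_M.gguf

    # Default sampling parameters baked into the model definition.
    PARAMETER temperature 0.7
    PARAMETER num_ctx 8192
    PARAMETER stop "<|im_end|>"

    # The prompt template the model was trained to expect (ChatML-style here).
    TEMPLATE """<|im_start|>system
    {{ .System }}<|im_end|>
    <|im_start|>user
    {{ .Prompt }}<|im_end|>
    <|im_start|>assistant
    """

    SYSTEM "You are a helpful assistant."

Then `ollama create mymodel -f Modelfile` and `ollama run mymodel` pick all of that up, so none of it has to be remembered or re-typed per run.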

There are things I don't like about Ollama, but mostly they were easy to work around by writing a few scripts. I'm not using any web UI with it at all.
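
For example, one of those scripts is little more than a thin wrapper around Ollama's local HTTP API. A rough sketch (the model name is just whatever you've pulled; error handling omitted):

    #!/usr/bin/env python3
    """Minimal sketch of prompting a local Ollama server, no web UI involved."""
    import json
    import sys
    import urllib.request

    OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
    MODEL = "llama3"  # placeholder: whatever model you've pulled with `ollama pull`

    def ask(prompt: str) -> str:
        payload = json.dumps({
            "model": MODEL,
            "prompt": prompt,
            "stream": False,  # return one JSON object instead of a token stream
        }).encode("utf-8")
        req = urllib.request.Request(
            OLLAMA_URL,
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]

    if __name__ == "__main__":
        print(ask(" ".join(sys.argv[1:]) or "Say hello."))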


