danielhanchen 4 days ago

Oh hey, I'm assuming this is for conversion to GGUF after a finetune? If you need to quantize to GGUF Q4_K_M, we have to compile llama.cpp, hence the apt-get calls and the llama.cpp compilation from within a Python shell.

There is a way to convert to Q8_0, BF16, and F16 without compiling llama.cpp, and it's enabled if you use `FastModel` rather than `FastLanguageModel`.
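For example (a hedged sketch - exact API details may vary by Unsloth version, and the model name is just a placeholder):

  from unsloth import FastModel

  # Load with FastModel (not FastLanguageModel) to get the no-compile path
  model, tokenizer = FastModel.from_pretrained("unsloth/gemma-3-4b-it")

  # ... finetune ...

  # Q8_0 export needs no llama.cpp build; Q4_K_M would trigger the compile
  model.save_pretrained_gguf("my_model", tokenizer, quantization_method="q8_0")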

Essentially I try `sudo apt-get`; if that fails, then plain `apt-get`; and if all of that fails, it just fails. We need `build-essential cmake curl libcurl4-openssl-dev`
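Roughly this shape, as a sketch of that fallback (not the actual code):

  import shutil, subprocess

  PACKAGES = ["build-essential", "cmake", "curl", "libcurl4-openssl-dev"]

  def install_build_deps() -> bool:
      # Try sudo apt-get first, then plain apt-get; give up if neither works
      for prefix in (["sudo", "apt-get"], ["apt-get"]):
          if shutil.which(prefix[0]) is None:
              continue
          result = subprocess.run(prefix + ["install", "-y"] + PACKAGES)
          if result.returncode == 0:
              return True
      return False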

See https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...

pxc 4 days ago | parent | next [-]

It seems Unsloth is useful and popular, and you seem responsive and helpful. I'd be down to try to improve this and maybe package Unsloth for Nix as well, if you're up for reviewing and answering questions; seems fun.

Imo it's best to just depend on the required fork of llama.cpp at build time (or not) according to some configuration. Installing things at runtime is nuts (especially if it means modifying the existing install path). But if you don't want to do that, I think this would also be an improvement:

  - see if llama.cpp is on the PATH and already has the requisite features
  - if not, check /etc/os-release to determine distro
  - if unavailable, guess distro class based on the presence of high-level package managers (apt, dnf, yum, zypper, pacman) on the PATH
  - bail, explain the problem to the user, give copy/paste-friendly instructions at the end if we managed to figure out where we're running
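A rough Python sketch of that flow (the helper name and messages are mine, not Unsloth's):

  import shutil

  def guess_install_command(packages):
      # Step 2: identify the distro from /etc/os-release, if present
      distro = ""
      try:
          with open("/etc/os-release") as f:
              for line in f:
                  if line.startswith("ID="):
                      distro = line.strip().split("=", 1)[1].strip('"')
      except FileNotFoundError:
          pass
      # Step 3: guess the distro class from package managers on the PATH
      for mgr, cmd in [("apt-get", "apt-get install -y"),
                       ("dnf", "dnf install -y"),
                       ("yum", "yum install -y"),
                       ("zypper", "zypper install -y"),
                       ("pacman", "pacman -S --noconfirm")]:
          if shutil.which(mgr):
              return distro, cmd + " " + " ".join(packages)
      # Step 4: nothing found - caller bails and explains the problem
      return distro, None
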
Is either sort of change potentially agreeable enough that you'd be happy to review it?
danielhanchen 4 days ago | parent | next [-]

As an update, I pushed https://github.com/unslothai/unsloth-zoo/commit/ae675a0a2d20...

(1) Removed and disabled sudo

(2) Installing via apt-get now asks for the user's permission via input()

(3) Added an error if llama.cpp fails, with instructions to manually compile llama.cpp

mFixman 3 days ago | parent [-]

Maybe it's a personal preference, but I don't want external programs to ever touch my package manager, even with permission. Besides, this will fail loudly for systems that don't use `apt-get`.

I would just ask the user to install the package, and _maybe_ show the command line to install it (but never run it).

lucideer 3 days ago | parent | next [-]

I don't think this should be a personal preference, I think it should be a standard*.

That said, it does at least seem like these recent changes are a large step in the right direction.

---

* in terms of what the standard approach should be, we live in an imperfect world and package management has been done "wrong" in many ecosystems, but in an ideal world I think the "correct" solution here should be:

(1) If it's an end user tool it should be a self contained binary or it should be a system package installed via the package manager (which will manage any ancillary dependencies for you)

(2) If it's a dev tool (which, if you're cloning a cpp repo & building binaries, it is), it should not touch anything systemwide. Whatsoever.

This often results in a README with manual instructions to install deps, but there are many good automated ways to approach this. E.g. for C++ this is a solved problem with Conan profiles. However, that might incur significant maintenance overhead for the Unsloth guys if it's not something the ggml guys support. A dockerised build is another potential option here, though that would still require the user to have some kind of container engine installed, so still not 100% ideal.
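(For reference, a consumer conanfile.py for the build deps might look roughly like this - a sketch only, with illustrative package versions:)

  # Hypothetical conanfile.py pulling in llama.cpp's build dependencies;
  # versions are illustrative - check Conan Center for current ones.
  from conan import ConanFile

  class LlamaCppDeps(ConanFile):
      settings = "os", "compiler", "build_type", "arch"
      generators = "CMakeToolchain", "CMakeDeps"

      def requirements(self):
          self.requires("libcurl/8.10.1")

      def build_requirements(self):
          self.tool_requires("cmake/3.30.1")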

danielhanchen 3 days ago | parent [-]

I would like to be in (1) but I'm not a packaging person so I'll need to investigate more :(

(2) I might make the message on installing llama.cpp more informative - ie instead of re-directing people to the docs on manual compilation, ie https://docs.unsloth.ai/basics/troubleshooting-and-faqs#how-..., I might actually print out a longer message in the Python cell entirely

Yes we're working on Docker! https://hub.docker.com/r/unsloth/unsloth

lucideer 3 days ago | parent [-]

> Yes we're working on Docker!

That will be nice too, though I was more just referring to simply doing something along the lines of this in your current build:

  docker run conanio/gcc11-ubuntu16.04 make clean -C llama.cpp etc etc...
(likely mounting & calling a sh file instead of passing individual commands)

---

Although I do think getting the ggml guys to support Conan (or monkey patching your own llama conanfile in before building) might be an easier route.

danielhanchen 3 days ago | parent [-]

Oh ok I'll take a look at conanfiles as well - sorry I'm not familiar with them!

danielhanchen 3 days ago | parent | prev | next [-]

Hopefully the current solution is a workable compromise? It will show the command as well, so if it's not accepted, typing no will error out and tell the user how to install the package

solarkraft 3 days ago | parent | prev [-]

I like it when software does work for me.

Quietly installing stuff at runtime is shady for sure, but why not if I consent?

danielhanchen 3 days ago | parent [-]

Do you think the permission prompt is OK, I guess? I might also add a 30 second timer and just bail out if there's no response from the user

danielhanchen 4 days ago | parent | prev [-]

Thanks for the suggestions! Apologies again, I'm pretty bad at packaging, hence the current setup.

1. So I added a `check_llama_cpp` which checks whether llama.cpp already exists, and if so uses the prebuilt one https://github.com/unslothai/unsloth-zoo/blob/main/unsloth_z...

2. Yes I like the idea of determining distro

3. Agreed on bailing - I was also thinking of doing a Python input() with a 30 second waiting period for apt-get, if that's ok? We tell the user we will apt-get some packages (only if apt exists, and no sudo), and after 30 seconds it'll just error out (rough sketch below)

4. I will remove sudo immediately (ie now), and temporarily just do (3)

But more than happy to fix this asap - again sorry on me being dumb
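Something like this for the timed prompt in (3) - a Unix-only sketch, since select() on stdin doesn't work on Windows:

  import select
  import sys

  def ask_permission(command, timeout=30):
      # Show the exact command, then default to "no" after `timeout` seconds
      print("Unsloth wants to run:", command)
      print("Press enter to accept, or type no (auto-aborts in %ds): " % timeout,
            end="", flush=True)
      ready, _, _ = select.select([sys.stdin], [], [], timeout)
      if not ready:
          print("\nNo response - aborting. Please install the packages manually.")
          return False
      return sys.stdin.readline().strip().lower() not in ("n", "no")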

mkl 3 days ago | parent [-]

It shouldn't install any packages itself. Just print out a message about the missing packages and your guess of the command to install them, then exit. That way users can run the command themselves if it's appropriate or add the packages to their container build or whatever. People set up machines in a lot of different ways, and automatically installing things is going to mess that up.
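ie something as simple as:

  import sys

  def bail(packages):
      # Print the best guess and stop - never run the command ourselves
      print("Missing build dependencies:", " ".join(packages))
      print("On Debian/Ubuntu, try: sudo apt-get install " + " ".join(packages))
      print("Then re-run this script.")
      sys.exit(1)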

danielhanchen 3 days ago | parent | next [-]

Hmmm so I should get rid of the asking / permissions message?

mkl 3 days ago | parent [-]

Yes, since you won't actually need the permissions.

danielhanchen 3 days ago | parent [-]

Hmmm, I'm worried people really won't know how to install / compile / use the terminal, hence I thought permissions were a compromise solution

solarkraft 3 days ago | parent | next [-]

I think that it is, quite a good one, even:

- Determine the command that has to be run by the algorithm above.

This does most of the work a user would otherwise have to do to figure out what has to be installed on their system.

- Ask whether to run the command automatically.

This allows the “software should never install dependencies by itself” crowd to say no and figure out further steps, while allowing people who just want it to work to get on with their task as quickly as possible (who do you think there are more of?).

I think it would be fine to print out the command and force the user to run it themselves, but it would bring little material gain at the cost of some of your users’ peace (“oh no it failed, what is it this time ...”).

danielhanchen 3 days ago | parent [-]

Oh ok! I would say 50% of people manually install llama.cpp and the other 50% want it to be automated

segmondy 3 days ago | parent | prev [-]

Don't listen to this crowd, these are "technical folks". Most of your audience will fail to figure it out. You can show a notice that llama.cpp is missing and give them the option to auto-install it, or they can install it themselves and do manual configuration. I personally won't, though.

Computer0 3 days ago | parent | next [-]

Who do you think the audience is here, if not technical? We are in a discussion about a model that requires over 250GB of RAM to run. I don't know a non-technical person with more than 32GB.

pxc 3 days ago | parent [-]

I think most of the people like this in the ML world are extreme specialists (e.g.: bioinformaticians, statisticians, linguists, data scientists) who are "technical" in some ways but aren't really "computer people". They're power users in a sense but they're also prone to strange bouts of computing insanity and/or helplessness.

danielhanchen 3 days ago | parent | prev [-]

I think as a compromise solution I'll keep the permission prompt before installing. I'll definitely try investigating prebuilt binaries though

solarkraft 3 days ago | parent | prev [-]

This is an edge case optimization at the cost of 95% of users.

mkl 3 days ago | parent [-]

95% of users probably won't be using Linux. Most of those who are will have no problem installing dependencies. There are too many distributions and ways of setting them up for automated package manager use to be the right thing to do. I have never seen a Python package even try.

Balinares 3 days ago | parent | prev | next [-]

I'll venture that whoever is going to fine-tune their own models probably already has llama.cpp installed somewhere, or can install it if required.

Please, please, never silently attempt to mutate the state of my machine, that is not a good practice at all and will break things more often than it will help because you don't know how the machine is set up in the first place.

danielhanchen 3 days ago | parent [-]

Oh yes, before we install llama.cpp we do a PATH environment check, and only if it's not found will it install.

But yes, agreed - there won't be any more random package installs, sorry!

Balinares 3 days ago | parent [-]

Thanks for the reply! If I can find the time (that's a pretty big if), I'll try to send a PR to help with the packaging.

danielhanchen 3 days ago | parent [-]

No worries :)

elteto 4 days ago | parent | prev | next [-]

Dude, this is NEVER ok. What in the world??? A third party LIBRARY running sudo commands? That’s just insane.

You just fail and print a nice error message telling the user exactly what they need to do, including the exact apt command or whatever that they need to run.

danielhanchen 4 days ago | parent | next [-]

As an update, I pushed https://github.com/unslothai/unsloth-zoo/commit/ae675a0a2d20...

(1) Removed and disabled sudo

(2) Installing via apt-get now asks for the user's permission via input()

(3) Added an error if llama.cpp fails, with instructions to manually compile llama.cpp

Again apologies on my dumbness and thanks for pointing it out!

danielhanchen 4 days ago | parent | prev | next [-]

Yes I had that at the start, but people kept complaining they don't know how to actually run terminal commands, hence the shortcut :(

I was thinking about whether I can do it during the pip install, ie via setup.py, which would do the apt-get instead.

As a fallback, I'll probably for now remove shell executions and just warn the user

rfoo 3 days ago | parent | next [-]

IMO the correct thing to do to make these people happy, while being sane, is - do not build llama.cpp on their system. Instead, bundle a portable llama.cpp binary along with unsloth, so that when they install unsloth with `pip` (or `uv`) they get it.

Some people may prefer using whatever llama.cpp is in $PATH; it's okay to support that, though I'd say doing so may lead to more spam from confused noob users - they may just have an outdated version lurking in $PATH.

Doing so makes the unsloth wheel platform-dependent; if this is too much of a burden, then maybe you can just package the llama.cpp binary and have it on PyPI, like how the scipy guys maintain https://pypi.org/project/cmake/ on PyPI (yes, you can `pip install cmake`), and then depend on it (maybe in an optional group - I see you already have a lot due to cuda shit).

danielhanchen 3 days ago | parent [-]

Oh yes I was working on providing binaries together with pip - currently we're relying on pyproject.toml, but once we utilize setup.py (I think), using binaries gets much simpler

I'm still working on it, but sadly I'm not a packaging person so progress has been nearly zero :(

ffsm8 3 days ago | parent | next [-]

I think you misunderstood rfoo's suggestion slightly.

From how I interpreted it, he meant you could create a new Python package that would effectively be the binary you need.

In your current package, you could depend on the new one and, through that, pull in the binary.

This would let you easily decouple your package from the binary, too - so it'd be easy to update the binary to latest even without pushing a new version of your original package

I've maintained release pipelines before and handled packaging in a previous job, but I'm not particularly into the Python ecosystem, so take this with a grain of salt; an approach would be:

Pip packages:

    * Unsloth: current package, prefers using unsloth-llama, and uses PATH llama-cpp as fallback (with an error msg as final fallback if neither exists, prompting install of unsloth-llama)
    * Unsloth-llama: new package which only bundles the llama.cpp binary
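A minimal sketch of what that companion package's setup.py might look like (everything here is hypothetical - no such package exists on PyPI as far as I know):

  # Hypothetical setup.py for the "unsloth-llama" idea above: a package
  # whose only job is to ship a prebuilt llama.cpp binary inside the wheel.
  from setuptools import setup

  setup(
      name="unsloth-llama",
      version="0.1.0",
      packages=["unsloth_llama"],
      package_data={"unsloth_llama": ["bin/*"]},  # the prebuilt binaries
      # Shipping binaries makes the wheel platform-specific, so you'd build
      # one wheel per platform tag rather than a single pure-Python wheel.
  )
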
danielhanchen 3 days ago | parent [-]

Oh ok, sorry, maybe I misunderstood! I actually found my partial work I did for precompiled binaries! https://huggingface.co/datasets/unsloth/precompiled_llama_cp...

I was trying to see if I could pre-compile some llama.cpp binaries then save them as a zip file (I'm a noob sorry) - but I definitely need to investigate further how to do Python pip binaries

docfort 3 days ago | parent [-]

https://docs.astral.sh/uv/guides/package/#publishing-your-pa...

danielhanchen 2 days ago | parent [-]

Oh thanks - we currently use PyPI so pip install works - https://pypi.org/project/unsloth/

But I think similarly for uv we need a setup.py for packaging binaries (more complex)

rat9988 3 days ago | parent | prev [-]

Don't worry. Don't let the rednecks screaming here affect you. For one, I'm happy that you automated this part and sad to see it going away. People will always complain; it might be reasonable feedback worth acting upon, but don't let their tone distract you. Some of them are just angry all day.

danielhanchen 3 days ago | parent [-]

Thanks - hopefully the compromise solution, ie a Python input() asking for the user's permission, works ok?

rpdillon 3 days ago | parent [-]

As a guy who would naturally be in the camp of "installing packages is never okay", I also live in the more practical world where people want things to work. I think the compromise you're suggesting is a pretty good one. I think the highest quality implementation here would be:

Try to find prebuilt and download.

See if you can compile from source if a compiler is installed.

If no compiler: prompt to install via sudo apt, explaining why; also give the option to abort and have the user install a compiler themselves.

This isn't perfect, but limits the cases where prompting is necessary.
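In rough Python, the cascade above would look something like this (the helper functions are hypothetical placeholders):

  import shutil

  def obtain_llama_cpp():
      binary = download_prebuilt()  # 1. try a prebuilt binary first
      if binary is not None:
          return binary
      if any(shutil.which(cc) for cc in ("cc", "gcc", "clang")):
          return compile_from_source()  # 2. compiler present: build quietly
      # 3. no compiler: the only case that actually needs a prompt
      return prompt_install_compiler_or_abort()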

danielhanchen 3 days ago | parent [-]

I'm going to see if I can make prebuilt versions work :) But thanks!

devin 4 days ago | parent | prev [-]

Don't optimize for these people.

danielhanchen 4 days ago | parent [-]

Yep agreed - I primarily thought it was a reasonable "hack", but it's pretty bad security wise, so apologies again.

The current solution hopefully is in between - ie sudo is gone, apt-get will run only after the user agrees by pressing enter, and if it fails, it'll tell the user to read docs on installing llama.cpp

woile 3 days ago | parent | next [-]

Don't apologize, you are doing amazing work. I appreciate the effort you put in.

Usually you don't make assumptions about the host OS: just try to find the things you need and, if they're missing, fail, ideally with good feedback. If you want to provide the "hack", you can still do it, but ideally behind a flag, `allow_installation` or something like that. That is, if you want your code to reach a broader audience.
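eg (a sketch; `build_llama_cpp` is a hypothetical helper):

  import shutil

  def ensure_llama_cpp(allow_installation=False):
      # Find what we need; fail with good feedback unless the user opted in
      path = shutil.which("llama-quantize")
      if path is not None:
          return path
      if not allow_installation:
          raise RuntimeError(
              "llama.cpp not found on PATH. Install it manually, or pass "
              "allow_installation=True to let us try to build it for you."
          )
      return build_llama_cpp()  # hypothetical helper that compiles it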

shaan7 3 days ago | parent | next [-]

Yep there's no need to apologize, you've been very courteous and took all that feedback constructively. Good stuff :)

danielhanchen 3 days ago | parent [-]

Thank you!

danielhanchen 3 days ago | parent | prev [-]

Thank you! :)

3 days ago | parent | prev [-]
[deleted]
lyu07282 3 days ago | parent | prev | next [-]

On a meta level it's kind of worrying for the ecosystem that there is nothing in PyPI that blocks and bans developers who try to run sudo on setup. I get they don't have the resources to do manual checks, but literally no checks against malicious packages?

danielhanchen 3 days ago | parent [-]

Sadly not - you can run anything within a Python shell, since there's os.system, subprocess.Popen and exec - it's actually very common in setup.py files for installers to execute commands
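eg a setup.py is just ordinary Python executed when pip builds the package:

  # Nothing technically stops a setup.py from shelling out - which is
  # exactly the concern being discussed here.
  import subprocess
  from setuptools import setup

  subprocess.run(["echo", "this runs during pip install"])  # arbitrary command
  setup(name="example", version="0.0.1")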

But I do agree that maybe for better security PyPI should check for commands and warn

pxc 4 days ago | parent | prev [-]

How unusual is this for the ecosystem?

pshirshov 3 days ago | parent [-]

Unfortunately, the Python ecosystem is the Wild West.

pshirshov 3 days ago | parent | prev | next [-]

It won't work well if you're dealing with anything other than the ubuntu+cuda combination. Better to just fail with a reasonable message.

danielhanchen 3 days ago | parent [-]

For now I'm redirecting people to our docs https://docs.unsloth.ai/basics/troubleshooting-and-faqs#how-...

But I'm working on more cross platform docs as well!

pshirshov 3 days ago | parent [-]

My current solution is to pack llama.cpp as a custom Nix formula (the one in nixpkgs has the conversion script broken) and run it myself. I wasn't able to run unsloth on ROCm, neither for inference nor for conversion; I'm sticking with peft for now, but I'll attempt to re-package it again.

danielhanchen 3 days ago | parent [-]

Oh interesting - for ROCm there are some installation instructions here: https://rocm.docs.amd.com/projects/ai-developer-hub/en/lates...

I'm working with the AMD folks to make the process easier, but it looks like first I have to move off from pyproject.toml to setup.py (allows building binaries)

pshirshov 3 days ago | parent [-]

Yes, it's trivial with the pre-built vllm docker, but I need a declarative way to configure my environment. The lack of prebuilt rocm wheels for vllm is the main hindrance for now, but I was shocked to see the sudo apt-get in your code. Ideally, llama.cpp should publish their gguf Python library and the conversion script to PyPI with every release, so you can just add that stuff as a dependency. vllm should start publishing a rocm wheel; after that, unsloth would need to start publishing two versions - a cuda one and a rocm one.

danielhanchen 3 days ago | parent [-]

Yes apologies again - yes rocm is still an issue

lambda 4 days ago | parent | prev [-]

[flagged]

danielhanchen 4 days ago | parent | next [-]

I added it since many people who use Unsloth don't know how to compile llama.cpp, so the only ways from Python's side are to either (1) install it via apt-get within the Python shell, or (2) error out, tell the user to install it first, then continue again

I chose (1) since it was mainly for ease of use for the user - but I agree it's not a good idea sorry!

:( I also added a section to manually compile llama.cpp here: https://docs.unsloth.ai/basics/troubleshooting-and-faqs#how-...

But I agree I should remove apt-gets - will do this asap! Thanks for the suggestions :)

Imustaskforhelp 4 days ago | parent | next [-]

Hey man, I was seeing your comments and you do seem to respond to everyone nicely regarding these sudo shenanigans.

I think that you have removed sudo, so this is nice; my suggestion is pretty similar to that of pxc (basically detect the distro and act accordingly)

I wonder if we will ever get a working universal package manager on Linux. To me flatpak genuinely makes the most sense, even sometimes for CLIs, but flatpak isn't built for CLIs - unlike snap, which supports both CLI and GUI, but snap is proprietary.

danielhanchen 4 days ago | parent [-]

Hey :) I love suggestions and keep them coming! :)

I agree on handling different distros - sadly I'm not familiar with others, so any help would be appreciated! For now I'm most familiar with apt-get, but would 100% want to expand out!

Interesting will check flatpak out!

Imustaskforhelp 3 days ago | parent [-]

Just to let you know though, it's really rare for flatpak to be used for CLIs. I think I mentioned it in my comment too - or if not, my apologies - but flatpak is mostly used for GUIs.

I doubt its efficacy here. It might be more useful if you provide a whole Jupyter / browser GUI, but a lot of us run it just from the CLI, so I doubt flatpak is the right fit.

I didn't mean to say that flatpak was the right tool for this job - I seriously don't know enough to comment, so I'd prefer if you could ask someone with real experience about it.

My reasoning for flatpak was chunking support (which I think is rare in AppImage) and easier GPU integration (I think) compared to docker, though my reasoning might be flawed since flatpak mostly isn't used for CLIs.

danielhanchen 3 days ago | parent [-]

Oh no worries on flatpak - I always like investigating things I'm not familiar with :)

exe34 3 days ago | parent | prev [-]

Have you considered Cosmopolitan? E.g. like llamafile, which works on everything up to and including toasters.

danielhanchen 3 days ago | parent [-]

Oh llamafile is very cool! I might add it as an option actually :) For generic exports (ie to vLLM, llamafile etc), normally finetunes end with model.save_pretrained_merged and that auto merges to 16bit safetensors which allows for further processing downstream - but I'll investigate llamafile more! (good timing since llamafile is cross platform!)
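ie something like this (a sketch - the model name is a placeholder and the exact API may vary by version):

  from unsloth import FastLanguageModel

  model, tokenizer = FastLanguageModel.from_pretrained("unsloth/llama-3-8b")
  # ... finetune with LoRA ...
  # Merge the adapters into 16-bit safetensors for downstream tools
  # (llama.cpp, vLLM, llamafile conversion, etc.)
  model.save_pretrained_merged("merged_model", tokenizer,
                               save_method="merged_16bit")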

pxx 4 days ago | parent | prev [-]

> Was this whole thing hallucinated by an LLM?

probably not, because LLMs are a little more competent than this