mythz 4 hours ago

I consider HuggingFace more "Open AI" than OpenAI - one of the few quiet heroes (along with Chinese OSS) helping bring on-premise AI to the masses.

I'm old enough to remember when traffic was expensive, so I've no idea how they've managed to offer free hosting for so many models. Hopefully it's backed by a sustainable business model, as the ecosystem would be meaningfully worse without them.

We still need good value hardware to run Kimi/GLM in-house, but at least we've got the weights and distribution sorted.

data-ottawa 3 hours ago | parent | next [-]

Can we toss in the work unsloth does too as an unsung hero?

They provide excellent documentation and they’re often very quick to get high quality quants up in major formats. They’re a very trustworthy brand.

disiplus 3 hours ago | parent | next [-]

Yeah, they're the good guys. I suspect the open source work is mostly advertisements for them to sell consulting and services to enterprises. Otherwise, the work they do doesn't make sense to offer for free.

arcanemachiner 11 minutes ago | parent [-]

I hope that is exactly what is happening. It benefits them, and it benefits us.

cubie 3 hours ago | parent | prev | next [-]

I'm a big fan of their work as well, good shout.

mirekrusin an hour ago | parent | prev [-]

Yes, they should get a Nobel for their work; it's from a different planet.

Tepix 2 hours ago | parent | prev | next [-]

It's insane how much traffic HF must be pushing out of the door. I routinely download models that are hundreds of gigabytes in size from them. A fantastic service to the sovereign AI community.

vardalab 22 minutes ago | parent [-]

Yup, I have downloaded probably a terabyte in the last week, especially with the Step 3.5 model being released and Minimax quants. I wonder what my ISP thinks. I hope they don't cut me off. They gave me a fast lane, they better let me use it, lol

zozbot234 3 hours ago | parent | prev | next [-]

> We still need good value hardware to run Kimi/GLM in-house

If you stream weights in from SSD storage and freely use swap to extend your KV cache it will be really slow (multiple seconds per token!) but run on basically anything. And that's still really good for stuff that can be computed overnight, perhaps even by batching many requests simultaneously. It gets progressively better as you add more compute, of course.
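A rough back-of-envelope sketch of where "multiple seconds per token" comes from, assuming every active weight has to be read from the SSD once per generated token and nothing is cached in RAM. The parameter count, quantization width, and SSD bandwidth below are hypothetical placeholders, not measurements of Kimi or GLM.

```python
def seconds_per_token(active_params: float,
                      bytes_per_param: float,
                      ssd_bytes_per_s: float) -> float:
    """Lower bound on time per token when all active weights
    are streamed from storage once per token (no RAM caching)."""
    bytes_read = active_params * bytes_per_param
    return bytes_read / ssd_bytes_per_s


# Hypothetical numbers: 32e9 active parameters, 4-bit quant (0.5 bytes each),
# and an NVMe drive sustaining 5 GB/s sequential reads.
est = seconds_per_token(active_params=32e9,
                        bytes_per_param=0.5,
                        ssd_bytes_per_s=5e9)
print(f"{est:.1f} s/token")
```

This ignores compute time and KV cache swapping, so real numbers would likely be worse. Batching helps only to the extent that requests in the same batch reuse the same weights while they are in memory.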

HPsquared 3 hours ago | parent [-]

At a certain point the energy starts to cost more than renting some GPUs.
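The crossover can be estimated with simple arithmetic. This is only a sketch: the wattage, electricity price, rental price, and throughput figures are all made-up assumptions to show the shape of the calculation.

```python
def local_energy_cost_per_mtok(watts: float,
                               seconds_per_token: float,
                               usd_per_kwh: float) -> float:
    """Electricity cost (USD) to generate one million tokens locally."""
    joules_per_token = watts * seconds_per_token
    kwh_per_mtok = joules_per_token * 1e6 / 3.6e6  # 1 kWh = 3.6e6 J
    return kwh_per_mtok * usd_per_kwh


def rented_cost_per_mtok(usd_per_hour: float,
                         tokens_per_second: float) -> float:
    """Rental cost (USD) to generate one million tokens on a hosted GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_hour / tokens_per_hour * 1e6


# Hypothetical: a 150 W box at 3 s/token with power at $0.15/kWh,
# versus a $2/hour rented GPU producing 50 tokens/s.
local = local_energy_cost_per_mtok(watts=150, seconds_per_token=3.0, usd_per_kwh=0.15)
rented = rented_cost_per_mtok(usd_per_hour=2.0, tokens_per_second=50)
print(f"local: ${local:.2f}/Mtok, rented: ${rented:.2f}/Mtok")
```

With these invented inputs the slow local box costs more per token in electricity alone, but the result flips easily with cheaper power or faster local hardware.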

vardalab 21 minutes ago | parent [-]

Yeah, that is hard to argue with because I just go to OpenRouter and play around with a lot of models before I decide which ones I like. But there's something special about running it locally in your basement.

sowbug 3 hours ago | parent | prev | next [-]

Why doesn't HF support BitTorrent? I know about hf-torrent and hf_transfer, but those aren't nearly as accessible as a link in the web UI.

embedding-shape 2 hours ago | parent [-]

> Why doesn't HF support BitTorrent?

Harder to track downloads then. Only when clients hit the tracker would they be able to get download stats, and forget about private repositories or the "gated" ones that Meta/Facebook does for their "open" models.

Still, if vanity metrics weren't so important, it'd be a great option. I've even thought of creating my own torrent mirror of HF to provide as a public service, as eventually access to models will be restricted, and it would be nice to be prepared for that moment a bit better.

homarp 10 minutes ago | parent | next [-]

how are all the private trackers tracking ratios?

sowbug 2 hours ago | parent | prev | next [-]

I thought of the tracking and gate questions, too, when I vibed up an HF torrent service a few nights ago. (Super annoying BTW to have to download the files just to hash the parts, especially when webseeds exist.) Model owners could disable or gate torrents the same way they gate the models, and HF could still measure traffic by .torrent downloads and magnet clicks.

It's a bit like any legalization question -- the black market exists anyway, so a regulatory framework could bring at least some of it into the sunlight.
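For context on why the files have to be downloaded before a torrent can be made: a BitTorrent v1 .torrent stores a SHA-1 digest for every fixed-size piece of the content, so the creator needs the bytes themselves. A minimal sketch of that step follows; the function name and the 1 MiB default piece length are illustrative choices, not anything HF or the service mentioned above uses.

```python
import hashlib
from typing import BinaryIO


def piece_hashes(stream: BinaryIO, piece_length: int = 2**20) -> bytes:
    """Return the concatenated 20-byte SHA-1 digests of each piece,
    as stored in the `pieces` field of a v1 torrent's info dict.

    Assumes `stream.read(n)` returns n bytes unless EOF is reached,
    which holds for regular buffered files and io.BytesIO.
    """
    digests = []
    while True:
        piece = stream.read(piece_length)
        if not piece:
            break
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)


# Usage with a local file (the path is a placeholder):
# with open("model.safetensors", "rb") as f:
#     pieces = piece_hashes(f)
```

Hashing is sequential over the whole file, which is why a webseed alone does not spare the torrent creator the download.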

embedding-shape 2 hours ago | parent [-]

> Model owners could disable or gate torrents the same way they gate the models, and HF could still measure traffic by .torrent downloads and magnet clicks.

But that'll only stop a small part, anyone could share the infohash and if you're using the dht/magnet without .torrent files or clicks on a website, no one can count those downloads unless they too scrape the dht for peers who are reporting they've completed the download.
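To illustrate why the infohash is enough to share a torrent: in BitTorrent v1 it is just the SHA-1 of the bencoded `info` dictionary, and a magnet link is little more than that hash in a URI. The sketch below uses a hand-rolled bencoder and a made-up `info` dict; the file name and sizes are hypothetical.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists, and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def magnet_link(info: dict) -> str:
    """Build a bare v1 magnet link from a torrent's info dict."""
    infohash = hashlib.sha1(bencode(info)).hexdigest()
    return f"magnet:?xt=urn:btih:{infohash}"


# Hypothetical single-file info dict; `pieces` would normally hold
# the concatenated SHA-1 digests of every piece.
info = {
    "name": "model.gguf",
    "length": 3,
    "piece length": 2**20,
    "pieces": hashlib.sha1(b"abc").digest(),
}
link = magnet_link(info)
```

Anyone holding that string can find peers through the DHT without ever touching the site that published the original .torrent.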

sowbug 2 hours ago | parent [-]

Right, but that's already happening today. That's the black-market point.

taminka 18 minutes ago | parent | prev [-]

most of the traffic is probably from open weights, just seed those, host private ones as is

Fin_Code 2 hours ago | parent | prev [-]

I still don't know why they are not running on torrent. It's the perfect use case.

heliumtera 2 hours ago | parent | next [-]

How can you be the man in the middle in a truly P2P environment?

freedomben 2 hours ago | parent | prev [-]

That would shut out most people working for big corp, which is probably a huge percentage of the user base. It's dumb, but that's just the way corp IT is (no torrenting allowed).

zozbot234 2 hours ago | parent [-]

It's a sensible option, even when not everyone can really use it. Linux distros are routinely transferred via torrent, so why not other massive, open-licensed data?

freedomben 2 hours ago | parent [-]

Oh as an option, yeah I agree it makes a ton of sense. I just would expect a very, very small percentage of people to use the torrent over the direct download. With Linux distros, the vast majority of downloads still come from standard web servers. When I download distro images I opt for torrents, but very few people do the same.

zrm an hour ago | parent [-]

With Linux distros they typically put the web link right on the main page and have a torrent available if you go look for it, because they want you to try their distro more than they want to save some bandwidth.

Suppose HF did the opposite because the bandwidth saved is more and they're not as concerned you might download a different model from someone else.