tgma 4 hours ago

I installed this so you don't have to. It feels a bit quirky and not super polished: it fails to download the image model, and the audio/TTS model fails to load.

In 15 minutes of serving Gemma, I got precisely zero actual inference requests, and a bunch of health checks and two attestations.

At the moment they don't have enough sustained demand to justify their earnings estimates.

splittydev 3 hours ago | parent | next [-]

They released this like a day ago; I'm not surprised there isn't much demand right now. Give it some time to take off.

tgma 3 hours ago | parent [-]

You'd think that to bootstrap a marketplace they'd spend their own money to feed in synthetic requests (or perhaps offer free chat to induce real ones).

Still, absolute zero is an unacceptable number. I had this running for more than an hour.

splittydev 3 hours ago | parent [-]

I kind of see your point, but I also kind of don't.

Sure, it would be great if you'd immediately get hammered with hundreds of requests and start making money quickly. It would also be great if it were a bit more transparent and showed more stats (what counts as "idle"? Is my machine currently eligible to serve models?). But it's still very new; I'd say give it some time and let's see how it goes.

If you have it running and you get zero requests, it uses close to zero power above what your computer uses anyway. It doesn't cost you anything to have it running, and if you get requests, you make money. Seems like an easy decision to me.

usrusr 7 minutes ago | parent | next [-]

Bootstrapping will be near-impossible (or incredibly costly) unless they offer inference consumers models with established demand, routed through some least-cost router service where they can undercut the competition (if they actually can). And then dogfood the opportunistic provider side on their own Macs, but with a preference for putting third parties first in the queue. Everything else is just wishful thinking.

tgma 3 hours ago | parent | prev [-]

Well, I already made the Ctrl+C decision. Yours may have been different, but I suppose only one of us installed it, and that one counts.

yard2010 2 hours ago | parent | next [-]

I went with the Ctrl+Z approach.

subroutine 2 hours ago | parent | prev [-]

Copy?

oneeyedpigeon 2 hours ago | parent [-]

SIGINT
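For reference, the two approaches above map to different terminal signals: Ctrl+C makes the terminal driver deliver SIGINT (interrupt/terminate), while Ctrl+Z delivers SIGTSTP (suspend). A minimal POSIX-only Python sketch of the SIGINT path, simulating the keypress by signaling the current process:

```python
import os
import signal

caught = []

def handler(signum, frame):
    # Record the signal instead of letting the default handler
    # terminate the process (the default Ctrl+C behavior).
    caught.append(signum)

# Install our handler for SIGINT, the signal Ctrl+C generates.
signal.signal(signal.SIGINT, handler)

# Simulate pressing Ctrl+C by sending SIGINT to ourselves.
os.kill(os.getpid(), signal.SIGINT)

print(caught == [signal.SIGINT])  # True
```

SIGTSTP can be caught the same way, but by default it suspends the process until a SIGCONT resumes it, which is why `fg` brings a Ctrl+Z'd job back in the shell.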

lxglv 3 hours ago | parent | prev | next [-]

Weird to learn that they don't themselves generate inference requests to their network, at least to motivate early adopters to host their inference software.

thatxliner 4 hours ago | parent | prev | next [-]

And I don't think they ever will unless they're highly competitive (hopefully the pricing they have stays, at least for users).

I was thinking of building this exact thing a year ago, but my main blocker was the economics: it would never make sense for anyone to use the API, and nobody can make money off zero demand.

I guess we just have to look at how Uber and Airbnb bootstrapped themselves. Another issue with my original idea was that it was for compute in general, when the main (and best) use case is longer-running workloads like AI training (though I guess inference is long-running enough).

But there's already software out there that lets you rent out your GPU, so...

tgma 3 hours ago | parent | next [-]

People underestimate how efficient cost/token is for beefy GPUs if you are able to batch. It's unlikely a one-off consumer unit will be able to compete long term.
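The batching argument is just arithmetic: a datacenter GPU costs more per hour, but serving many requests concurrently multiplies its token throughput, so the cost per token can still come out lower than a single-stream consumer machine. A back-of-the-envelope sketch with entirely made-up numbers (hourly costs, throughputs, and batch size are hypothetical, purely for illustration):

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_sec: float) -> float:
    """Dollars spent to produce one million output tokens."""
    tokens_per_hour = tokens_per_sec * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000

# Hypothetical datacenter GPU: pricey per hour, but batching 32
# concurrent requests multiplies effective throughput.
datacenter = cost_per_million_tokens(hourly_cost_usd=2.50,
                                     tokens_per_sec=50 * 32)

# Hypothetical consumer box: nearly free to run, but serves one
# request stream at a time.
consumer = cost_per_million_tokens(hourly_cost_usd=0.30,
                                   tokens_per_sec=30)

print(datacenter < consumer)  # True: batching wins despite pricier hardware
```

With these illustrative numbers the batched GPU lands well under a dollar per million tokens while the single-stream machine is several times that, which is the intuition behind the comment.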

starkeeper 3 hours ago | parent | prev [-]

What's a good place to do this?

subroutine 2 hours ago | parent | prev [-]

Has anyone tested the system from the other end... sending a prompt and getting a response?