paxys 16 hours ago

The problem with all these "AI box" startups is that the product is too expensive for hobbyists, and companies that need to run workloads at scale can always build their own servers and racks and save on the markup (which is substantial). Unless someone can figure out how to get cheaper GPUs & RAM there is really no margin left to squeeze out.

nine_k 15 hours ago | parent | next [-]

Would a hedge fund that does not want to trust to a public AI cloud just buy chassis, mobos, GPUs, etc, and build an equivalent themselves? I suspect they value their time differently.

paxys 11 hours ago | parent [-]

Why do you think a hedge fund can't hire a couple of IT guys? Most of the larger ones have technical operations that would put big tech to shame.

ViscountPenguin 6 hours ago | parent | next [-]

Medium sized hedge funds are a good portion of the market, and only really want to hire just enough tech people to keep the quant pipelines running.

qubex 8 hours ago | parent | prev | next [-]

They’re kickstarting a TINY device that is pocketable and aimed at consumers. I’ve backed it (full disclosure).

jgrizou 3 hours ago | parent | next [-]

https://www.kickstarter.com/projects/tiinyai/tiiny-ai-pocket...

kkralev 15 hours ago | parent | prev [-]

I think the real gap isn't at the high end, though. There's a whole segment of people who just want to run a 7-8B model locally for personal use without dealing with cloud APIs or sending their data somewhere. You don't need 4 GPUs for that; a Jetson or even a mini PC with decent RAM handles it fine. The $12k+ market feels like it's chasing a different customer than the one who actually cares about offline/private AI.

wmf 14 hours ago | parent [-]

> just want to run a 7-8b model locally

This is already solved by running LM Studio on a normal computer.

zozbot234 14 hours ago | parent [-]

Ollama or llama.cpp are also common alternatives. But an 8B model isn't going to have much real-world knowledge or be highly reliable for agentic workloads, so it makes sense that people will want more than that.
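For anyone who hasn't tried it: running a small model with these tools really is a couple of commands. A rough sketch (the model tag and GGUF filename are illustrative, not specific to this thread):

```shell
# Ollama: pull an ~8B model and chat with it locally, no cloud API.
# "llama3.1:8b" is an example tag; substitute any model Ollama hosts.
ollama pull llama3.1:8b
ollama run llama3.1:8b "Explain what a GGUF file is in one sentence."

# llama.cpp equivalent: point the bundled CLI at a downloaded GGUF quant.
# The path is a placeholder for whatever quantized file you fetched.
./llama-cli -m ./models/llama-3.1-8b-instruct-q4_k_m.gguf -p "Hello"
```

Both run entirely offline once the weights are downloaded, which is the whole point of the "private AI" use case being discussed above.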

zach_vantio 12 hours ago | parent [-]

The compute density is insane. But giving a 70B model actual write access locally for agentic workloads is a massive liability; they still hallucinate too much. Raw compute without strict state control is basically a blast radius waiting to happen.