nmfisher 20 hours ago
antirez is running a quantized DeepSeek V4 Pro on a Mac Studio M3 Ultra with 512GB of RAM: https://bsky.app/profile/antirez.bsky.social/post/3mlzwmvlov... It's much closer than you think. Within the next 24 months we're going to see specialized hardware capable of running 2025-era frontier models. That's big.
menaerus 34 minutes ago
2-bit quantization? That removes a lot of signal. Considering how quickly AI models are progressing in capability (still an exponential curve), I won't want to use a 2025 model in two years' time, just as I don't want to use Llama 3 or an old Anthropic model from 2023 or 2024 today. Newer models are so much better that the gap is very difficult to ignore. Only if and when progress in AI models slows down will it, IMHO, become feasible to design specialized HW for general-purpose consumption and general-purpose workloads.
treis 18 hours ago
It's big because it may take a big swath of the people who will actually pay for LLMs out of the market. But average consumers are going to primarily use their phone or tablet, and we're far away from running these models there. Even if it were possible, LLMs are such a gold mine of user data that it's really hard to see that opportunity being passed up.
dist-epoch 19 hours ago
That specialized hardware will be scooped up by AI data centers, just like RAM is today.