| ▲ | bryanhogan 2 hours ago |
| Claude.ai is now at 98.85% uptime. There have been so many frustrations with Claude / Anthropic lately (very heavy usage limits, A/B testing gone wrong, etc.). Claude status: https://status.claude.com/ I have been really happy with my Codex subscription lately, but it feels like these things change every other day. The OpenCode Go subscription for trying out GLM, Kimi, Qwen, DeepSeek and friends also looks useful. Nonetheless, Opus 4.6 is a very capable model, but justifying a Claude subscription gets more and more difficult; I think I might just use it occasionally through OpenRouter or as part of something like Cursor (although I'm not sure about the value of that subscription either). OpenCode Go: https://opencode.ai/go Cursor: https://cursor.com |
|
| ▲ | oefrha an hour ago | parent | next [-] |
| There were periods where I was entirely unable to use Claude Code for an hour or more because the auth gateway kept returning 500s or timing out. There was an "elevated errors" incident shown on status.claude.com, but zero minutes of downtime were recorded (not even a "partial outage"). So the real uptime is even worse than that. |
|
| ▲ | rubslopes 2 hours ago | parent | prev | next [-] |
| April has been a crazy month for open-weights models. I've been using Claude Code for work and Kimi 2.6 for personal projects, and Kimi has been very good. GLM-5.1 is also great. Qwen, Mimo and DeepSeek I need to test some more, but they have all been producing good results. I have the impression that they are all at, or close to, the same level as Sonnet 4.6. |
| |
| ▲ | bombcar an hour ago | parent | next [-] | | What are you running them on? | | |
| ▲ | wswope an hour ago | parent [-] | | Not OP, but having explored the field a good bit: OpenRouter + the pi harness in a devcontainer works great as a sane starting point. Highly recommended as a clean way to try out the upstart models. |
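For context, OpenRouter exposes an OpenAI-compatible endpoint, so comparing the open-weight models is mostly a matter of swapping the model string. A minimal sketch; the model ids below are illustrative, check openrouter.ai/models for current ones:

    # Hitting OpenRouter's OpenAI-compatible API to compare open-weight models.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="sk-or-...",  # your OpenRouter key
    )

    # Illustrative model ids; substitute whatever you want to compare.
    for model in ["moonshotai/kimi-k2", "z-ai/glm-4.5", "deepseek/deepseek-chat"]:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "One-line sanity check: what model are you?"}],
        )
        print(model, "->", resp.choices[0].message.content)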
| |
| ▲ | slopinthebag an hour ago | parent | prev [-] | | They are close to Opus, not Sonnet. | | |
| ▲ | 2ndorderthought an hour ago | parent | next [-] | | The little Qwen 3.6 is at Sonnet level. Kimi 2.6 is about Opus. The former can run on a single GPU in your gaming PC; the latter you can run way cheaper from a provider, or, if you are really wealthy and have lots of GPUs, run yourself. Not sure where DeepSeek 4 sits. | | |
| ▲ | vidarh 35 minutes ago | parent | next [-] | | Kimi 2.6 is nowhere near even Sonnet in overall robustness. It can get close when everything goes perfectly. I have about 1 KLOC of harness code written by Kimi to work around quirks in Kimi that aren't needed for any other model I've tested, such as infinite tool-call loops and other weirdness. You can do quite a bit with it and never run into those quirks, or you might hit them on every request. It is very sensitive to "confusing" things about its environment in a way Sonnet and Opus are not. Still great value, but they have some way to go. |
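To illustrate the kind of workaround being described (a hypothetical sketch, not vidarh's actual harness code): a guard that fingerprints recent tool calls and aborts when the model keeps issuing the same one.

    import hashlib
    from collections import deque

    class ToolCallLoopGuard:
        """Abort when a model repeats an identical tool call too many times."""

        def __init__(self, max_repeats: int = 3, window: int = 10):
            self.max_repeats = max_repeats
            self.recent = deque(maxlen=window)  # fingerprints of recent calls

        def check(self, tool_name: str, args_json: str) -> None:
            # Identical name + arguments produce the same fingerprint.
            fp = hashlib.sha256(f"{tool_name}:{args_json}".encode()).hexdigest()
            self.recent.append(fp)
            if list(self.recent).count(fp) >= self.max_repeats:
                raise RuntimeError(f"tool-call loop: {tool_name} repeated {self.max_repeats}x")

Call check(name, args) before dispatching each tool call; the bounded window keeps long-past calls from tripping it.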
| ▲ | ryandrake an hour ago | parent | prev | next [-] | | Would "lots of GPUs" even help for huge models? Maybe this is exposing my lack of knowledge, but don't you need to keep the whole model and context in a single GPU's VRAM? My understanding is that multiple GPUs help with scaling (handling N times as many inference requests simultaneously) but don't help with running larger models. If they did, I could just jam another GPU in my box and double the size of model I can serve. | | |
| ▲ | Kirby64 an hour ago | parent | next [-] | | > Would "lots of GPUs" even help for huge models? Maybe this is exposing my lack of knowledge, but don't you need to keep the whole model and context in a single GPU's VRAM? How do you think the large providers do inference? No single GPU has 1 TB+ of memory on board. It's a cluster of a bunch of GPUs. |
| ▲ | 2ndorderthought an hour ago | parent | prev [-] | | 1T-parameter model instances (Opus, GPT, etc.) are not running on a single GPU. The catch is how the cards communicate and how the model is broken up. There's a bit that goes into it, but the answer is yes: the more GPUs, the bigger the model you can run. | | |
| ▲ | ryandrake 36 minutes ago | parent [-] | | Really cool. I'm very much still learning about this stuff. Sounds like this inter-GPU communication is a feature of special hardware (not consumer GPUs). | | |
| ▲ | 2ndorderthought 32 minutes ago | parent [-] | | Not really; there are various ways it can be done, and I think even the old 1080 Tis could do it. Keep reading about it. My interest is in small models on a single GPU, though, so I don't fuss over those details. |
|
|
| |
| ▲ | Jabrov an hour ago | parent | prev | next [-] | | Yes, multiple GPUs absolutely help with inference, even for a single model instance. Some models are simply too big to fit on the largest available GPU. Check out tensor parallelism. |
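A minimal sketch of what that looks like in practice, assuming a multi-GPU node and vLLM (the model id is just an example): tensor_parallel_size shards each layer's weight matrices across the cards, so together they hold a model none of them could fit alone.

    # Multi-GPU inference via tensor parallelism with vLLM.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="Qwen/Qwen2.5-72B-Instruct",  # example model; too big for one 24 GB card
        tensor_parallel_size=4,             # shard the weights across 4 GPUs
    )
    outputs = llm.generate(
        ["Explain tensor parallelism in one sentence."],
        SamplingParams(max_tokens=64),
    )
    print(outputs[0].outputs[0].text)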
| ▲ | ffsm8 an hour ago | parent | prev [-] | | Please don't oversell them. E.g., Kimi K2.6 has a maximum context size of 270k; that's a quarter of Opus. The model is fine (I've switched to it entirely for a personal project), but it's not Opus. And no, you're not running them locally unless you're a millionaire: you still need hundreds of GB (500+) of VRAM, which is not at the level of consumer electronics. Sure, you can run the quantized models, but then you're at Haiku performance. | | |
| ▲ | 2ndorderthought an hour ago | parent [-] | | Qwen 3.6 runs on a single GPU. But I mostly agree with you, except that just because a model has a given context doesn't mean it's all available or entirely reliable. |
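Rough arithmetic behind the "hundreds of GB" claim, assuming a Kimi-class model of roughly 1T parameters (weights only; KV cache and runtime overhead come on top):

    # Back-of-envelope VRAM needed just to hold the weights.
    params = 1.0e12  # ~1T parameters, illustrative Kimi-class MoE

    for precision, bytes_per_param in [("fp16", 2), ("fp8", 1), ("int4", 0.5)]:
        gb = params * bytes_per_param / 1e9
        print(f"{precision}: ~{gb:,.0f} GB of weights")

    # fp16: ~2,000 GB; fp8: ~1,000 GB; int4: ~500 GB

Even aggressive 4-bit quantization (the "Haiku performance" tier mentioned above) still lands around 500 GB.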
|
| |
| ▲ | andai an hour ago | parent | prev [-] | | Based on benchies or experience? |
|
|
|
| ▲ | egeozcan 42 minutes ago | parent | prev | next [-] |
| Codex randomly stops working because of some silly cybersecurity detector. An insane number of false positives. Last time it happened, I was just having it write me a small tool to translate the text in my clipboard. What cybersecurity? The code wasn't even published, or remotely related to anything hacking-adjacent. I'm always letting AI write boring CRUD tools that I don't want to code myself. It's bordering on useless. |
| |
| ▲ | azuanrb 14 minutes ago | parent [-] | | It's probably their system prompt. Unlike Claude Code, they don't ban you for using a different harness with their subscription (for now). If you use pi, their "safety" is off. Works great for me. |
|
|
| ▲ | tappio an hour ago | parent | prev | next [-] |
| This past week I have used OpenCode Go with DeepSeek V4 Pro and Claude Code with Opus 4.7 side by side, and... they are both good. They are different, each with its good and bad sides, but they do get things done. OpenCode especially has been a very enjoyable experience. Thank you, Anthropic, for all the downtime; I would probably not have explored alternatives otherwise. I can vouch for the OpenCode Go sub! |
|
| ▲ | loloquwowndueo an hour ago | parent | prev [-] |
| > Claude.ai is now at a 98.85% uptime. So, at least better than GitHub, right? :) |