Remix.run Logo
ngxson an hour ago

I do use both DSv4 the "normal" and the flash variant, non-locally. It works well, not exceptionally. And while it's cheap, I'd say that the difference between $1 per month vs $5 per month is not a big concern to me. IMO pricing is pretty competitive among open-weight models: https://huggingface.co/inference/models

Depending on use cases, but for me I found 2 use cases where a local model is a must and not optional:

- Running offline without internet access: for example, I have this project that allow transcribe and summarize audio in real time. I already used it in some events where wifi is not available: https://github.com/ngxson/llama.cpp-realtime-audio-recap

- Handle private personal data, for example health records. This is the same category of "privacy" that you mentioned, but I just want to bring up the fact that people value their privacy differently.