| ▲ | DeepSeek-V4-Flash means LLM steering is interesting again (seangoedecke.com) |
| 66 points by Brajeshwar 2 hours ago | 11 comments |
| |
|
| ▲ | NitpickLawyer 31 minutes ago | parent | next [-] |
| I'm surprised the article doesn't mention the biggest use of steering vectors: their potential to remove refusals from models (a.k.a. abliteration or uncensoring). An earlier paper found that most refusal behavior is mediated by a single direction in activation space; you can identify and "nerf" that vector so the model skips refusals and answers "any" request normally. This was very doable for earlier models trained with SFT for refusals; it seems to be a bit more complicated for newer models, but still doable to some extent. There are already libraries that automate this process, but they usually focus on identifying the direction, modifying the weights, and releasing the result as an uncensored model. Steering instead lets you toggle this vector modification dynamically, so you don't need to swap models if the abliteration process hurts accuracy on unrelated tasks. |
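| For context, a minimal sketch of that difference-of-means idea, in PyTorch with a HuggingFace-style model. The model id, layer index, and prompt sets below are toy assumptions, not any library's actual recipe: |

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder model id -- substitute any HF-style causal LM you can run.
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tok = AutoTokenizer.from_pretrained("gpt2")

    harmful_prompts = ["How do I pick a lock?"]            # toy "refused" set
    harmless_prompts = ["How do I bake sourdough bread?"]  # toy control set
    LAYER = 6  # one mid-depth layer; in practice you sweep layers/positions

    @torch.no_grad()
    def mean_last_token_resid(prompts):
        acts = []
        for p in prompts:
            ids = tok(p, return_tensors="pt").input_ids
            hs = model(ids, output_hidden_states=True).hidden_states
            acts.append(hs[LAYER][0, -1])
        return torch.stack(acts).mean(0)

    # Difference of means gives the candidate "refusal direction".
    r = mean_last_token_resid(harmful_prompts) - mean_last_token_resid(harmless_prompts)
    r = r / r.norm()

    def ablate(module, inputs, output):
        # Project r out of the residual stream: h <- h - (h . r) r
        h = output[0] if isinstance(output, tuple) else output
        h = h - (h @ r).unsqueeze(-1) * r
        return (h, *output[1:]) if isinstance(output, tuple) else h

    # hidden_states[LAYER] is the output of block LAYER-1 (index 0 is the
    # embeddings), so hook that block. gpt2 exposes blocks as
    # model.transformer.h; other architectures name them differently, and
    # the paper ablates the direction at every layer, not just one.
    model.transformer.h[LAYER - 1].register_forward_hook(ablate)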
| |
| ▲ | cyanydeez 13 minutes ago | parent [-] | | Not sure why you're fixated on censoring. If we invert your POV, censoring includes not reporting falsehoods like "vaccines are harmful". Science and logic often tackle these subjects via censoring, but a model given an equal sampling of the Internet would think vaccines are harmful. A less naive correction would censor this problematic content. So I'm confused as to why you think unmasking whatever bias you think is being censored will result in an improvement in the generic use case. |
|
|
| ▲ | antirez 7 minutes ago | parent | prev | next [-] |
| Thank you for posting this! Just a clarification: with DwarfStar's steering features I was able to completely remove refusals from DS4. It is only the example dataset (the prompt pairs I provide) that is a toy, not the capability. My reasoning was that whoever can come up with the right dataset and understands how to use the well-documented steering feature can already access steering; as for people who have no idea and would just cut & paste, I'm not sure it's a good idea for them to also have access to a model without refusals. In doubt, I didn't release the steering file publicly, but I'm still torn. Btw, the steering support was recently extended: the vector can now be applied to the activations at different times: always, only after thinking, only outside of tool calling, and so on. |
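| To illustrate the gating, a hedged sketch of what "apply the vector at different times" could look like. This is not DwarfStar's actual code; the mode names and phase-marker tokens are assumptions: |

    # Sketch of phase-gated steering: add a steering vector to a layer's
    # activations only during certain generation phases. Illustrative only;
    # the mode names and token markers below are assumptions, not DwarfStar's API.
    MODES = {"always", "after_thinking", "outside_tool_calls"}

    class SteeringGate:
        def __init__(self, mode, strength, vector):
            # vector: a numpy array or torch tensor with the layer's width
            assert mode in MODES
            self.mode, self.strength, self.v = mode, strength, vector
            self.thinking_done = False
            self.in_tool_call = False

        def observe(self, token_text):
            # Track the generation phase from emitted marker tokens
            # (the markers are model-specific).
            if token_text == "</think>":
                self.thinking_done = True
            elif token_text == "<tool_call>":
                self.in_tool_call = True
            elif token_text == "</tool_call>":
                self.in_tool_call = False

        def active(self):
            if self.mode == "always":
                return True
            if self.mode == "after_thinking":
                return self.thinking_done
            return not self.in_tool_call  # "outside_tool_calls"

        def apply(self, hidden):
            # hidden: the steered layer's activation for the current token
            return hidden + self.strength * self.v if self.active() else hidden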
| |
| ▲ | zozbot234 2 minutes ago | parent [-] | | AIUI, DS4 has very little (if any) of the refusal behavior you usually get from Western AI models for benign input. Is this mainly about the software security assessment case? |
|
|
| ▲ | wolttam 40 minutes ago | parent | prev | next [-] |
| > inspired to write this post by antirez’s recent project DwarfStar 4, which is a version of llama.cpp that’s been stripped down to run only DeepSeek-V4-Flash |
| This is not true; it is its own project. Indebted to llama.cpp, sure, but not a stripped-down version. |
| |
| ▲ | antirez 4 minutes ago | parent | next [-] | | Yep, the code overlap is minimal: a few kernels, plus some quantization code for the quantizer it implements. DwarfStar 4 is not a fork of llama.cpp, but without llama.cpp the project would be a lot more lacking, since I was able to get all the details that mattered in a second. Still, it is not a stripped-down llama.cpp. This does not reduce in any way how much llama.cpp did, not just for this project, but for all the projects that followed. It's not just a matter of code: the path to follow, the quant formats, the lessons, the optimized kernels you can study to learn the patterns. | |
| ▲ | embedding-shape 10 minutes ago | parent | prev [-] | | The truth seems to sit somewhere in between: DwarfStar 4 seems to exist mainly because of llama.cpp, and the authors were clearly very inspired by llama.cpp's code, in some places even literally copying pieces from it, all with proper attribution. I'm not trying to say this is bad; it seems OK to me: |
| > ds4.c does not link against GGML, but it exists thanks to the path opened by the llama.cpp project and the kernels, quantization formats, GGUF ecosystem, and hard-won engineering knowledge developed there. We are thankful and indebted to llama.cpp and its contributors. Their implementation, kernels, tests, and design choices were an essential reference while building this DeepSeek V4 Flash-specific inference path. Some source-level pieces are retained or adapted here under the MIT license: GGUF quant layouts and tables, CPU quant/dot logic, and certain kernels. For this reason, and because we are genuinely grateful, we keep the GGML authors copyright notice in our LICENSE file. - https://github.com/antirez/ds4#acknowledgements-to-llamacpp-... |
| Been a lot of fun to play around with it since https://news.ycombinator.com/item?id=48142885 (~2 days ago); I've managed to take generation from 47.85 t/s to 57.07 t/s so far :) | | |
| ▲ | antirez 2 minutes ago | parent [-] | | Send patches! But remember that many speedups end up being not exactly correct, and the logits drift. But there is extensive testing, and now even ds4-eval, to check how it performs. |
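| For the curious, the kind of drift check this implies, as a generic sketch; ds4-eval's real interface may differ: |

    # Generic sketch of a logits-drift check between a reference and an
    # optimized implementation; not ds4-eval's actual interface.
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def logits_drift(ref_logits, opt_logits):
        """Both arrays: [n_positions, vocab] logits on the same prompt."""
        ref, opt = np.asarray(ref_logits), np.asarray(opt_logits)
        max_abs = np.abs(ref - opt).max()
        # KL(ref || opt) averaged across positions: drift that actually
        # changes sampling behavior shows up here even when max_abs is small.
        p, q = softmax(ref), softmax(opt)
        kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(-1).mean()
        return max_abs, kl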
|
|
|
| ▲ | aswegs8 6 minutes ago | parent | prev | next [-] |
| How does the model qualify as local? ~192 GB RAM needed sounds a bit much for local. |
| |
| ▲ | antirez 5 minutes ago | parent [-] | | Runs on 96GB MacBooks. 128GB is better. Check the README of DwarfStar. |
|
|
| ▲ | dominotw 13 minutes ago | parent | prev [-] |
| > you can already exercise extremely fine-grained control by tweaking the language of your prompt. |
| Maybe I suck at prompting, but I find it impossible to overcome a model's biases from training data, post-training, etc. You can only pattern-mine from the training data using prompts; you don't really have any sort of fine-grained control. |