unrvl22 8 hours ago

The municipality of Rio de Janeiro (via its IT company IplanRIO) released Rio-3.5-Open-397B, presented as a homegrown Qwen3.5 fine-tune that beats comparable open models on benchmarks. The linked issue argues it's actually a weighted merge of ~60% Nex-N2 Pro + ~40% Qwen3.5-397B-A17B, with Nex-N2 having been released about a week earlier.

DonsDiscountGas 6 hours ago | parent | next [-]

I didn't know model merging like that was possible. (Obviously it's possible from a pure software standpoint, but I'm surprised it's effective.)

bwhitty 5 hours ago | parent [-]

As another poster above linked, it’s been shown to be effective since 2022: https://arxiv.org/abs/2203.05482

nightpool 2 hours ago | parent [-]

It works because Nex-N2 is also a derivative of the original base Qwen model. If they were two completely unrelated models, it wouldn't work.
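
For readers unfamiliar with the idea, here is a minimal sketch of a weighted merge, assuming both checkpoints share the same parameter names and shapes (which is what a common base model gives you). The function name `merge_state_dicts`, the toy parameter names, and the plain-list "tensors" are all hypothetical; this illustrates linear interpolation of weights in general, not the procedure anyone in this story actually ran.

```python
def merge_state_dicts(a, b, weight_a):
    """Linearly interpolate two checkpoints: weight_a * a + (1 - weight_a) * b.

    `a` and `b` map parameter names to flat lists of floats (stand-ins for tensors).
    """
    if a.keys() != b.keys():
        # Unrelated architectures fail here: there is nothing to line up.
        raise ValueError("checkpoints must share the same parameter names")
    merged = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            weight_a * x + (1.0 - weight_a) * y
            for x, y in zip(a[name], b[name])
        ]
    return merged


# Toy stand-ins for two fine-tunes of the same base model (values are made up).
finetune_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
finetune_b = {"layer.weight": [0.0, 4.0], "layer.bias": [1.0]}

# A 60/40 blend, mirroring the ratio alleged in the linked issue.
merged = merge_state_dicts(finetune_a, finetune_b, 0.6)
```

The averaging only makes sense element by element, which is why the two parents need to descend from the same base: otherwise the same position in two weight matrices has no shared meaning.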

Lucasoato 6 hours ago | parent | prev | next [-]

So the problem isn’t the missing attribution to Qwen, but the fact that they didn’t mention Nex-N2 Pro, right?

Aurornis 6 hours ago | parent [-]

The problem is that they claimed to have made a big achievement with their homegrown post-training, and they expected to receive a lot of praise for it.

Then researchers looked at the weights and found no post-training at all.

They are now crediting both models they merged, but their excuse for the lack of post-training is that they accidentally uploaded the wrong files.
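
As a rough illustration of how looking at the weights can expose a merge, here is a sketch that fits a single mixing coefficient by least squares, assuming the suspect checkpoint and both candidate parents share parameter names and shapes. The function `estimate_mix_weight` and the toy values are hypothetical, and this is not necessarily the analysis used in the linked issue. If the suspect really is `alpha * a + (1 - alpha) * b`, the fit recovers `alpha` and leaves a residual near zero; genuine additional training would leave a residual that the merge cannot explain.

```python
def estimate_mix_weight(suspect, a, b):
    """Fit suspect ≈ alpha * a + (1 - alpha) * b and return (alpha, residual).

    All arguments map parameter names to flat lists of floats.
    """
    num = 0.0
    den = 0.0
    for name in suspect:
        for m, x, y in zip(suspect[name], a[name], b[name]):
            d = x - y              # direction from parent b to parent a
            num += (m - y) * d     # projection of (suspect - b) onto that direction
            den += d * d
    if den == 0.0:
        raise ValueError("the two candidate parents are identical")
    alpha = num / den

    # Sum of squared differences the best-fit merge fails to explain.
    residual = 0.0
    for name in suspect:
        for m, x, y in zip(suspect[name], a[name], b[name]):
            residual += (m - (alpha * x + (1.0 - alpha) * y)) ** 2
    return alpha, residual


# Toy data (made up): the suspect is exactly 0.6 * parent_a + 0.4 * parent_b.
parent_a = {"w": [1.0, 2.0, -1.0]}
parent_b = {"w": [0.0, 4.0, 3.0]}
suspect = {"w": [0.6, 2.8, 0.6]}

alpha, residual = estimate_mix_weight(suspect, parent_a, parent_b)
```

On real checkpoints one would run this per tensor rather than globally, since a merge can use different ratios for different layers.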

serial_dev 4 hours ago | parent [-]

I’d believe they accidentally uploaded the wrong files if they had uploaded the correct ones. Saying they accidentally uploaded something else and then not uploading the correct version means they probably don't have anything, and either hope people forget about this or are scrambling to produce something at least close to their original claim.
