khalic 13 hours ago
Another example of the mindf@#$ these systems are: I was fine-tuning a small model to take data fields and make a sentence out of them. I was running into mode collapse (basically when the model simplifies too much and always outputs the same thing). I got unstuck by randomizing the field order for each row?!? That was at training time, and now I'm thinking I should do the same at inference time...
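For anyone who wants to try the field-order trick, here is a minimal sketch. It assumes each row is a dict of field names to values and that the prompt is a plain `key: value` list; `build_prompt` and the sample row are hypothetical, not the actual code from the comment above.

```python
import random

def build_prompt(row, rng=None):
    """Serialize a row's fields in a random order (hypothetical helper)."""
    rng = rng or random
    items = list(row.items())
    rng.shuffle(items)  # a fresh field order on every call
    return "\n".join(f"{key}: {value}" for key, value in items)

# Made-up example row; real data would come from the training set.
row = {"name": "Ada", "city": "London", "role": "engineer"}

rng = random.Random(0)  # seeded only so the demo is repeatable
prompts = [build_prompt(row, rng) for _ in range(5)]
for p in prompts:
    print(p, end="\n\n")
```

The same helper could be called at inference time if you want the prompt distribution to match training, though whether that helps is something to measure rather than assume.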
p_stuart82 12 hours ago
the irony of modern software engineering: we spent decades perfecting deterministic algorithms, and now we're basically just shaking a black box and hoping the magic rocks align.
auspiv 10 hours ago
apparently you can straight up duplicate/add/rearrange layers without changing any of the weights and get better results as well - https://dnhkng.github.io/posts/rys/
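A toy sketch of what "duplicate layers without changing the weights" can mean mechanically. It assumes a model is just an ordered list of layer callables; `repeat_block` and `forward` are hypothetical names, the lambdas stand in for real transformer blocks, and the indices are arbitrary rather than taken from the linked post.

```python
def repeat_block(layers, start, end):
    """Return a new stack in which layers[start:end] runs twice in a row.

    The same layer objects are reused, so no weights are copied or changed.
    """
    return layers[:end] + layers[start:end] + layers[end:]

def forward(layers, x):
    """Apply each layer in order."""
    for layer in layers:
        x = layer(x)
    return x

# Toy stand-ins for model layers.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
deeper = repeat_block(layers, 1, 2)  # run the middle layer twice

print(len(layers), len(deeper))
print(forward(layers, 5), forward(deeper, 5))
```

Because the repeated entries are the same objects, the deeper stack has more compute per forward pass but no new parameters.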
toddmorey 12 hours ago
wow that's fascinating