Garlef 2 days ago

Have you considered that this might not be due to the model itself, but to less focus/time/money being spent on alignment during training?

My guess is that this is a bit of a throwaway experiment before they actually spend millions on training a larger model based on the technology.

findingMeaning 2 days ago

Yeah, it could be. One thing is for sure: it's really impressive in terms of speed, and that means we can do so many cool things with it!

Even if there is no improvement in terms of quality, the speed alone will make it usable for a lot of downstream tasks.

It feels like a ChatGPT 3.5 moment to me.