KittenInABox 8 hours ago
I think we should be more generous, given the fact that we cannot reproduce many findings in our own field (computer science) given the hardware to do so is limited to such few entities...
yjftsjthsd-h 7 hours ago
> given the fact that we cannot reproduce many findings in our own field (computer science)

Sounds like we should be less generous to CS findings, not more generous to other fields. (And yes, we should be skeptical of findings in CS as well.)

> given the hardware to do so is limited to such few entities...

That said - what research is happening in CS that needs specific hardware? The theoretical stuff can still happen on chalkboards, and interesting algorithmic or technical advances tend to propagate quickly precisely because someone will reproduce them.
hearsathought 7 hours ago
> I think we should be more generous

We really shouldn't.

> given the fact that we cannot reproduce many findings in our own field (computer science)

Computer science isn't a "science". Computer science is really a branch of mathematics. For example, when you study computation theory, you prove theorems (deduction). You don't generate a hypothesis and test it.

> given the hardware to do so is limited to such few entities...

Unless you are talking about computer engineering, which isn't really science either but engineering. Computer science isn't done in "hardware". Maybe you should go learn what computer science is.
lostmsu 7 hours ago
Have you actually tried training a 10MB MoE (that would train in a few days on a 3090)? I have come to the opinion that most current AI research can be easily reproduced at small scale. CoT is possibly the only exception, as it sounds like it requires certain emergent behavior, but even there I am not sure it is impossible to retrofit it to tiny models.