yorwba 6 days ago

The author admits they "kinda stopped reading this paper" after noticing that it only used one hyperparameter configuration. I agree that's a flaw in the paper, but it's not an excuse for sloppy treatment of the rest of the paper. (It would, however, be an excuse to ignore it entirely.)

In particular, the assumption that |a_k| ≈ 0 initially is incorrect, since in the original paper https://arxiv.org/abs/2502.01628 the a_k are distances from one vector to multiple other vectors, and they're unlikely to be initialized in such a way that the distance is anywhere close to zero. So while the gradient divergence near 0 could certainly be a problem, it doesn't have to be as fatal as the author seems to think it is.
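To make that concrete, here's a minimal numpy sketch (names are illustrative, not from the paper): if a_k is a Euclidean distance ||u - w|| between randomly initialized vectors, the distance starts far from 0, and its gradient with respect to u, which is (u - w) / ||u - w||, has unit norm there. The gradient is only ill-defined (0/0) exactly at u == w, which random init essentially never hits.

```python
import numpy as np

rng = np.random.default_rng(0)

def dist_grad(u, w):
    # Gradient of d(u, w) = ||u - w|| with respect to u is (u - w) / ||u - w||.
    # Unit norm for u != w; undefined (0/0) only exactly at u == w.
    diff = u - w
    return diff / np.linalg.norm(diff)

# Two independently initialized Gaussian vectors: their distance
# concentrates around sqrt(2 * dim), nowhere near zero.
dim = 128
u = rng.normal(size=dim)
w = rng.normal(size=dim)

a_k = np.linalg.norm(u - w)
g = dist_grad(u, w)

print(a_k)                 # roughly sqrt(2 * 128) ~ 16, not ~ 0
print(np.linalg.norm(g))   # 1.0: bounded gradient away from the origin
```

So the divergence at 0 is a real edge case, but whether training ever drives a_k close enough to 0 for it to matter is an empirical question, not something settled by the initialization.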

totalizator 2 days ago | parent | next [-]

That would be "welcome to the world of academia". My post-doc friends won't even read a blog post without first checking the author's resume, and they become very dismissive the moment they notice anything they consider sloppy.

throw_pm23 2 days ago | parent | next [-]

You seem to conflate two different points. Not reading based on sloppiness is defensible; not reading based on the author's resume is less so.

aidenn0 2 days ago | parent [-]

When "sloppiness" is defined as "did anything on my personal list of pet peeves" (and it often is), then the defensibility of the two begins to converge.

lblume 2 days ago | parent | prev [-]

Which is a problem with the reputation-based academic system itself ("publish or perish") and not individuals working in it.
