SpaceManNabs 3 days ago:

Why do you think they perform so poorly?
dccsillag 3 days ago:
Theory-wise, I'm not convinced that the models have good approximation properties (the Kolmogorov-Arnold / Kolmogorov Superposition Theorem they base themselves on has quite a bit of nuance; see the statement below), and the optimization problem might be a bit tricky. I also can't see how to incorporate inductive biases other than the standard R^n / tabular regression one, and the existing attempts at this that I'm aware of are just band-aids (along the lines of feature engineering).

In practice, I've personally run some benchmarks on a collection of datasets I had lying around. The results were generally abysmal, with the method only matching simple baselines on a few datasets.

Finally, the original paper is very weird, and reads more like a marketing piece. The theory, which is touted throughout the paper, is very weak; the actual algorithm is not sufficiently well explained there; and the experiments are lacking. In particular, I find it telling that they do not include, and even go out of their way to ignore, important baselines such as boosted trees, which are the state-of-the-art solution to the problem they intended to solve (and which even work very well in settings where they claim that both KANs and MLPs perform badly, e.g. in high dimensions). A sketch of that kind of comparison follows below.
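For reference, here is a minimal statement of the theorem in question (the standard form; the nuance alluded to above is that the univariate functions it guarantees are only continuous and can be far from the smooth splines KANs actually parameterize):

    % Kolmogorov superposition theorem (1957): every continuous
    % f : [0,1]^n -> R can be written exactly as
    f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
    % with continuous univariate \Phi_q and \phi_{q,p}. The inner
    % \phi_{q,p} can be fixed independently of f but are merely
    % continuous, and the outer \Phi_q depend on f and can be
    % equally irregular -- nothing like the smooth B-splines a
    % KAN is restricted to.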
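And a minimal sketch of the kind of baseline comparison I mean, assuming scikit-learn and the California housing data (library, dataset, and metric choices are just for illustration; a KAN implementation such as pykan would slot in as a third model if wrapped in the sklearn estimator API):

    # Compare a boosted-tree baseline against a plain MLP on a
    # tabular regression task, via 5-fold cross-validated R^2.
    from sklearn.datasets import fetch_california_housing
    from sklearn.ensemble import HistGradientBoostingRegressor
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    X, y = fetch_california_housing(return_X_y=True)

    models = {
        # The boosted-tree baseline the paper leaves out.
        "boosted_trees": HistGradientBoostingRegressor(random_state=0),
        # A plain MLP; feature scaling matters for neural nets,
        # not for trees, hence the pipeline.
        "mlp": make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000,
                         random_state=0),
        ),
    }

    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="r2")
        print(f"{name}: R^2 = {scores.mean():.3f} +/- {scores.std():.3f}")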