Remix.run Logo
jmpeax 18 hours ago

> They typically need to compare many or all points to each other, leading to O(N²) complexity.

UMAP is not O(n^2) it is O(n log n).

romanfll 18 hours ago | parent [-]

Thanks for your comment! You are right, Barnes-Hut implementation brings UMAP down to O(N log N). I should have been more precise in the document. The main point is that even O(N log N) could be too much if you run this in a browser.. Thanks for clarifying!

emil-lp 16 hours ago | parent [-]

If k=50, then I'm pretty sure O(n log n) beats O(nk).

romanfll 13 hours ago | parent [-]

You are strictly correct for a single pass! log2(9000)~13, which is indeed much smaller than k=50. The missing variable in that comparison is Iterations. t-SNE and UMAP are iterative optimisation algorithms. They repeat that O(N log N) step hundreds of times to converge. My approach is a closed-form linear solution (Ax=b) that runs exactly once. So the wall-clock comparison is effectively: Iterations * (N log N) VS 1 * (N *k) That need for convergence is where the speedup comes from, not the complexity class per se.