mxkopy a day ago:
In the graph you're referencing, K = 2 never reaches C = 0.2. K = 3 only reaches C = 0.3 before getting cut off. I'm not even sure why that would be a problem, because he's projecting N basis vectors into K dimensions and C is some measure of the error this introduces when mapping points in the N-space onto the K-space, or something. I'm glad to be shown why this is inconsistent with the graph, but your argument doesn't talk about this idea at all.
cgadski 13 hours ago:
I was a little informal with my argument. It's not strictly true that we only see C = 0.2 when k = 2. I was reading what the graph says about the case when N is much greater than k. I'll try to clarify.

C is meant to be the smallest constant so that, for each (N, k, epsilon) with k > C epsilon^-2 log N and epsilon > 0, there exists some arrangement of N vectors in R^k with inner products smaller than epsilon in absolute value. For each (N, k), the author optimized epsilon and reported the empirical value k epsilon^2 / log N, which is the smallest value of C for which the condition holds when restricted to that particular (N, k). (Of course, this is my good-faith interpretation---the article introduces C in the context of a JL distortion bound, and it takes an extra step to turn that into a bound on inner products.) It can be shown that C = 4 satisfies this condition when log is the natural log; see [1], for example.

Based on the graph, the article claims to do better: "for sufficiently large spaces," it says, we can put C = 0.2. This would be a very significant improvement. For k = 2, the graph shows that C will be lower than 0.2 for sufficiently large N. (The color scheme is confusing! The line for k = 2 is the one that starts at just under 0.8 when N = 1.) Already for k = 3, the graph gives us no reason to believe it will drop below 0.2---you correctly observed it only gets to around 0.3. For larger values of k, the graph doesn't show what we can expect for large N at all: the curves go up, but never come back down. This is what I meant in my comment: the conclusion that C <= 0.2 as N -> infinity is only justified by the behavior at k = 2.

Now, do these results make sense? In the case k = 2, we're trying to put a large number (N) of vectors on the unit circle, and asking how small the maximum inner product (epsilon) between any pair of vectors can be.
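Here's a quick sketch of the k = 2 case (my own illustration, not the article's experiment; the half-circle arrangement is just one convenient choice, not necessarily optimal):

```python
import math
import numpy as np

def max_abs_inner_product(N):
    # N unit directions spread evenly over a half-circle (angles j*pi/N).
    # Spreading over the full circle would be worse: even N would create
    # antipodal pairs with inner product -1.
    angles = np.arange(N) * math.pi / N
    V = np.column_stack([np.cos(angles), np.sin(angles)])
    G = np.abs(V @ V.T)
    np.fill_diagonal(G, 0.0)
    return G.max()

# Brute force for moderate N. The extreme pairs sit at angle pi/N
# (or pi - pi/N), so epsilon = cos(pi/N) exactly.
for N in [100, 1000]:
    eps = max_abs_inner_product(N)
    print(N, eps, 2 * eps**2 / math.log(N))  # C = k eps^2 / ln N with k = 2

# The closed form lets us go far beyond what the Gram matrix allows:
for N in [10**4, 10**6, 10**9]:
    eps = math.cos(math.pi / N)
    print(N, 2 * eps**2 / math.log(N))
```

Since epsilon = cos(pi/N) tends to 1, the reported quantity behaves like 2 / ln N: it's still above 0.2 at N = 10^4, but drops below it around N = 10^6 and keeps shrinking.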
As N -> infinity, the vectors get packed very tightly and the maximum inner product epsilon approaches 1. Overall, C = k epsilon^2 / log N becomes arbitrarily small. In fact, the same happens for every k.

So, just in connection with this graph, the article makes three mistakes:

1) The article's interpretation of its experiment is wrong: the graph alone doesn't show that C < 0.2 for "large spaces".

2) However, it should be obvious a priori that, for every k, the reported values of C converge to 0 for large N (albeit very slowly, at a rate of 1/log N).

3) Unfortunately, this tells us nothing about the minimum value of k / log N needed for a given epsilon, and so it doesn't support the conclusion of the article.

The problem with this kind of LLM-driven article is that it gives uncareful work the _appearance_ of careful work, but none of the other qualities that usually come with care.
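For reference, the C = 4 bound behaves as expected in a quick experiment (again my own sketch, with arbitrary sizes N = 2000 and a few values of k, not the article's code): random unit vectors give an empirical k epsilon^2 / ln N in the neighborhood of 4.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def empirical_C(N, k):
    # N random unit vectors in R^k.
    V = rng.standard_normal((N, k))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    G = np.abs(V @ V.T)
    np.fill_diagonal(G, 0.0)
    eps = G.max()                    # largest pairwise |inner product|
    return k * eps**2 / math.log(N)  # the quantity the graph reports

for k in [32, 64, 128]:
    print(k, empirical_C(2000, k))
```

Pairwise inner products of random unit vectors are roughly Gaussian with variance 1/k, so the maximum over ~N^2/2 pairs scales like sqrt(4 ln N / k), which is exactly where the C = 4 constant comes from.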