| ▲ | Spacecosmonaut 10 days ago |
| Accelerationists may argue that the eroding of proper attribution and proof verification by humans is a meaningless short term struggle of a dying field. Mathematics seems to be entering an era where human + machine maximizes performance, much like chess in the 1990s. However, imagine a future where even talented mathematicians are nothing but noise in the machine (as is the case in chess now). A future where AI generates and verifies proofs without humans in the loop. Where the mathematics may be beyond human comprehension. In that future, does it matter that early career mathematicians are inhibited by these developments? Perhaps not. Programming faces the same issue. As AI crawls up the competence ladder, does it matter that fewer people have opportunities to develop the skillset of a senior engineer? Perhaps not. |
|
| ▲ | wongarsu 10 days ago | parent | next [-] |
| Much like for many the point of chess is that it's played by humans, with truly superhuman AI relegated to a training aid, mathematics is in many ways about human comprehension. You can use AI to find and prove new theorems. But if you get to the point where humans can't understand it, is it even still math? |
| |
| ▲ | vitally3643 9 days ago | parent | next [-] | | Neural networks are already systems of linear algebra that are beyond human understanding. Most humans could probably grok a 1 or 2 dimensional slice of a network, but the latent vector space is completely beyond the human brain. We have to use tools to analyze neural networks piecemeal in exactly the same way that we analyze any other higher-dimensional construct. Few humans are truly capable of reasoning in 4+ dimensions; that doesn't make string theory "not math". Nor does a trillion-dimension vector space of an LLM make it "not programming". Humans by themselves invented mathematical concepts beyond human understanding a long time before we invented neural networks. | |
| ▲ | Spacecosmonaut 10 days ago | parent | prev | next [-] | | Perhaps P=NP. The new algorithms are handed down to us. We can apply them without fundamentally understanding why P=NP. | |
| ▲ | squidbeak 9 days ago | parent | prev | next [-] | | > Much like for many the point of chess is that it's played by humans, with truly superhuman AI relegated to a training aid It's much more than just training. Humans use the engines to prepare openings and find promising novelties. Over time these novelties unearthed by engines fill out theory. It's easy to find elite games where neither player is out of book for dozens of moves. Modern players are full hybrids in that sense. Looking back at chess, it seems natural that Mathematics will go the same way. | |
| ▲ | jrflo 9 days ago | parent | prev | next [-] | | I think there would still be a place for it if it's beyond human comprehension. For instance, really complex lemmas to solve human-tractable problems. If you can pose a question in a proof assistant language like Lean, have an AI write a Lean program that solves it, you can use that as a Lemma for some other problem. There's quite a bit of math out there that is "correct assuming conjecture X is correct", maybe AI could fill that gap and "still be math". | |
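A minimal Lean 4 sketch of the pattern described above, where a machine-found lemma and an unproven conjecture are both taken as explicit hypotheses. The names `ConjectureX`, `Target`, and `machineLemma` are hypothetical placeholders, not anything from a real library:

```lean
-- Hypothetical sketch: `ConjectureX` and `Target` stand for arbitrary statements.
-- `machineLemma` plays the role of a proof an AI produced and Lean checked,
-- even if no human follows its internals.
theorem conditional_result
    (ConjectureX Target : Prop)
    (machineLemma : ConjectureX → Target)
    (hX : ConjectureX) : Target :=
  machineLemma hX
```

The result is "correct assuming conjecture X": anyone who later proves `ConjectureX` can supply `hX` and obtain `Target` without reading the lemma's proof.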
| ▲ | zelphirkalt 9 days ago | parent | prev [-] | | Is it still chess, if humans cannot understand it? Because that's the point we are at in chess. Engines making moves, that humans cannot understand, but somehow they work out to be best or seemingly best. Look at the Leela Zero games, when it came out. These engines play kind of other-worldly chess. | | |
| ▲ | wongarsu 9 days ago | parent [-] | | Exactly. If we wanted "the best chess" we'd watch Stockfish against Leela Zero. Far better mechanically than human chess. But people are much more interested in Magnus Carlsen playing Gukesh. Both train with chess engines, but what makes the chess game interesting is the human beings who understand their own moves and try to understand those of their opponent. |
|
|
|
| ▲ | 0x59 10 days ago | parent | prev | next [-] |
| An issue I see is who controls the information. The next generation may not receive the knowledge; it may be gatekept by industry who *will* own the gate. The future may not have access unless we fight to ensure they do. This is how I read the article. |
|
| ▲ | orbital-decay 10 days ago | parent | prev | next [-] |
| Surely such AI would also be able to reduce and simplify the math for human understanding. Which is what mathematicians do all the time, from turning base 60 cuneiform into modern number systems to simplifying Maxwell's equations for the students. |
| |
| ▲ | Joel_Mckay 9 days ago | parent [-] | | In general, most humans top out at 3D visualization, and instead rely on crude mathematical tools to work with higher dimensions. Every so often, people like Euler or Leibniz pop up to give people new methods for blind men with a cane to explore the unseen yet knowable world(s). Scientific work is not normally naturally statistically salient for LLM observational data inferences. =3 |
|
|
| ▲ | overgard 9 days ago | parent | prev | next [-] |
| On the other hand, it could stall out at: good enough to take the easy problems, not good enough to take over the field, but damaging enough to erode the quality of new entrants. (Which incidentally is the scenario I think plays out for software) |
| |
| ▲ | math_dandy 9 days ago | parent [-] | | I think the OpenAI model that resolved the Unit Distance Problem would be capable of solving a significant proportion of mathematics PhD thesis problems. |
|
|
| ▲ | rad_val 9 days ago | parent | prev | next [-] |
| AI (in this form) will never be able to solve things we truly cannot solve yet. It might catch things that we didn't project properly or brute force things no human can, but it will never unify general relativity with quantum mechanics. It's amazing at finding hidden truths in large datasets, but won't win a Nobel unassisted. |
| |
| ▲ | par1970 9 days ago | parent [-] | | > AI (in this form) will never be able to solve things we truly cannot solve yet. Argument? | | |
| ▲ | rad_val 9 days ago | parent | next [-] | | The strongest argument for this is structural: what LLMs are. In a brutally simplistic way: each token is represented as a high-dimensional vector. LLMs operate on them. They are the true, underlying meaning of the token for the LLM. Think of it as 1000+ ways to think of that word/token. Those meanings are baked in at training time. So, LLMs might be able to cross-reference them and solve a class of problems that flew under our radar, but can't come up with revolutionary theories that were never in the training set. Of course, they will help win a Nobel in the years to come, no doubt, but can't speak mathematics we can't understand (beyond simple obfuscation) and won't discover anything substantial on their own. | | |
| ▲ | resident423 9 days ago | parent | next [-] | | > but can't come up with revolutionary theories that were never in the training set. Can you elaborate? I don't think the solution to the unit distance problem was in the training set, but I'm guessing you mean there's some higher bar for revolutionary theories LLMs can't reach? If so, where do you expect the limit will be? | |
| ▲ | redox99 9 days ago | parent | prev | next [-] | | Instead of going into a long technical argument of why your description of LLMs is flawed, I'll go straight to the point, because people keep moving the goal posts. What exact problem would need to be solved by LLMs to convince you that they DO discover novel solutions? | | |
| ▲ | rad_val 9 days ago | parent | next [-] | | I'm more interested in why you think my understanding is flawed, honestly. I thought I distilled it decently well in two sentences. The bottom line is, in this hyperdimensional space you can find relationships that are not easily distinguished by human minds, but the corpus is still fixed; an LLM can't truly know anything beyond its training data. | | |
| ▲ | redox99 9 days ago | parent [-] | | > Think of it as 1000+ ways to think of that word/token I assume you used 1000 because that's in the ballpark of the vector size. But these are not independent scalars, like each might store a certain property. Just like in 2D you can have 4 quadrants (or subdivide further), with a vector of size 1000 you can encode an insane amount of meaning. > Those meanings are baked in at training time. So, LLMs might be able to cross-reference them and solve a class of problems that flew under our radar, but can't come up with revolutionary theories that were never in the training set. There's a lot of jumping to conclusions here, but I'll try to answer more generally. This idea of how LLMs work is mostly to build an intuition, like with a CNN you'd say imagine a layer does edge detection, and so on. And to some degree you can detect those kinds of behavior, but an NN is a VERY general architecture. It needn't work like you say: it can calculate any function, and running in a loop with a scratchpad (basically an agent) it is Turing complete. Even ignoring that, this part is misleading > Those meanings are baked in at training time. Being baked in at training time does not mean it didn't build novel meanings at training time. This is even more significant when you take into account post-training RL. A simple proof that transformers can generate novel, superhuman solutions is that you can build a transformer-based chess bot, feed it 0 human games, and train it with RL until it can beat any human, completely novel and unconstrained by human gameplay (because it would've never seen it). You can do that with any task that's verifiable, like coding or math. (Also as a separate fact, as long as a task is easier to verify than solve (basically always), you have somewhat of a million monkeys with a typewriter, and with temperature sampling the model might eventually stumble its way onto a solution.) |
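The "easier to verify than solve" point above can be illustrated with a toy sketch. This is not how an LLM samples; it assumes a tiny subset-sum instance, a uniform random proposer standing in for temperature sampling, and hypothetical function names (`verify`, `sample_until_verified`) chosen for illustration:

```python
import random


def verify(numbers, target, indices):
    """Cheap check: the indices are distinct and the selected numbers sum to target."""
    return len(set(indices)) == len(indices) and sum(numbers[i] for i in indices) == target


def sample_until_verified(numbers, target, max_tries=100_000, seed=0):
    """Propose random subsets until the verifier accepts one or the budget runs out."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    for attempt in range(1, max_tries + 1):
        # Include each index independently with probability 1/2 (the "monkey").
        candidate = [i for i in range(len(numbers)) if rng.random() < 0.5]
        if verify(numbers, target, candidate):
            return candidate, attempt
    return None, max_tries


numbers = [3, 34, 4, 12, 5, 2]
target = 9
solution, tries = sample_until_verified(numbers, target)
print("indices:", solution, "after", tries, "proposals")
```

The proposer knows nothing about the problem; all of the correctness comes from the verifier. A better proposer (a trained model) only changes how many proposals are needed, which is the gist of the argument for verifiable domains such as code or formal math.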
| |
| ▲ | dehsge 9 days ago | parent | prev [-] | | unify general relativity with quantum mechanics. The continuum hypothesis. The traveling salesman problem in polynomial time. | | |
| ▲ | redox99 9 days ago | parent | next [-] | | I think it's cool how in a decade we went from "Neural networks will never be able to understand this sentence that's obvious to humans" to "LLMs must be able to solve problems that humanity hasn't been able to after almost a century, and that might even be unsolvable" | | |
| ▲ | dehsge 9 days ago | parent | next [-] | | So that is kind of the point of studying maths, right? Why something is unsolvable or undecidable can be as important as the output of a theorem. Questions like these, Fields Medal-level problems, or Karp’s 21 NP-complete problems are what working mathematicians are interested in. Will LLMs help as a human's assistant in the future? Probably. Will LLMs answer these questions themselves, provide insights and bounds to these new mathematics and teach other mathematicians why this new math they create is true? Will these models have PhDs and take candidates, teaching them how to apply and think about the maths problems they are interested in? | |
| ▲ | roywiggins 9 days ago | parent | prev [-] | | it can operate at the level of a mere mathematics professor, who everyone knows are barely conscious, basically automatons. wake me up when it's Einstein |
| |
| ▲ | 3uruiueijjj 9 days ago | parent | prev [-] | | The continuum hypothesis was proven independent of ZFC over sixty years ago; I think even GPT-2 could have told you that much. |
|
| |
| ▲ | int_19h 9 days ago | parent | prev [-] | | I don't see how any of this follows. Yes, the LLMs will learn the "meaning" (here narrowly defined as relative configuration in the embedding space) of vectors that correspond to tokens in whatever tokenizer is used to feed into them. But that vector space is not discrete, and nothing precludes the model from internally operating on other vectors that it never saw in training, based on how they relate to those vectors which it did see. |
| |
| ▲ | fc417fc802 9 days ago | parent | prev | next [-] | | We have yet to see evidence of proper generalization AFAIK. Examples such as this proof are the closest I'm aware of. I haven't read this one in detail yet but the other examples I've seen have been (upon examination) much closer to an (absurdly) deep literature search than to novel thought. Obviously that doesn't mean we won't eventually achieve novel thought, or even that the current form is fundamentally incapable of it, merely that we've yet to see evidence of it and thus the default assumption is that we aren't there yet. | |
| ▲ | rienbdj 9 days ago | parent | prev [-] | | The burden of proof is the other way |
|
|
|
| ▲ | Joel_Mckay 9 days ago | parent | prev [-] |
| In general, most researchers already incorporate LLM into their workflows, as it is quite good at context search. However, the relevant training data is based on the collective works of the field of experts. Collecting current data on that work is what makes the LLM sound relevant, and any improvement of the LLM model requires frequent new data from both researchers and the chat bot users themselves. LLM are not real "AI", and anyone that says otherwise is selling people something. To phrase this differently, LLM companies conduct unauthorized targeted intelligence gathering on people's work, codify that act of plagiarism or theft as MoE documentation, and sell unaccountable token output to other users. There is a reason output becomes more nonsensical as "AI" companies try to use dynamic weight granularity and conceptual compaction. It is not necessarily "AI" hallucinations, but rather people fooling themselves into believing smart people are no longer needed if they willingly become a hapless exploited data source caste. This simply isn't true, as people will leave the field for a while. The LLM business model regularly requires copyright theft and plagiarism to persist. It will not magically become sentient/AGI/less-stupid, as these algorithms have been operating for over 40 years. What has changed is the scale of the deployment, data pool size, and the energy consumed. Scientists are still necessary, as they create the world models LLM try to guess at by statistical inference. Hype and FUD ahead of an IPO for a highly dubious revenue company is expected. We look forward to the low cost liquidated GPU hardware in the near future. =3 |
| |
| ▲ | martin1975 9 days ago | parent [-] | | Reading this invoked the image of ouroboros in my mind. | | |
| ▲ | Joel_Mckay 9 days ago | parent [-] | | The Ouroboros in western mythology is a cautionary tale about the uselessness of the first perfect immortal being, and why humans should suffer our imperfections with insightful grace. The concept also made a great Red Dwarf episode. LLM are more like the Mechanical Turk trick, but the person inside the machine running the con is unaware of how their actions affect the confounded observers. Have a wonderful day =3 | | |
| ▲ | martin1975 9 days ago | parent [-] | | Thank you, human being! Have you looked at the price of an RTX 5090? I can get a used car for that. | | |
| ▲ | Joel_Mckay 9 days ago | parent [-] | | Indeed, just paid $3k more for the same workstation we purchased last year at this time. Just the DDR5 sticks and NVMe drive cost more than most parts right now, including an RTX 5070 Ti 16G card. For h265 hardware encoding, the performance differences in higher-end card benchmarks were negligible. Building systems based on application-specific benchmarks rather than general what-if use-case scenarios will sometimes show you something interesting. ymmv. =3 |
|
|
|
|