0xfaded 4 days ago

I once published a method for finding the closest distance between an ellipse and a point on SO: https://stackoverflow.com/questions/22959698/distance-from-g...

I consider it the most beautiful piece of code I've ever written and perhaps my one minor contribution to human knowledge. It uses a method I invented, is just a few lines, and converges in very few iterations.

People used to reach out to me all the time with uses they had found for it; it was cited in a PhD thesis and apparently lives in some collision plugin for Unity. I haven't heard from anyone in a long time.

It's also my test question for LLMs, and I've yet to see my solution regurgitated. Instead they generate some variant of Newton's method. ChatGPT 5.2 gave me an LM implementation and acknowledged that Newton's method is unstable (it is, which is why I went down the rabbit hole in the first place).
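For readers who want something concrete, here is a minimal C++17 sketch of this kind of iteration. It is not a verbatim copy of the linked answer. It is reconstructed around the variable names that come up later in this thread (tx, ty, rx, ry, qx, qy, r, q). The function name, the 45-degree starting guess, the default of three iterations, and the two zero guards are assumptions made for illustration. The idea sketched is to replace the ellipse locally by its circle of curvature, project the query point onto that circle, and renormalize.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>

// Sketch: closest point on the ellipse (x/a)^2 + (y/b)^2 = 1 to (px, py).
// Assumptions (not taken from the thread): the function name, the initial
// guess, the default iteration count, and the two "== 0.0" guards.
std::pair<double, double> closest_point_on_ellipse(double a, double b,
                                                   double px, double py,
                                                   int iterations = 3) {
    assert(a > 0.0 && b > 0.0);

    // Work in the first quadrant; signs are restored at the end.
    const double apx = std::fabs(px);
    const double apy = std::fabs(py);

    // (tx, ty) plays the role of (cos t, sin t); start at t = 45 degrees.
    double tx = std::sqrt(0.5);
    double ty = std::sqrt(0.5);

    for (int i = 0; i < iterations; ++i) {
        // Current guess on the ellipse.
        const double x = a * tx;
        const double y = b * ty;

        // Centre of curvature of the ellipse at the current guess.
        const double ex = (a * a - b * b) * tx * tx * tx / a;
        const double ey = (b * b - a * a) * ty * ty * ty / b;

        // r: centre -> current guess, q: centre -> query point.
        const double rx = x - ex, ry = y - ey;
        const double qx = apx - ex, qy = apy - ey;
        const double r = std::hypot(rx, ry);
        const double q = std::hypot(qx, qy);
        if (q == 0.0) break;  // query point sits on the centre of curvature

        // Project the query point onto the circle of curvature, map back to
        // (tx, ty), clamp to the first quadrant, then renormalize.
        const double ntx = std::clamp((qx * r / q + ex) / a, 0.0, 1.0);
        const double nty = std::clamp((qy * r / q + ey) / b, 0.0, 1.0);
        const double t = std::hypot(ntx, nty);
        if (t == 0.0) break;  // degenerate update; keep the previous guess
        tx = ntx / t;
        ty = nty / t;
    }

    return {std::copysign(a * tx, px), std::copysign(b * ty, py)};
}
```

For a circle (a = b = 1) and the point (3, 4), this returns approximately (0.6, 0.8). No trig functions are called inside the loop, only square roots via std::hypot.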

Today I don't know where I would publish such a gem. It's not something I'd bother writing up in a paper, and SO was the obvious place where people who wanted an answer to this question would look. Now there is no central repository; instead everyone individually summons the ghosts of those passed in loneliness.

erikig 4 days ago | parent | next [-]

The various admonitions to publish to a personal blog, while encouraging, don't really get at 0xfaded's request, which I'd summarize as follows:

With no one asking these technical questions publicly, where, how and on what public platform will technical people find the problems that need solving so they can exercise their creativity for the benefit of all?

Aurornis 4 days ago | parent | next [-]

> The various admonitions to publish to a personal blog, while encouraging, don't really get at 0xfaded's request

They also completely missed the fact that 0xfaded did write a blog post and it’s linked in the second sentence of the SO post.

> There is a relatively simple numerical method with better convergence than Newtons Method. I have a blog post about why it works http://wet-robots.ghost.io/simple-method-for-distance-to-ell...

keepamovin 4 days ago | parent | prev | next [-]

Clearly we need something in between the fauxpen-access of journals and the wild west of the blogosphere, probably. Why wouldn't the faded ox publish in a paper? Idk, but I guess we need things similar to those circulars that British Royal Society members used to send to each other...except not reserved for a club. The web should be a natural at this. But it's either centralized -> monetized -> corrupted, or decentralized -> unindexed/niche -> forgotten fringe. What can come between?

Nition 4 days ago | parent | next [-]

I wonder if there could be something like a Wikipedia for programming. A bit like what the book Design Patterns did in 1994, collecting everyone's useful solutions, but on a much larger scale. Everyone shares the best strategies and algorithms for everything, and updates them when new ones come about, and we finally stop reinventing the wheel for every new project.

To some extent that was Stack Overflow, and it's also GitHub, and now it's also LLMs, but not quite.

May I suggest "PASTE": Patterns, Algorithms, Solutions, Techniques, and Examples. "Just copy PASTE", they'll say.

fabianholzer 4 days ago | parent | next [-]

Ward Cunningham once explained, of all places in a GitHub issue [0], how the original C2 Wiki was seeded.

> Perhaps I should explain why wiki worked.

> I wrote a program in a weekend and then spent two hours a day for the next five years curating the content it held. For another five years a collection of people did the same work with love for what was there. But that was the end. A third cohort of curators did not appear. Content suffered.

A heroic amount of effort by a single person, and later the collective effort of a small group, worked in the mid-90s. I'm skeptical that it will be repeatable 30 years later. Despite this, it would be the type of place that I'd like to visit on the web. :(

[0] https://github.com/WardCunningham/remodeling/issues/51#issue...

Voklen 4 days ago | parent | prev | next [-]

Great idea! https://paste.voklen.com/wiki/Main_Page If people start using it I'll get a proper domain name for it.

nyargh 4 days ago | parent | prev | next [-]

An algolwiki is a great idea, but I just wanted to say I got a good chuckle from this, thanks :)

> May I suggest "PASTE": Patterns, Algorithms, Solutions, Techniques, and Examples. "Just copy PASTE", they'll say.

oneeyedpigeon 4 days ago | parent | prev | next [-]

> To some extent that was Stack Overflow

Yup, that was always very much the plan, from the earliest days. Shame it soured a bit, but since the content is all freely reusable, maybe something can be built atop the ashes?

__patchbit__ 4 days ago | parent [-]

There is https://grokipedia.com which encourages you to suggest an article and you may submit improvements to an existing article.

lobsterthief 4 days ago | parent [-]

This is _not_ at all the same thing. Grok just ripped off Wikipedia as its base and then applied a biased spin to it. Check out the entry on Grok owner Elon Musk; it praises his accomplishments and completely omits or downplays most of his better-known controversies.

latexr 4 days ago | parent [-]

And everything is “fact checked” by the Grok LLM. Which… Yeah…

https://en.wikipedia.org/wiki/Grok_(chatbot)#Controversies

__patchbit__ 18 hours ago | parent [-]

The Grok information source is more reliable than Wikipedia.

Objectively and incrementally improving. The leadership behind Grok is human rated safe rocket science quality.

Whereas Wikipedia is a fugly dumpsterdive.

bambax 4 days ago | parent | prev | next [-]

Yes exactly! It would need some publicity of some kind to get started but it's the best solution, certainly? And all of the tools and infrastructure already exist.

progval 4 days ago | parent | prev [-]

There is https://www.wikifunctions.org/

lelanthran 4 days ago | parent | prev | next [-]

> Clearly we need something in between the fauxpen-access of journals and the wild west of the blogosphere, probably.

I think GP's min-distance solution would work well as an arxiv paper that is never submitted for publication.

A curated list of never-published papers, with comments by users, makes sense in this context. Not sure that arxiv itself is a good place, but something close to it in design, with user comments and response-papers could be workable.

Something like RFC, but with rich content (not plain-text) and focused on things like GP published (code techniques, tricks, etc).

Could even call it "circulars on computer programming" or "circulars on software engineering", etc.

PS. I ran an experiment some time back, putting something on arxiv instead of github, and had to field a few comments about "this is not novel enough to be a paper" and my responses were "this is not a publishable paper, and I don't intend to submit it anywhere". IOW, this is not a new or unique problem.

(See the thread here - https://news.ycombinator.com/item?id=44290315)

knolan 4 days ago | parent | prev [-]

There is the Journal of Open Source Software perhaps:

https://joss.theoj.org/

zahlman 4 days ago | parent | prev | next [-]

You can (and always were encouraged to) ask your own questions, too.

And there are more sites like this (see e.g. https://codidact.com — fd: moderator of the Software section). Just because something loses popularity isn't a reason to stop doing it.

eastbound 4 days ago | parent [-]

StackOverflow is famously obnoxious about questions badly asked, badly categorized, duplicated…

It’s actually a topic on which StackOverflow would benefit from AI A LOT.

Imagine StackOverflow rebrands itself as the place where you can ask the LLM and it benefits the world, with the LLM correctly rephrasing the question behind the scenes and creating a public record of it.

rerdavies 4 days ago | parent | next [-]

And famously obnoxious about rejecting questions that are properly asked, properly categorized, and not actually duplicated.

eastbound 4 days ago | parent [-]

SO is not obnoxious because the users are wrong!

wizzwizz4 4 days ago | parent | prev [-]

The company tried this. It fell through immediately. So they went away, and came back with a much improved version. It also fell through immediately. Turns out, this idea is just bad: LLMs can't rephrase questions accurately, when those questions are novel, which is precisely the case that Stack Overflow needs.

For the pedantic: there were actually three attempts, all of which failed. The question title generator was positively received (https://meta.stackexchange.com/q/388492/308065), but ultimately removed (https://meta.stackoverflow.com/q/424638/5223757) because it didn't work properly, and interfered with curation. The question formatting assistant failed obviously and catastrophically (https://meta.stackoverflow.com/a/425167/5223757). The new question assistant failed in much the same ways (https://meta.stackoverflow.com/a/432638/5223757), despite over a year of improvements, but was pushed through anyway.

eastbound 3 days ago | parent [-]

This is an excellent piece of information that I didn’t have. If the company with the most data can’t succeed, then it seems like a really hard problem. On the side, they can understand why humans couldn’t do it either.

Forgeties79 4 days ago | parent | prev | next [-]

Seriously where will we get this info anymore? I’ve depended on it for decades. No matter how obscure, I could always find a community that was talking about something I needed solved. I feel like that’s getting harder and harder every year. The balkanization of the Internet + garbage AI slop blogs overwhelming the clearly declining Google is a huge problem.

nerusskyhigh 4 days ago | parent | next [-]

My genuine impression is that communities moved from forums to discord. Maybe that's why they are harder to find

nyargh 4 days ago | parent [-]

And discord is a terrible tool for knowledge collection imo. Their search is ok, but then I find myself digging through long and disjointed message threads, if replies/threading are even used at all by the participants.

roncesvalles 4 days ago | parent [-]

Not to mention, it's not indexed by search engines. It's the "deep web".

nyargh 4 days ago | parent [-]

Yes, it's a treasure hunt every single time when some project has most of their discussions on discord. It's awful imo.

seb1204 4 days ago | parent | prev | next [-]

Keep using SO?

Forgeties79 4 days ago | parent [-]

When I grew up *shakes fist at clouds* I had a half dozen totally independent forums/sites to pull on for any interest or hobby no matter how obscure. I want it back!

hattmall 4 days ago | parent | next [-]

It's true though, and the information was so deep and specific. Plus the communities were so legitimate and you could count on certain people appearing in threads and waiting for their input. Now the best you have are subreddits or janky Facebook groups.

wiether 4 days ago | parent | prev [-]

The discoverability, both from the outside and within, is absolute trash, but the closest thing I find to those old forums nowadays is Discord servers.

Forgeties79 4 days ago | parent [-]

Agreed, it’s the discoverability that’s the real problem here at the end of it all. All the veterans are pulling up the drawbridges to protect their communities from trolls, greedy companies, AI scraping, etc., which means new people can’t find them, which in turn means these communities eventually wither and stop being helpful resources for us all.

HumblyTossed 4 days ago | parent | prev [-]

Usenet?

Forgeties79 4 days ago | parent [-]

I guess? I feel like it’s too small now. It can’t cover all my interests

0xbadcafebee 4 days ago | parent | prev [-]

> where, how and on what public platform will technical people find the problems that need solving so they can exercise their creativity for the benefit of all?

The same place people have always discovered problems to work on, for the entire history of human civilization. Industry, trades, academia, public service, newspapers, community organizations. The world is filled with unsolved problems, and places to go to work on them.

Einstein was literally a patent clerk.

sky2224 4 days ago | parent | prev | next [-]

This is a perfect example of an element of Q&A forums that is being lost. Another thing that I don't think we'll see as much of anymore is interaction from developers that have extensive internal knowledge on products.

An example I can think of was when Eric Lippert, a developer on the C# compiler at the time, responded to a question about a "gotcha" in the language: https://stackoverflow.com/a/8899347/10470363

Developer interaction like that is going to be completely lost.

tempest_ 4 days ago | parent | next [-]

This type of thing often lives in the issues / discussion tab of a GitHub repo nowadays, for better and worse.

dimator 4 days ago | parent | next [-]

Yuck. I don't know if it's just me, but something feels completely off about the GH issue tracker. I don't know if it's the spacing, the formatting, or what, but each time it feels like it's actively trying to shoo me away.

It's whatever the visual language equivalent of "low signal" is.

NitpickLawyer 4 days ago | parent [-]

Still gh issues are better than some random discord server. The fact that forums got replaced by discord for "support" is a net loss for humanity, as discord is not searchable (to my knowledge). So instead of a forum where someone asks a question and you get n answers, you have to visit the discord, and talk to the discord people, and join a wave channel first, hope the people are there, hope the person that knows is online, and so on.

nerdix 4 days ago | parent [-]

Yeah, I suspect that a lot of the decline represented in the OP's graph (starting around early 2020) is actually discord and that LLMs weren't much of a factor until ChatGPT 3.5 which launched in 2022.

LLMs have definitely accelerated Stackoverflow's demise though. No question about that. Also makes me wonder if discord has a licensing deal with any of the large LLM players. If they don't then I can't imagine that will last for long. It will eventually just become too lucrative for them to say no if it hasn't already.

chongli 4 days ago | parent [-]

Discord isn’t just used for tech support forums and discussions. There are loads of completely private communities on there. Discord opening up API access for LLM vendors to train on people’s private conversations is a gross violation of privacy. That would not go down well.

skvark 4 days ago | parent | prev | next [-]

I think most relevant data that provides best answers lives in GitHub. Sometimes in code, sometimes in issues or discussions. Many libs have their docs there as well. But the information is scattered and not easy to find, and often you need multiple sources to come up with a solution to some problem.

fireflash38 4 days ago | parent | prev [-]

A lot of valuable information lived/lives in email threads that might or might not be publicly archived.

Philpax 4 days ago | parent | prev | next [-]

The second answer cites Lippert's pre-existing blog post on the subject: https://ericlippert.com/2009/11/12/closing-over-the-loop-var...

I agree that there will be some degradation here, but I also think that the developers inclined to do this kind of outreach will still find ways to do it.

gessha 3 days ago | parent | prev | next [-]

I believe the community has seen the benefit of forums like SO and we won’t let the idea go stale. I also believe the current state of SO is not sustainable with the old guard flagging any question and response you post there. The idea can/should/might be re-invented in an LLM context and we’re one good interface away from getting there. That’s at least my hope.

yaroslavvb 4 days ago | parent | prev [-]

I used to look at all TensorFlow questions when I was on the TensorFlow team (https://stackoverflow.com/tags/tensorflow/info). Unclear where people go to interact with their users now....Reddit? But the tone on Reddit is kind of negative/complainy

namanyayg 4 days ago | parent | prev | next [-]

I had a similar beautiful experience where an experienced programmer answered one of my elementary JavaScript typing questions when I was just starting to learn programming.

He didn't need to, but he gave the most comprehensive answer possible attacking the question from various angles.

He taught me the value of deeply understanding theoretical and historical aspects of computing to understand why some parts of programming exist the way they are. I'm still thankful.

If this was repeated today, an LLM would have given a surface-level answer, or worse yet would've done the thinking for me, obviating the question in the first place.

I wrote a blog post about my experience at https://nmn.gl/blog/ai-and-learning

matsemann 4 days ago | parent | next [-]

Had a similar experience. Asked a question about a new language feature in Java 8 (parallel streams), and one of the language designers (Goetz) answered my question about the intention of how to use it.

An LLM couldn't have done the same. Someone would have to ask the question and someone answer it for indexing by the LLM. If we all just ask questions in closed chats, lots of new questions will go unanswered as those with the knowledge have simply not been asked to write the answers down anywhere.

haddr 4 days ago | parent | prev [-]

Would you share the link to the answer?

matsemann 4 days ago | parent [-]

https://stackoverflow.com/questions/20375176/should-i-always...

cinntaile 4 days ago | parent | prev | next [-]

You can prompt the LLM to not just give you the answer. Possibly even ask it to consider the problem from different angles but that may not be helpful when you don't know what you don't know.

Gigachad 4 days ago | parent | prev [-]

For every example of that, there were 999 instances of people having their question closed, criticised, or ignored.

jvanderbot 4 days ago | parent | prev | next [-]

You can write a paper, submit it to arXiv, and you can also make a blog post. At any rate, I agree - SO was (is?) a wonderful place for this kind of thing.

I once had a professor mention that they knew me from SO because I posted a few underhanded tricks to prevent an EKF from "going singular" in production. That kind of community is going to be hard to replace, but SO isn't going anywhere; you can still ask a question and answer your own question for a permanent, searchable archive.

paulgerhardt 4 days ago | parent [-]

I would imagine the endorsement requirement reduces submissions by a few orders of magnitude.

marcosdumay 4 days ago | parent [-]

At this point SO seems harder to publish into than arxiv.

DrewADesign 4 days ago | parent [-]

If you had used the search feature you’d realize that many similar comments have already been posted on HN. Vote to close.

rerdavies 4 days ago | parent [-]

If only those who voted to close would bother to check whether the dup/close issue was ACTUALLY a duplicate. If only there were (substantial) penalties for incorrectly dup/closing. The vast majority of dup/closes seem to not actually be dup/closes. I really wish they would get rid of that feature. Would also prevent code rot (references to ancient versions of the software or compiler you're interested in that are no longer relevant, or solutions that have much easier fixes in modern versions of the software). Not missing StackOverflow in the least. It did not age well. (And the whole copyright thing was just toxically stupid).

DrewADesign 4 days ago | parent [-]

I think they should have had some mechanism that encouraged people to help everybody, including POSITIVELY posting links to previously answered questions, and then only making meaningfully unique ones publicly discoverable (even in the site search by default), afterwards. Instead, they provided an incentive structure and collection of rationales that cultivated a culture of hall monitors with martyr complexes far more interested in punitively enforcing the rules than being a positive educational resource.

scirob 4 days ago | parent | prev | next [-]

Has anyone tried building a modern Stack Overflow that's actually designed for AI-first developers? The core idea: question gets asked → immediately shows answers from 3 different AI models. Users get instant value. Then humans show up to verify, break it down, or add production context. But flip the reputation system: instead of reputation for answers, you get it for catching what's wrong or verifying what works. "This breaks with X" or "verified in production" becomes the valuable contribution. Keep federation in mind from day one (did:web, did:plc) so it's not another closed platform. Stack Overflow's magic was making experts feel needed. They still do—just differently now.

noduerme 4 days ago | parent | next [-]

Oh, so it wasn't bad enough to spot bad human answers as an expert on Stack Overflow... now humans should spend their time spotting bad AI answers? How about a model where you ask a human and no AI input is allowed, to make sure that everyone has everyone else's full attention?

imcritic 4 days ago | parent [-]

Why disallow AI input? Is it that poor? Surely it isn't.

noduerme 4 days ago | parent | next [-]

The entire purpose of answering questions as an "expert" on S.O. is/was to help educate people who were trying to learn how to solve problems mostly on their own. The goal isn't to solve the immediate problem, it's to teach people how to think about the problem so that they can solve it themselves the next time. The use of AI to solve problems for you completely undermines that ethos of doing it yourself with the minimum amount of targeted, careful questions possible.

wtetzner 4 days ago | parent | prev [-]

What's the point of AI on a site like that? Wouldn't you just ask an LLM directly if you were fine with AI answers?

noduerme 4 days ago | parent | next [-]

You're absolutely correct, but the scary thing is this: What happens when a whole generation grows up not knowing how to answer another person's question without consulting AI?

[edit] It seems to me that this is a lot like the problem which bar trivia nights faced around the inception of the smartphone. Bar trivia nights did, sporadically and unevenly, learn how to evolve questions themselves which couldn't be quickly searched online. But it's still not a well-solved problem.

When people ask "why do I need to remember history lessons - there is an encyclopedia", or "why do I need to learn long division - I have a calculator", I guess my response is: Why do we need you to suck oxygen? Why should I pay for your ignorance? I'm perfectly happy to be lazy in my own right, but at least I serve a purpose. My cat serves a purpose. If you vibe code and you talk to LLMs to answer your questions...I'm sorry, what purpose do you serve?

scirob 4 days ago | parent | prev [-]

I and many others already go the extra mile to ask multiple LLM's for hard questions or for getting a diversity of AI opinions to then internalize and cross check myself.

There are apps that have built up a nice-sized user base on this small added convenience of getting 2 answers at once. Ref: https://lmarena.ai/ https://techcrunch.com/2025/05/21/lm-arena-the-organization-...

All the major AI companies of course do not want to give you the answers from other AI's so this service needs to be a third party.

But then beyond that there are hard/niche questions where the AI's are wrong often and humans also have a hard time getting it right, but with a larger discussion and multiple minds chewing the problem one can get to a more correct answer often by process of elimination.

I encountered this recently in a niche non-US insurance project and I basically coded together the above as an internal tool: AI suggestions + human collaboration to find the best answer. Of course in this case everyone is getting paid to spend time with this thing, so it's more like an AI-first internal Stack Overflow. I have no evidence that a public version would do well when people don't get paid to comment and rate.

noduerme 4 days ago | parent [-]

I was making a point elsewhere in this thread that the best way to learn is to teach; and that's why Stack Overflow was valuable for contributors, as a way of honing their skills. Not necessarily for points.

What you need to do, in your organization, is to identify the people who actually care about teaching and learning for their own sake, as opposed to the people who do things for money, and to find a way to promote the people with the inclination to learn and teach into higher positions. Because it shows they aren't greedy, they aren't cheating, and they probably will have your organization's best interests at heart (even if that is completely naïve and they would be better off taking a long vacation - even if they are explicitly the people who claim to dislike your organization the most). I am not talking about people who simply complain. I mean people who show up and do amazing work on a very low level, and teach other people to do it - because they are committed to their jobs. Even if they are completely uneducated.

For me, the only people I trust are people who exhibit this behavior: They do something above and beyond which they manifestly did not need to do, without credit, in favor of the project I'm spending my time on.

>> But then beyond that there are hard/niche questions where the AI's are wrong often and humans also have a hard time getting it right, but with a larger discussion and multiple minds chewing the problem one can get to a more correct answer often by process of elimination.

Humans aren't even good at this, most of the time, but one has to consider AI output to be almost meaningless babble.

May I say that the process of elimination is actually not the most important aspect of that type of meeting. It is the surfacing of things you wouldn't have considered - even if they are eliminated later in debate - which makes the process valuable.

cpa 4 days ago | parent | prev | next [-]

Am I reading an AI trying to trick me into becoming its subordinate?

dataviz1000 4 days ago | parent | next [-]

In 2014, one benefit of Stack Overflow / Exchange was that a user searching for work could mention that they were a top 10% contributor. It actually had real-world value. The equivalent today is users with extensive examples of completed projects on GitHub that can be cloned and run. OP's solution, if contained in GitHub repositories, will eventually get included in a model's training data. Moreover, the solution will definitely be used for training because it now exists on Hacker News.

scirob 4 days ago | parent | next [-]

I had a conversation with a couple of accountants / tax-advisor types about them participating in something like this for their specialty. The response was actually 100% positive because they know that there are parts of their job that the AI can never take: 1) filings require a human with a government-approved license, 2) there is hidden information about which tax optimizations are higher or lower risk, based on what they know from their other clients, and 3) humans want another human to make them feel good that their tax situation is well taken care of.

But also many said that it would be better if one wraps this in an agency so the leads that are generated from the AI accounting questions only go to a few people instead of making it fully public stackexchange like.

So +1 point -1 point for the idea of a public version.

noduerme 4 days ago | parent | prev [-]

LOL. As a top 10% contributor on Stack Overflow, and on FlashKit before that, I can assure you that any real world value attached to that status was always imaginary, or at least highly overrated.

Mainly, it was good at making you feel useful and at honing your own craft - because providing answers forced you to think about other people's questions and problems as if they were little puzzles you could solve in a few minutes. Kept you sharp. It was like a game to play in your spare time. That was the reason to contribute, not the points.

imcritic 4 days ago | parent | prev | next [-]

Yeah, they didn't even bother to suggest paying you with tokens for the job well done! The audacity!

scirob 4 days ago | parent [-]

hehe yea this exists of course. like these guys https://yupp.ai/ they have not announced the tokens but there are points and they got all their VC money from web3 VCs. I'm sure there are others trying

scirob 4 days ago | parent | prev [-]

hehe, damn I did let an AI fix my grammer and they promptly put the classic tell of — U+2014 in there

j45 4 days ago | parent | prev | next [-]

AI is generally set up to return the "best" answer, defined as the most common answer, not the most correct, efficient, or effective answer, unless the underlying data leans that way.

It's why AI based web search isn't behaving like google based search. People clicking on the best results really was a signal for google on what solution was being sought. Generally, I don't know that LLMs are covering this type of feedback loop.

whilenot-dev 4 days ago | parent | prev | next [-]

That seems like a horrible core idea. How is that different from data labeling or model evaluation?

Human beings want to help out other human beings, spread knowledge and might want to get recognition for it. Manually correcting (3 different) automation efforts seems like incredibly monotonous, unrewarding labour for a race to the bottom. Nobody should spend their time correcting AI models without compensation.

scirob 4 days ago | parent [-]

Great point, thanks for the reality check.

Speaking of evals, the other day I found out that most of the people who contributed to Humanity's Last Exam https://agi.safe.ai/ got paid >$2k each. So just adding to your point.

mcintyre1994 4 days ago | parent | prev [-]

I think this could be really cool, but the tricky thing would be knowing when to use it instead of just asking the question directly to whichever AI. It’s hard to know that you’ll benefit from the extra context and some human input unless you already have a pretty good idea about the topic.

imcritic 4 days ago | parent [-]

Presumably over time said AI could figure out if your question had already been answered and in that case would just redirect you to the old thread instead.

achille 4 days ago | parent | prev | next [-]

thanks for sharing that, it was simple, neat, elegant.

this sent me down a rabbit hole -- I asked a few models to solve that same problem, then followed up with a request to optimize it so it runs more efficiently.

chatgpt & gemini's solutions were buggy, but claude solved it, and actually found a solution that is even more efficient. It only needs to compute sqrt once per iteration. It's more complex however.

                   yours  claude
  ------------------------------
  Time (ns/call)    40.5   38.3
  sqrt per iter        3      1
  Accuracy        4.8e-7 4.8e-7

Claude's trick: instead of calling sin/cos each iteration, it rotates the existing (cos,sin) pair by the small Newton step and renormalizes:

  // Rotate (c,s) by angle dt, then renormalize to unit circle
  float nc = c + dt*s, ns = s - dt*c;
  float len = sqrt(nc*nc + ns*ns);
  c = nc/len; s = ns/len;

See: https://gist.github.com/achille/d1eadf82aa54056b9ded7706e8f5...
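To make the update above easier to check in isolation, here is a small self-contained C++ sketch of just that step. The function name and the use of double instead of float are assumptions for illustration. As written, the update moves the stored angle by exactly -atan(dt), which is close to -dt for small steps. That matches a Newton update of the form t <- t - dt, although the sign convention used in the linked gist is not shown here and is assumed.

```cpp
#include <cassert>
#include <cmath>

// One update of a stored (cos t, sin t) pair without calling sin or cos.
// Mirrors the snippet above, with double instead of float (an assumption).
// For a unit-length input the pair ends up at angle t - atan(dt).
void step_unit_vector(double& c, double& s, double dt) {
    const double nc = c + dt * s;   // first-order move along the tangent
    const double ns = s - dt * c;
    const double len = std::sqrt(nc * nc + ns * ns);  // sqrt(1 + dt^2) for unit input
    assert(len > 0.0);
    c = nc / len;                   // pull the pair back onto the unit circle
    s = ns / len;
}
```

The renormalization is what keeps rounding and first-order error from accumulating in the length of the vector across iterations; only the angle carries the approximation.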

p.s: it seems like Gemini has disabled the ability to share chats. Can anyone else confirm this?

0xfaded 4 days ago | parent [-]

Thanks for pushing this, I've never gone beyond "zero" shotting the prompt (is it still called zero shot with search?)

As a curiosity, it looks like r and q are only ever used as r/q, and therefore a sqrt could be saved by computing rq = sqrt((rx*rx + ry*ry) / (qx*qx + qy*qy)). The if q < 1e-10 is also perhaps not necessary, since this would imply that the ellipse is degenerate. My method won't work in that case anyway.
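A quick sketch of that sqrt-saving idea, in C++ with hypothetical function names: both functions below return r/q, but the second needs one square root instead of two. It assumes (qx, qy) is not the zero vector.

```cpp
#include <cassert>
#include <cmath>

// Two square roots: compute r and q separately, then divide.
double ratio_two_sqrt(double rx, double ry, double qx, double qy) {
    const double r = std::sqrt(rx * rx + ry * ry);
    const double q = std::sqrt(qx * qx + qy * qy);
    return r / q;
}

// One square root: divide the squared lengths first, then take the root.
double ratio_one_sqrt(double rx, double ry, double qx, double qy) {
    const double q2 = qx * qx + qy * qy;
    assert(q2 > 0.0);  // assumption: the query is not at the centre of curvature
    return std::sqrt((rx * rx + ry * ry) / q2);
}
```

One trade-off worth keeping in mind: squaring by hand can overflow or underflow for extreme magnitudes, which std::hypot is designed to avoid, so the single-sqrt form suits inputs of moderate size.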

For the other sqrt, maybe try std::hypot
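A quick sketch of the two suggestions, in Python for brevity. The variable names `rx, ry, qx, qy` follow the comment above; the function names are made up for illustration. All three forms compute the same ratio r/q; the second saves one square root, and the third uses `math.hypot`, which is typically written to avoid overflow in the intermediate squares.

```python
import math

def ratio_two_sqrts(rx, ry, qx, qy):
    # Straightforward form: one sqrt for r, one for q.
    return math.sqrt(rx * rx + ry * ry) / math.sqrt(qx * qx + qy * qy)

def ratio_one_sqrt(rx, ry, qx, qy):
    # r/q = sqrt(r^2 / q^2), so a single sqrt is enough.
    return math.sqrt((rx * rx + ry * ry) / (qx * qx + qy * qy))

def ratio_hypot(rx, ry, qx, qy):
    # Same value via hypot; still two root computations under the hood.
    return math.hypot(rx, ry) / math.hypot(qx, qy)

if __name__ == "__main__":
    args = (3.0, 4.0, 1.0, 2.0)
    print(ratio_two_sqrts(*args), ratio_one_sqrt(*args), ratio_hypot(*args))
```

Whether the single-sqrt form is actually faster depends on the compiler and target, since it trades a sqrt for nothing but keeps the division.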

Finally, for your test set, could you add some highly eccentric cases such as a=1 and b=100?

Thanks for the investigation:)

Edit: BTW, the sin/cos renormalize trick is the same as what tx,ty are doing. It was pointed out to me by another SO member. My original implementation used trig functions

achille 4 days ago | parent [-]

Nice, that worked. It's even faster.

                 yours  yours+opt  claude
  ---------------------------------------
  Time (ns)        40.9      36.4    38.7
  sqrt/iter           3         2       1
  Instructions      207       187     241
Edit: it looks like the claude algorithm fails at high eccentricities. Gave chatgpt pro more context and it worked for 30min and only made marginal improvement on yours, by doing 2 steps then taking a third local step.

https://gist.github.com/achille/23680e9100db87565a8e67038797...

0xfaded 4 days ago | parent [-]

Haha nice, hanging in there by a thread

gchuf 4 days ago | parent | next [-]

Consider updating your answer on SO - I know I'll keep visiting SO for answers like these for quite some time. And enjoy the deserved upvotes :)

HappyPanacea 4 days ago | parent | prev [-]

Do you think you can extend it to distance from a point to an ellipsoid?

0xfaded 4 days ago | parent [-]

Yes, people have done this

weatherlite 4 days ago | parent | prev | next [-]

I can relate. I used to have a decent SO profile (10k+ reputation, I know this isn't crazy but it was mostly on non low hanging fruit answers...it was a grind getting there). I used to be proud of my profile and even put it in my resume like people put their Github. Now - who cares? It would make me look like a dinosaur sharing that profile, and I never go to SO anymore.

davchana 4 days ago | parent | prev | next [-]

I too was very active on SO around 2012; in fact, it had that counter thing showing xyz days visited continuously. Most of my one-liners or snippets for PHP are still the highest voted answers. Even now, when I sometimes google something and an answer comes up, I realize it's me who asked the same question and answered it too.

banku_brougham 4 days ago | parent | next [-]

I have had this experience -- twice with the same answer. There is nothing so amusing in quite this way.

googlehater 3 days ago | parent | prev [-]

I often forget just how much smaller and less siloed the internet was just ~13 years ago.

zellyn 4 days ago | parent | prev | next [-]

Please, start a blog! Hugo + GitHub hosting makes it laughably simple. (Or pick a different stack; that’s just mine.)

Even if you’re worried it’ll be sparse and crappy, isn’t an Internet full of idiosyncratic personal blogs what we all want?

If you want help or encouragement, reach out: zellyn@ most places

Aurornis 4 days ago | parent | next [-]

> Please, start a blog!

The second sentence of the SO post is a link to their blog where it was posted originally. The blog is not a replacement for the function SO served.

0xfaded 4 days ago | parent | prev [-]

It's been a long time, but here is the writeup https://blog.chatfield.io/simple-method-for-distance-to-elli...

OJFord 4 days ago | parent | prev | next [-]

I don't disagree completely by any means, it's an interesting point, but in your SO answer you already point to your blog post explaining it in more detail, so isn't that the answer, you'd just blog about it and not bother with SO?

Then AI finding it (as opposed to already trained well enough on it, I suppose) will still point to it as did your SO answer.

Neywiny 4 days ago | parent | prev | next [-]

Looks like solid code. My only gripe is the shadowing of x. I would prefer to see `for _ in range`. You do redefine it immediately so it's not the most confusing, but it could trip people up especially as it's x and not i or something.

0xfaded 4 days ago | parent [-]

Hahaha thanks, I never noticed that. If I ever print it out and frame it I'll be sure to fix it

noduerme 4 days ago | parent | prev | next [-]

That's pretty nice ;)

I once wrote this humdinger, that's still on my mostly dead personal website from 2010... one of my proudest bits of code besides my poker hand evaluator ;)

The question was, how do you generate a unique number for any two positive integers, where x!=y, such that f(x,y) = f(y,x) but the resulting combined id would not be generated by any other pair of integers. What I came up with was a way to generate a unique key from any set of positive integers which is valid no matter the order, but which doesn't key to any other set.

My idea was to take the radius of a circle that intersected the integer pair in cartesian space. That alone doesn't guarantee the circle won't intersect any other integer pairs... so I had to add to it the phase multiple of sine and cosine which is the same at those two points on the arc. That works out to:

(x^2+y^2)+(sin(atan(x/y))*cos(atan(x/y)))

And means that it doesn't matter which order you feed x and y in, it will generate a unique float for the pair. It reduces to:

x^2+y^2+( (x/y) / (x^2+y^2) )

To add another dimension, just add it to the process and key it to one of the first...

x^2+y^2+z^2+( (x/y) / (x^2+y^2) )+( (x/z) / (x^2+z^2) )

bazzargh 4 days ago | parent [-]

It looks like you have typos? (x^2+y^2)+(sin(atan(x/y))*cos(atan(x/y))) reduces to x^2+y^2+( (x/y) / (x^2/y^2 + 1) ) - not the equation given? Tho it's easier to see that this would be symmetrical if you rearrange it to: x^2+y^2+( (xy) / (x^2+y^2) )

Also, if f(x,y) = x^2+y^2+( (x/y) / (x^2+y^2) ) then f(2,1) is 5.4 and f(1,2) is 5.1? - this is how I noticed the mistake. (the other reduction gives the same answer, 5.4, for both, by symmetry, as you suggest)
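The symmetry claim is easy to check numerically. A small Python sketch, with function names that are mine rather than anything from the original site: `f_trig` is the sin/cos form, `f_as_posted` is the reduction with x/y, and `f_symmetric` is the reduction with xy.

```python
import math

def f_trig(x, y):
    # Original form: x^2 + y^2 + sin(atan(x/y)) * cos(atan(x/y))
    angle = math.atan(x / y)
    return x**2 + y**2 + math.sin(angle) * math.cos(angle)

def f_as_posted(x, y):
    # Reduction as written in the post, with x/y in the numerator.
    return x**2 + y**2 + (x / y) / (x**2 + y**2)

def f_symmetric(x, y):
    # Reduction with x*y in the numerator, symmetric in x and y.
    return x**2 + y**2 + (x * y) / (x**2 + y**2)

if __name__ == "__main__":
    for name, f in [("trig", f_trig), ("posted", f_as_posted), ("symmetric", f_symmetric)]:
        print(name, f(2, 1), f(1, 2))
```

Only the posted reduction gives different values for (2,1) and (1,2); the trig form and the xy form agree with each other in both orders.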

There's a simpler solution which produces integer ids (though they are large): 2^x & 2^y. Another solution is to multiply the xth and yth primes.

I only looked because I was curious how you proved it unique!
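For the prime-product idea, a rough Python sketch. `nth_prime` and `prime_pair_key` are hypothetical names, and trial division is only there to keep the example self-contained; a real version would use a precomputed table. Uniqueness follows from unique factorization, and the order of x and y does not matter because multiplication commutes.

```python
import math

def nth_prime(n):
    """Return the n-th prime (1-indexed) by trial division; fine for small n."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        # candidate is prime if no d in [2, isqrt(candidate)] divides it
        if all(candidate % d for d in range(2, math.isqrt(candidate) + 1)):
            count += 1
    return candidate

def prime_pair_key(x, y):
    """Order-independent key for two positive integers x != y."""
    return nth_prime(x) * nth_prime(y)

if __name__ == "__main__":
    print(prime_pair_key(2, 4), prime_pair_key(4, 2))
```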

noduerme 4 days ago | parent [-]

Hhhhmm. Ok. So I invented this solution in 2009 at what you might call a "peak mental moment", by a pool in Palm Springs, CA, after about 6 hours of writing on napkins. I'm not a mathematician. I don't think I'm even a great programmer, since there are probably much better ways of solving the thing I was trying to solve. And also, I'm not sure how I even came up with the reduction; I probably was wrong or made a typo (missing the +1?), and I'm not even certain how I could come up with it again.

2^x & 2^y ...is the & a bitwise operator...???? That would produce a unique ID? That would be very interesting, is that provable?

Primes take too much time.

The thing I was trying to solve was: I had written a bitcoin poker site from scratch, and I wanted to determine whether any players were colluding with each other. There were too many combinations of players on tables to analyze all their hands versus each other rapidly, so I needed to write a nightly cron job that collated their betting patterns 1 vs 1, 1 vs 2, 1 vs 3... any time 2 or 3 or 4 players were at the same table, I wanted to have a unique signature for that combination of players, regardless of which order they sat in at the table or which order they played their hands in. All the data for each player's action was in a SQL table of hand histories, indexed by playerID and tableID, with all the other playerIDs in the hand in a separate table. At the time, at least, I needed a faster way to query that data so that I could get a unique id from a set of playerIDs that would pull just the data from this massive table where all the same players were in a hand, without having to check the primary playerID column for each one. That was the motivation behind it.

It did work. I'm glad you were curious. I think I kept it as the original algorithm, not the reduced version. But I was much smarter 15 years ago... I haven't had an epiphany like that in awhile (mostly have not needed to, unfortunately).

bazzargh 4 days ago | parent | next [-]

The typo is most likely the extra /, in (x/y)/(x^2+y^2) instead of (xy)/(x^2+y^2).

`2^x & 2^y ...is the & a bitwise operator...???? That would produce a unique ID? That would be very interesting, is that provable?`

Yes, & is bitwise and. It's just treating your players as a bit vector. It's not so much provable as a tautology, it is exactly the property that players x and y are present. It's not _useful_ tho because the field size you'd need to hold the bit vector is enormous.
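A minimal Python sketch of the bit-vector idea. One assumption worth flagging loudly: the sketch sets the bits with bitwise OR (`|`), because ANDing two different powers of two gives 0. For distinct x and y, OR is the same as 2^x + 2^y. The name `player_set_key` is made up for illustration.

```python
def player_set_key(player_ids):
    """Build an order-independent key by setting one bit per player ID.

    Assumes non-negative integer IDs; uses OR to set each bit.
    """
    key = 0
    for pid in player_ids:
        key |= 1 << pid  # set bit number pid
    return key

if __name__ == "__main__":
    print(bin(player_set_key([2, 4])), bin(player_set_key([4, 2])))
    print((1 << 2) & (1 << 4))  # AND of two different single bits
```

Python integers are arbitrary precision, so this runs for large IDs too, but the key grows to as many bits as the largest ID, which is the field-size problem mentioned above.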

As for the problem...it sounds bloom-filter adjacent (a bloom filter of players in a hand would give a single id with a low probability of collision for a set of players; you'd use this to accelerate exact checks), but also like an indexed many-to-many table might have done the job, but all depends on what the actual queries you needed to run were, I'm just idly speculating.

noduerme 4 days ago | parent [-]

At the time, at least, there was no way to index it for all 8 players involved in a hand. Each action taken would be indexed to the player that took it, and I'd need to sweep up adjacent actions for other players in each hand, but only the players who were consistently in lots of hands with that player. I've heard of bloom filters (now, not in 2012)... makes some sense. But the idea was to find some vector that made any set of players unique when running through a linear table, regardless of the order they presented in.

To that extent, I submit my solution as possibly being the best one.

I'm still a bit perplexed by why you say 2^x & 2^y is tautologically sound as a unique way to map f(x,y)==f(y,x), where x and y are nonequal integers. Throwing in the bitwise & makes it seem less safe to me. Why is that provably never replicable between any two pairs of integers?

bazzargh 3 days ago | parent [-]

I'm saying it's a tautology because it's just a binary representation of the set. Suppose we have 8 players, with x and y being 2 and 4: set the 2nd and 4th bits (ie 2^2 & 2^4) and you have 00001010.

But to lay it out: every positive integer is a sum of powers of 2. (this is obvious, since every number is a sum of 1s, ie 2^0). But also every number is a sum of _distinct_ powers of 2: if there are 2 identical powers 2^a+2^a in the sum, then they are replaced by 2^(a+1), this happens recursively until there are no more duplicated powers of 2.

It remains to show that each number has a unique binary representation, ie that there are no two numbers x=2^x1+2^x2+... and y=2^y1+2^y2+... that have the same sum, x=y, but from different powers. Suppose we have a smallest such number, and x1, y1 are the largest powers in each set. Then x1 != y1, because otherwise we could subtract 2^x1 from both numbers and get an _even smaller_ number that has distinct representations, a contradiction. So either x1 < y1 or y1 < x1. Suppose without loss of generality that it's the first (we can just swap labels). Then x<=2^(x1+1)-1 (just summing all powers of 2 from 2^0 to 2^x1) but y>=2^y1>=2^(x1+1)>x, a contradiction.

or, tl;dr just dealing with the case of 2 powers: we want to disprove that there exists a,b,c,d such that

2^a + 2^b = 2^c + 2^d, a>b, c>d, and (a,b) != (c,d).

Suppose a = c, then subtract 2^a from both sides and we have 2^b = 2^d, so b=d, a contradiction.

Suppose a>c; then a >= c+1.

2^c + 2^d < 2^c + 2^c = 2^(c+1).

so

2^c + 2^d <= 2^(c+1) - 1 < 2^(c+1) + 2^b <= 2^a + 2^b

a contradiction.
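A brute-force check is no substitute for the proof, but it is a cheap sanity test of the two-power case. This Python sketch (the name `two_power_sums` and the bound of 16 are arbitrary choices of mine) groups every pair a > b by its sum 2^a + 2^b and confirms that no sum is hit twice.

```python
from itertools import combinations

def two_power_sums(limit):
    """Map each sum 2**a + 2**b (limit > a > b >= 0) to the pairs producing it."""
    sums = {}
    # combinations yields (b, a) with b < a, so every pair appears once
    for b, a in combinations(range(limit), 2):
        sums.setdefault(2**a + 2**b, []).append((a, b))
    return sums

if __name__ == "__main__":
    sums = two_power_sums(16)
    collisions = {total: pairs for total, pairs in sums.items() if len(pairs) > 1}
    print(len(sums), "distinct sums,", len(collisions), "collisions")
```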

noduerme 3 days ago | parent [-]

Thanks for the great response. Honestly, TIL that 2^0 = 1. That was a new one for me and I'm not sure I understand it. I failed pre-Calculus, twice.

Visually I think I can understand the bitwise version now, from reading this. But it wouldn't work for 3 integers, would it?

bazzargh 3 days ago | parent [-]

it works for any number of integers. The first proof above (before tl;dr) is showing that every positive integer has a unique representation as a sum of distinct powers of 2, ie binary, and that no two integers have the same representation. You can watch a lecture about the representation of sets in binary here https://www.youtube.com/watch?v=Iw21xgyN9To (google representing sets with bits for way more like this)

But again it's not useful in practice for very sparse sets: if you have say a million players, with at most 10 at the same poker table, setting 10 bits of a million-bit binary number is super wasteful. Even representing the players as fixed size 20-bit numbers (1 million in binary is 20 bits long), and appending the 10 sorted numbers, means you don't need more than 200 bits to represent this set.
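The "append the sorted fixed-size numbers" option could look something like this in Python. Everything here is a sketch under stated assumptions: IDs are integers from 1 up to 2^20 - 1 (so no chunk is ever zero), and the names `pack_players` / `unpack_players` are hypothetical.

```python
BITS_PER_ID = 20  # enough for IDs below 2**20 = 1,048,576
ID_MASK = (1 << BITS_PER_ID) - 1

def pack_players(player_ids):
    """Concatenate the sorted IDs into one integer, 20 bits per ID."""
    key = 0
    for pid in sorted(player_ids):
        if not 1 <= pid <= ID_MASK:
            raise ValueError(f"player ID out of range: {pid}")
        key = (key << BITS_PER_ID) | pid
    return key

def unpack_players(key):
    """Recover the sorted ID list from a packed key."""
    ids = []
    while key:
        ids.append(key & ID_MASK)  # lowest 20 bits hold the largest ID
        key >>= BITS_PER_ID
    return ids[::-1]

if __name__ == "__main__":
    key = pack_players([1803, 1201, 2903])
    print(key, unpack_players(key), key.bit_length())
```

Sorting first is what makes the key independent of seating order, and ten IDs never need more than 200 bits.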

And you can go much smaller if all you want is to label a _bucket_ that includes this particular set; just hash the 10 numbers to get a short id. Then to query faster for a specific combination of players you construct the hash of that group, query to get everything in that bucket (which may include false positives), then filter this much smaller set of answers.
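Here is a rough Python sketch of the bucket approach, using an in-memory dict in place of a database index. The 4-byte digest, the comma-joined canonical form, and all function names are illustrative assumptions. The final equality check is the "filter" step that removes false positives from hash collisions.

```python
import hashlib
from collections import defaultdict

def bucket_id(player_ids, digest_bytes=4):
    """Short, order-independent label for a set of player IDs."""
    canonical = ",".join(str(pid) for pid in sorted(player_ids))
    return hashlib.blake2b(canonical.encode(), digest_size=digest_bytes).hexdigest()

def build_index(hands):
    """hands: iterable of (hand_id, player_ids) pairs."""
    index = defaultdict(list)
    for hand_id, player_ids in hands:
        index[bucket_id(player_ids)].append((hand_id, frozenset(player_ids)))
    return index

def find_hands(index, player_ids):
    """Fetch the bucket, then filter out any colliding sets exactly."""
    wanted = frozenset(player_ids)
    return [hand_id
            for hand_id, players in index.get(bucket_id(player_ids), [])
            if players == wanted]

if __name__ == "__main__":
    hands = [(1, [1201, 1803, 2903]), (2, [1803, 1201, 2903]), (3, [7, 8, 9])]
    print(find_hands(build_index(hands), [2903, 1201, 1803]))
```

Note this indexes the exact set of players in a hand; finding hands that merely contain a flagged group would need a key per subset of interest.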

bazzargh 4 days ago | parent | prev [-]

BTW, yet another way to do it (more compact than the bitwise and prime options) is the Cantor pairing function https://en.wikipedia.org/wiki/Pairing_function

... z = (x+y+1)(x+y)/2 + y - but you have to sort x,y first to get the order independence you wanted. This function is famously used in the argument that the set of integers and the set of rationals have the same cardinality.
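A short Python sketch of the formula above, with the sort-first step made explicit. `cantor_pair` follows the formula as written; `unordered_pair_key` is a name I made up for the order-independent wrapper, and non-negative integer inputs are assumed.

```python
def cantor_pair(x, y):
    """Cantor pairing for non-negative integers (order matters)."""
    # (x + y + 1) * (x + y) is a product of consecutive integers, so always even
    return (x + y + 1) * (x + y) // 2 + y

def unordered_pair_key(x, y):
    """Order-independent key: sort the pair, then apply the pairing function."""
    lo, hi = sorted((x, y))
    return cantor_pair(lo, hi)

if __name__ == "__main__":
    print(cantor_pair(3, 5), cantor_pair(5, 3))
    print(unordered_pair_key(3, 5), unordered_pair_key(5, 3))
```

For more than two players the function can be nested, e.g. pairing the result with the next sorted ID, though the keys grow quickly.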

noduerme 4 days ago | parent [-]

mm. I did see this when I was figuring it out. The sorting first was the specific thing I wanted to avoid, because it would've been by far the most expensive part of the operation when looking at a million poker hands and trying to target several players for potential collusion.

bazzargh 4 days ago | parent [-]

you're only sorting players within a single hand. so a list of under 10 items? that's trivial

noduerme 3 days ago | parent [-]

So the goal was to generate signatures for 2, 3 or more players and then be able to reference anything in the history table that had that combination of players without doing a full scan and cross-joining the same table multiple times. Specifically to avoid having ten index columns in the history table for each seat's player. This was also prior to JSON querying in mysql. I needed a way to either bake in the combinations at write time, or to generate a unique id at read time in a way that wouldn't require me to query whether playerIDs were [1201,1803,2903] or [1803,1201,2903] etc. Just a one-shot unique signature for that combination of players that could always evaluate the same regardless of the order. If that makes sense. There were other considerations and this was not exactly how it worked, since only certain players were flagged and I was looking for patterns when those particular players were on the same table. It wasn't like every combination of players had a unique id, just a few combinations where I needed to be able to search over a large space to find when they were in the same room together, but disregarding the order they were listed in.

emmelaich 4 days ago | parent | prev | next [-]

You should write it up and submit it to some journal officially. Doesn't matter if it mostly duplicates your own (technically unpublished) work.

PeterStuer 4 days ago | parent | prev | next [-]

SO in 2013 was a different world from the SO of the 2020's. In the latter world your post would have been moderator classified as 'duplicate' of some basic textbook copy/pasted method posted by a karma grinding CS student and closed.

eitland 4 days ago | parent [-]

My experience as well:

Stack Overflow used to (in practice) be a place to ask questions and get help and also help others.

At some point it became all about some mission and not only was it not as useful anymore but it also became a whole lot less fun.

eru 4 days ago | parent | prev | next [-]

I have a similar story about an interesting little advance in computing that I haven't formally published anywhere, but it's at https://cs.stackexchange.com/a/171695/50292

The question boils down to: can you simulate the bulk outcome of a sequence of priority queue operations (insert and delete-minimum) in linear time, or is O(n log n) necessary. Surprisingly, linear time is possible.

RustyRussell 4 days ago | parent | prev | next [-]

On the other hand, I once implemented something to be told later it was novel and probably the optimal solution in the space.

An AI might be more likely to find it...

eviks 4 days ago | parent | prev | next [-]

> Today I don't know where I would publish such a gem.

In the same blog you published it originally, then mentioning it on whatever social media site you use? So same?

fho 4 days ago | parent | prev | next [-]

Then let me quickly say: thank you! I used that algorithm three times in different projects during my academic "career" :-)

rerdavies 4 days ago | parent | prev | next [-]

Reddit is my current go-to for human-sourced info. Search for "reddit your question here". Where on reddit? Not sure. I don't post, tbh, but I do search.

Has the added benefit of NOT returning stackoverflow answers, since StackOverflow seems to have rotted out these days, and been taken over by the "rejection police".

mightybyte 4 days ago | parent | prev | next [-]

Sounds like this should live in Wikipedia somewhere on https://en.wikipedia.org/wiki/Ellipse...or maybe a related but more CS-focused page.

kwakubiney 4 days ago | parent | prev | next [-]

Naive question maybe but how haven’t the models been trained on your answer if it’s on SO?

wesammikhail 4 days ago | parent [-]

Models are NOT search engines.

Even if LLMs were trained on the answer, that doesn't mean they'll ever recommend it. Regardless of how accurate it may be. LLMs are black box next token predictors and that's part of the issue.

jmux 4 days ago | parent | prev | next [-]

This is a really nice method for solving that problem! I wouldn’t have thought to use the tangents but that makes perfect sense

baq 4 days ago | parent | prev | next [-]

If you ask me your blog post is basically a paper, I’d publish to arxiv.

userbinator 4 days ago | parent | prev | next [-]

That algorithm reminds me of raymarching signed distance functions.

lbj 4 days ago | parent | prev | next [-]

Really great write-up, thanks for sharing it again!

techsystems 4 days ago | parent | prev | next [-]

Amazing work!

mmaaz 4 days ago | parent | prev | next [-]

Very cool!

qwertox 4 days ago | parent | prev [-]

Why did SO decide to do that to us? To not invest in AI and then, iirc, claim ownership of our contributions. I sometimes go back to answers I gave, even ones where I answered my own questions.

socalgal2 4 days ago | parent [-]

Decide to do what?

SO didn't claim contributions. They're still CC-BY-SA

https://stackoverflow.com/help/licensing

AFAICT all they did is stop providing dumps. That doesn't change the license.

I was very active. In fact, I'm actually upset at myself for spending so much time there. That said, I always thought I was getting fair value. They provided free hosting, I got answers and got to contribute answers for others.