sp0rk 4 days ago

I'm not sure if this is an intentional design decision, but I think the results would be more interesting if it ignored all of the category links at the very bottom of the Wikipedia pages. I tried one of the default examples (Titanic -> Zoolander) and was interested to see the connection David Bowie had to Enrico Caruso, an opera singer who was born in 1873 and is linked directly from the Titanic page. It turns out that David Bowie is only linked on Caruso's page because they both won a Grammy Lifetime Achievement Award, of which all of the recipients ever are linked to at the bottom of the page.

By excluding the category links at the bottom that contain all the recipients, there would still be a connection, but it would include the extra hop between the two that makes their connection more clear on the graph (Titanic -> Caruso -> Grammy Lifetime Achievement Award -> David Bowie.)

Otherwise, this is a fun little tool to play around with. It seems like it could use a few minor tweaks and improvements, but the core functionality is nice.

chatmasta 4 days ago | parent | next [-]

Maybe the edges should be weighted based on the link location. If it’s in the bio box it’s high priority (sibling, father, Alma Mater, etc). If it’s in “See Also” it’s medium priority. If it’s a link on a “list of X” page it’s low priority…
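A rough sketch of that weighting idea, not how the tool actually works: treat each link's location as a cost and run Dijkstra's algorithm instead of a plain shortest-hop search. The location labels, the cost values in `LOCATION_COST`, and the toy graph are all made up for illustration.

```python
import heapq

# Hypothetical costs: a lower cost means a higher-priority link location.
LOCATION_COST = {"infobox": 1, "see_also": 2, "list_page": 4}

def cheapest_path(graph, start, goal):
    """Dijkstra over graph[page] = [(target, location), ...]."""
    queue = [(0, start, [start])]
    best = {start: 0}
    while queue:
        cost, page, path = heapq.heappop(queue)
        if page == goal:
            return cost, path
        if cost > best.get(page, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for target, location in graph.get(page, []):
            new_cost = cost + LOCATION_COST[location]
            if new_cost < best.get(target, float("inf")):
                best[target] = new_cost
                heapq.heappush(queue, (new_cost, target, path + [target]))
    return None

# Toy graph, invented for illustration only.
graph = {
    "A": [("B", "list_page"), ("C", "infobox")],
    "C": [("B", "see_also")],
}
print(cheapest_path(graph, "A", "B"))
```

With these costs the two-hop route through "C" (infobox, then See Also) beats the direct "list of X" link, which is the behaviour the weighting is meant to produce.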

chuckadams 3 days ago | parent | prev | next [-]

> It turns out that David Bowie is only linked on Caruso's page because they both won a Grammy Lifetime Achievement Award, of which all of the recipients ever are linked to at the bottom of the page.

Sounds like a perfectly good connection to me, but "exclude categories" could still be a neat feature for exploring more indirect linkage. Not sure it would help in this case though -- is that actually a category page?

re 3 days ago | parent [-]

> is that actually a category page?

What the parent commenter is referring to is actually called a Navbox (https://en.wikipedia.org/wiki/Wikipedia:Navigation_template). Like @chatmasta, I think it would be interesting to label those types of links distinctly and allow excluding them.

Or perhaps alternatively, exclude the contents of those navigation templates, but allow using them as an additional node: David_Bowie -> Template:Grammy_Lifetime_Achievement_Award -> Enrico_Caruso. (In this case, that is redundant with the main non-template Grammy_Lifetime_Achievement_Award page.)
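A minimal sketch of that second option, assuming the crawler can already tell body links apart from navbox membership (the `body_links` and `navboxes` inputs and both function names are hypothetical). Navbox members are not linked to each other directly; each one only links to a shared `Template:` node.

```python
from collections import deque

def build_graph(body_links, navboxes):
    """body_links: page -> set of pages linked from the article body.
    navboxes: template name -> set of pages that carry that navbox."""
    graph = {page: set(targets) for page, targets in body_links.items()}
    for template, members in navboxes.items():
        node = "Template:" + template
        graph.setdefault(node, set()).update(members)
        for member in members:
            # Members reach each other only through the template node.
            graph.setdefault(member, set()).add(node)
    return graph

def shortest_path(graph, start, goal):
    """Plain breadth-first search returning one shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Toy data based on the example in this thread.
body_links = {"Titanic": {"Enrico_Caruso"}}
navboxes = {"Grammy_Lifetime_Achievement_Award": {"Enrico_Caruso", "David_Bowie"}}
graph = build_graph(body_links, navboxes)
print(shortest_path(graph, "Titanic", "David_Bowie"))
```

The path now shows the template as an explicit hop between Caruso and Bowie, so the graph makes it visible why the two are connected.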

layman51 3 days ago | parent | prev | next [-]

Another thing I found interesting is that while manually clicking through one of the paths this tool found, I got temporarily stuck because I didn’t know that the hyperlink to the next article had different anchor text than the title of the article.
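For what it's worth, this comes from piped links in wikitext, where `[[Target|label]]` shows "label" but points at "Target". A small sketch of how a tool could surface both; the regex is a simplification that ignores section anchors, namespaces, and nested markup, and the sample sentence is invented.

```python
import re

# Matches [[Target]] and [[Target|label]]; a simplified pattern, not a full parser.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

def link_labels(wikitext):
    """Return (target title, displayed anchor text) for each wikilink."""
    pairs = []
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        label = match.group(2) or target  # no pipe: the title is the label
        pairs.append((target, label.strip()))
    return pairs

# Invented sample text for illustration.
sample = "He sang at the [[Metropolitan Opera|Met]] and in [[Naples]]."
print(link_labels(sample))
```

Showing the anchor text next to each hop would make the paths easier to follow by hand.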

Affric 3 days ago | parent | prev | next [-]

Good shout. Receiving an award and the like is post hoc and generally not causal for what makes Bowie or Caruso interesting.

It's orthogonal to art.

seu 3 days ago | parent | prev | next [-]

Exactly. The connection between Tetris and Max Weber is... Internet Archive. :shrug:
