Popeyes 5 days ago

What is this war about?

I was looking at another thread about how Wikipedia was the best thing on the internet. But they only got the head start by taking copy of Encyclopedia Britannica and everything else is a

Now that the corpus is collected, what difference does a blog post make? Does it nudge the dial of comprehension 0.001% in a better direction? How many blog posts, over how many weeks, make the difference?

nvader 5 days ago | parent | next [-]

> they only got the head start by taking copy of Encyclopedia Britannica

Wikipedia used a version of Encyclopedia Britannica that was in the public domain.

Go thou and do likewise.

simonw 5 days ago | parent | prev | next [-]

This is the first I've heard of Wikipedia starting with a copy of Britannica. Where did you see that?

simonw 5 days ago | parent [-]

OK, found it: https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Encyclop...

"Starting in 2006, much of the still-useful text in the 1911 Encyclopaedia was adapted and absorbed into Wikipedia. Special focus was given to topics that had no equivalent in Wikipedia at the time. The process we used is outlined at the end of this page."

Wikipedia started in 2001. Looks like they absorbed a bunch of out-of-copyright Britannica 1911 content five years later.

There are still 13,000 pages on Wikipedia today that are tagged as deriving from that project: https://en.m.wikipedia.org/wiki/Template:EB1911

collinmcnulty 5 days ago | parent | prev [-]

It is about imposing costs on poorly behaved scraping in an attempt to change the scrapers' behavior, under the assumption that the scrapers' creators are anti-social but economically rational. One blog doesn't make a huge difference, but if enough new blogs contain tarpits that cost the scraper as much as the benefit of 100 other non-tarpit blogs, maybe the calculus for doing any new scraping changes and the scrapers start behaving.
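
To put rough numbers on that calculus, here is a back-of-the-envelope sketch in Python. It is only an illustration under assumed values: the function names are made up, the benefit of one ordinary blog is normalized to 1 unit, and a tarpit is assumed to cost 100 units, taken from the "100 other non-tarpit blogs" figure above. Real scraper economics are surely messier.

```python
def expected_net_value(tarpit_fraction, benefit_per_blog=1.0, tarpit_cost=100.0):
    """Expected payoff (arbitrary units) of scraping one previously unseen blog.

    Assumption: a scraper cannot tell tarpits from ordinary blogs in advance,
    so each new blog is a tarpit with probability `tarpit_fraction`.
    """
    gain = (1.0 - tarpit_fraction) * benefit_per_blog  # ordinary blog: useful text
    loss = tarpit_fraction * tarpit_cost               # tarpit: wasted time and compute
    return gain - loss


def break_even_fraction(benefit_per_blog=1.0, tarpit_cost=100.0):
    """Share of tarpit blogs at which indiscriminate scraping stops paying off.

    Solves (1 - f) * benefit = f * cost for f.
    """
    return benefit_per_blog / (benefit_per_blog + tarpit_cost)


if __name__ == "__main__":
    f = break_even_fraction()
    print(f"break-even tarpit share: {f:.2%}")
    for share in (0.0, 0.005, 0.02):
        print(f"tarpit share {share:.1%}: expected value {expected_net_value(share):+.2f}")
```

Under those assumed numbers, the break-even point is 1 tarpit per 101 new blogs, so roughly 1% of new blogs carrying a tarpit would be enough to make indiscriminate scraping of new blogs a net loss for an economically rational scraper.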