mmooss 15 hours ago

> Imagine you could interview thousands of educated individuals from 1913—readers of newspapers, novels, and political treatises—about their views on peace, progress, gender roles, or empire.

I don't mind the experimentation. I'm curious about where someone has found an application of it.

What is the value of such a broad, generic viewpoint? What does it represent? What is it evidence of? The answer to both seems to be 'nothing'.

TSiege an hour ago | parent | next [-]

I agree. This is just make-believe based on a smaller subset of human writing than the LLMs we have today. Its responses are in no way useful because it is a machine mimicking a subset of published works that survived to be digitized. In that sense the "opinions" and "beliefs" are just an averaging of a subset of a subset of humanity pre-1913. I see no value in this to historians. It is really more of a parlor trick, a seance masquerading as science.

mediaman 15 hours ago | parent | prev | next [-]

This is a regurgitation of the old critique of history: what's its purpose? What do you use it for? What is its application?

One answer is that the study of history helps us understand that the views we regard as "obviously correct" today are as contingent on our current social norms and power structures (and their history) as the "obviously correct" views and beliefs of some point in the past.

It's hard for most people to view two different mutually exclusive moral views as both "obviously correct," because we are made of a milieu that only accepts one of them as correct.

We look back at some point in history, and say, well, they believed these things because they were uninformed. They hadn't yet made certain discoveries, or had not yet evolved morally in some way; they had not yet witnessed the power of the atomic bomb, the horrors of chemical warfare, women's suffrage, organized labor, or widespread antibiotics and the fall of extreme infant mortality.

An LLM trained on that history - without interference from the subsequent actual path of history - gives us an interactive compression of the views from a specific point in history without the subsequent coloring by the actual events of history.

In that sense - if you believe there is any redeeming value to history at all; perhaps you do not - this is an excellent project! It's not perfect (it is only built from writings, not what people actually said) but we have no other available mass compression of the social norms of a specific time, untainted by the views of subsequent interpreters.

vintermann 8 hours ago | parent | next [-]

One thing I haven't seen anyone bring up yet in this thread is that there's a big risk of leakage. If even big image models had CSAM sneak into their training material, how can we trust that data from our time hasn't snuck into these historical models?

I've used Google books a lot in the past, and Google's time-filtering feature in searches too. Not to mention Spotify's search features targeting date of production. All had huge temporal mislabeling problems.

DGoettlich 37 minutes ago | parent [-]

[dead]

mmooss 8 hours ago | parent | prev [-]

> This is a regurgitation of the old critique of history: what's its purpose? What do you use it for? What is its application?

Feeling a bit defensive? That is not at all my point; I value history highly and read it regularly. I care about it, thus my questions:

> gives us an interactive compression of the views from a specific point in history without the subsequent coloring by the actual events of history.

What validity does this 'compression' have? What is the definition of a 'compression'? For example, I could create random statistics or verbiage from the data; why would that be any better or worse than this 'compression'?

Interactivity seems to be a negative: It's fun, but it would seem to highly distort the information output from the data, and omits the most valuable parts (unless we luckily stumble across it). I'd much rather have a systematic presentation of the data.

These critiques are not the end of the line; they are a step in innovation, which of course raises challenging questions and, if successful, adapts to the problems. But we still need to grapple with them.

behringer 15 hours ago | parent | prev [-]

It doesn't have to be generic. You can assign genders, ideals, even modern ones, and it should do its best to oblige.