Remix.run Logo
libraryofbabel 16 hours ago

I used to teach 19th-century history, and the responses definitely sound like a Victorian-era writer. And they of course sound like writing (books and periodicals etc) rather than "chat": as other responders allude to, the fine-tuning or RL process for making them good at conversation was presumably quite different from what is used for most chatbots, and they're leaning very heavily into the pre-training texts. We don't have any living Victorians to RLHF on: we just have what they wrote.

To go a little deeper on the idea of 19th-century "chat": I did a PhD on this period and yet I would be hard-pushed to tell you what actual 19th-century conversations were like. There are plenty of literary depictions of conversation from the 19th century of presumably varying levels of accuracy, but we don't really have great direct historical sources of everyday human conversations until sound recording technology got good in the 20th century. Even good 19th-century transcripts of actual human speech tend to be from formal things like court testimony or parliamentary speeches, not everyday interactions. The vast majority of human communication in the premodern past was the spoken word, and it's almost all invisible in the historical sources.

Anyway, this is a really interesting project, and I'm looking forward to trying the models out myself!

nemomarx 15 hours ago | parent | next [-]

I wonder if the historical format you might want to look at for "Chat" is letters? Definitely wordier segments, but it's at least the back and forth feel and we often have complete correspondence over long stretches from certain figures.

This would probably get easier towards the start of the 20th century ofc

libraryofbabel 15 hours ago | parent [-]

Good point, informal letters might actually be a better source - AI chat is (usually) a written rather than spoken interaction after all! And we do have a lot transcribed collections of letters to train on, although they’re mostly from people who were famous or became famous, which certainly introduces some bias.

dleeftink 15 hours ago | parent | prev | next [-]

While not specifically Victorian, couldn't we learn much from what daily conversations were like by looking at surviving oral cultures, or other relatively secluded communal pockets? I'd also say time and progress are not always equally distributed, and even within geographical regions (as the U.K.) there are likely large differences in the rate of language shifts since then, some possibly surviving well into the 20th century.

NooneAtAll3 8 hours ago | parent | prev | next [-]

don't we have parlament transcripts? I remember something about Germany (or maybe even Prussia) developing fast script to preserve 1-to-1 what was said

bryancoxwell 14 hours ago | parent | prev [-]

Fascinating, thanks for sharing