embedding-shape 8 hours ago

> I hope people realize that tools like caveman are mostly joke/prank projects

This seems to be a common thread in the LLM ecosystem: someone starts a project for shits and giggles, makes it public, most people get the joke, others take it seriously, the author eventually tries to turn the joke project into a VC-funded business, some people stand watching with their jaws open, and the world moves on.

simonw 7 hours ago | parent | next [-]

I was convinced https://github.com/memvid/memvid was a joke until it turned out it wasn't.

embedding-shape 7 hours ago | parent | next [-]

To be fair, most of us looked at GPT-1 and GPT-2 as fun, unserious jokes until they started putting together sentences that actually read like real text. I remember laughing with a group of friends about some early generated texts. Little did we know.

Alifatisk 7 hours ago | parent | next [-]

Are there any public records I can see of GPT-1 and GPT-2 output, and of how they were marketed?

embedding-shape 7 hours ago | parent | next [-]

HN submissions have a bunch of examples in them, but it's worth remembering they were released as "look at this somewhat cool and potentially useful stuff" rather than marketed as tools, the way LLMs are today.

https://news.ycombinator.com/item?id=21454273 / https://news.ycombinator.com/item?id=19830042 - OpenAI Releases Largest GPT-2 Text Generation Model

HN search for GPT between 2018-2020, lots of results, lots of discussions: https://hn.algolia.com/?dateEnd=1577836800&dateRange=custom&...

mlsu 6 hours ago | parent | prev | next [-]

I was first made aware of GPT-2 from reading Gwern -- "huh, that sounds interesting" -- but didn't really start reading model output until I saw this subreddit:

https://www.reddit.com/r/SubSimulatorGPT2/

There is a companion subreddit, where real people discuss what the bots are posting:

https://www.reddit.com/r/SubSimulatorGPT2Meta/

You can dig around at some of the older posts in there.

walthamstow 7 hours ago | parent | prev | next [-]

I don't think it was marketed as such; they were research projects. GPT-3 was the first to be sold via API.

PufPufPuf 4 hours ago | parent | prev | next [-]

I used GPT-2 (fine-tuned) to generate Peppa Pig cartoons; it was cutely incoherent: https://youtu.be/B21EJQjWUeQ

maplethorpe 6 hours ago | parent | prev | next [-]

From a 2019 news article:

> New AI fake text generator may be too dangerous to release, say creators

> The Elon Musk-backed nonprofit company OpenAI declines to release research publicly for fear of misuse.

> OpenAI, a nonprofit research company backed by Elon Musk, Reid Hoffman, Sam Altman, and others, says its new AI model, called GPT2, is so good and the risk of malicious use so high that it is breaking from its normal practice of releasing the full research to the public in order to allow more time to discuss the ramifications of the technological breakthrough.

https://www.theguardian.com/technology/2019/feb/14/elon-musk...

ethbr1 6 hours ago | parent [-]

Aka 'We cared about misuse right up until it became apparent there was profit to be had.'

OpenAI sure speedran the Google and Facebook 'Don't be evil' -> 'Optimize for money' transition.

sfn42 5 hours ago | parent [-]

Or - making sensational statements gets attention. A dangerous tool is necessarily a powerful tool, so that statement is pretty much exactly what you'd say if you wanted to generate hype, make people excited and curious about your mysterious product that you won't let them use.

eric_h 5 hours ago | parent | next [-]

Much like what Anthropic very recently did re: Mythos

xpe an hour ago | parent | prev [-]

Think about all the possible explanations carefully. Weight them based on the best information you have.

(I think the most likely explanation for Mythos is that it's asymmetrically a very big deal. Come to your own conclusions, but don't simply fall back on the "oh, this fits the hype pattern" thought-terminating cliché.)

Also be aware of what you want to see. If you want the world to fit your narrative, you're more likely to construct explanations for that. (In my friend group, at least, I feel like most fall prey to this some of the time, including myself. These people are successful and intelligent by most measures.)

Then make a plan to become more disciplined about thinking clearly and probabilistically. Make it a system, not just something you do sometimes. I recommend the book "The Scout Mindset".

Concretely, if one hasn't spent a couple of quality hours really studying AI safety, I think one is probably missing out. Dan Hendrycks has a great book.

wat10000 6 hours ago | parent | prev | next [-]

You can run GPT2! Here's the medium model: https://huggingface.co/openai-community/gpt2-medium

I will now have it continue this comment:

I've been running gps for a long time, and I always liked that there was something in my pocket (and not just me). One day when driving to work on the highway with no GPS app installed, I noticed one of the drivers had gone out after 5 hours without looking. He never came back! What's up with this? So i thought it would be cool if a community can create an open source GPT2 application which will allow you not only to get around using your smartphone but also track how long you've been driving and use that data in the future for improving yourself...and I think everyone is pretty interested.

[Updated on July 20] I'll have this running from here, along with a few other features such as: - an update of my Google Maps app to take advantage it's GPS capabilities (it does not yet support driving directions) - GPT2 integration into your favorite web browser so you can access data straight from the dashboard without leaving any site! Here is what I got working.

[Updated on July 20]

fancyfredbot 2 hours ago | parent [-]

Wow, that is terrible. In my memory GPT-2 was more interesting than that. I remember thinking it could pass a Turing test, but that output is barely better than a Markov chain.

I guess I was using the large model?
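For a sense of the baseline being invoked here, a word-level Markov chain text generator really is only a few lines. This is a minimal sketch of my own (`build_chain` and `generate` are made-up helper names, not from any library):

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        chain[cur].append(nxt)
    return chain

def generate(chain, start, length=10, rng=random):
    """Walk the chain from a start word, picking each successor at random."""
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break  # dead end: the last word never had a successor
        out.append(rng.choice(followers))
    return " ".join(out)
```

Unlike GPT-2, this only ever conditions on the single previous word, which is why its output wanders even faster than the comment above.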

sillysaurusx 29 minutes ago | parent | next [-]

There’s an art to GPT sampling: you have to use temperature 0.7. People never believe it makes such a massive difference, but it does.
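Mechanically, temperature just divides the logits before the softmax, so lower values sharpen the distribution toward the top tokens. A minimal sketch (this is not OpenAI's sampler, and `sample_with_temperature` is a hypothetical helper name):

```python
import math
import random

def sample_with_temperature(logits, temperature=0.7, rng=random):
    """Sample a token index from raw logits after temperature scaling.

    Lower temperature sharpens the distribution toward the top tokens;
    temperature 1.0 leaves the model's distribution unchanged.
    """
    scaled = [l / temperature for l in logits]
    # Softmax, with the max subtracted for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting probabilities.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

As temperature approaches 0 this becomes greedy argmax; very high temperatures flatten toward uniform, which is where the incoherent rambling comes from.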

wat10000 an hour ago | parent | prev | next [-]

Probably a much better prompt, too. I just literally pasted in the top part of my comment and let fly to see what would happen.

daveguy 2 hours ago | parent | prev [-]

Here is the XL model, roughly 4x the size of the medium model. Still just 1.5B parameters, but on the bright side it was trained pre-wordslop.

https://huggingface.co/openai-community/gpt2-xl

Bombthecat 7 hours ago | parent | prev [-]

And now GPT is laughing while it replaces coders, lol

MarcelOlsz 7 hours ago | parent | prev | next [-]

Why? It doesn't have jokey copy. Any thoughts on claude-mem[0] + context-mode[1]?

[0] https://github.com/thedotmack/claude-mem

[1] https://github.com/mksglu/context-mode

simonw 7 hours ago | parent [-]

The big idea with Memvid was to store embedding vector data as frames in a video file. That didn't seem like a serious idea to me.
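For what it's worth, the basic trick is easy to sketch: a float32 vector becomes a run of bytes, and those bytes become rows of a grayscale "frame". This is my own toy reconstruction of the idea, not Memvid's actual code, and note the round trip is only exact if the video codec is lossless, which real codecs generally are not:

```python
import struct

def vector_to_frame(vec, width):
    """Pack a float32 vector into rows of bytes, zero-padding the last row."""
    raw = struct.pack(f"<{len(vec)}f", *vec)
    raw += b"\x00" * (-len(raw) % width)  # pad to a full row
    return [list(raw[i:i + width]) for i in range(0, len(raw), width)]

def frame_to_vector(frame, n):
    """Reverse of vector_to_frame: flatten the pixel rows and unpack n floats."""
    raw = bytes(b for row in frame for b in row)
    return list(struct.unpack(f"<{n}f", raw[: 4 * n]))
```

The lossy-codec problem is part of why storing embeddings this way reads as unserious: any compression artifact corrupts the recovered floats.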

nico 7 hours ago | parent | next [-]

Very cool idea. I've been playing with a similar concept: break one image down into smaller self-similar images, order them by data similarity, and use them as frames for a video.

You can then reconstruct the original image by doing the reverse: extracting frames from the video, then piecing them together to recreate the original bigger picture.

Results seem to really depend on the data. Sometimes the video version is smaller than the big picture; sometimes it's the other way around. So you can technically compress some videos by extracting frames, composing a big picture from them, and just compressing that with JPEG.
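The round trip described above (minus the video codec and the similarity ordering) can be sketched with plain nested lists; `split_tiles` and `reassemble` are hypothetical helpers of my own, not code from any project. Because each tile records its origin, reordering the tiles (e.g. by similarity) doesn't affect reconstruction:

```python
def split_tiles(image, tile):
    """Split a 2D pixel grid into (row, col, tile) records, row-major."""
    h, w = len(image), len(image[0])
    tiles = []
    for r in range(0, h, tile):
        for c in range(0, w, tile):
            block = [row[c:c + tile] for row in image[r:r + tile]]
            tiles.append((r, c, block))
    return tiles

def reassemble(tiles, h, w):
    """Reverse of split_tiles: paste each tile back at its recorded origin."""
    image = [[0] * w for _ in range(h)]
    for r, c, block in tiles:
        for dr, row in enumerate(block):
            for dc, px in enumerate(row):
                image[r + dr][c + dc] = px
    return image
```

Whether the reordered tile stream compresses better than the original then comes down entirely to how well the codec exploits frame-to-frame similarity.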

jermaustin1 7 hours ago | parent | prev [-]

> embedding vector data as frames in a video file

Interesting. When I heard about it, I read the readme, and I didn't take that literally. I assumed it meant video frames were used as inspiration.

I've never used it or looked deeper than that. My LLM memory "project" is essentially a `dict<"about", list<"memory">>`. The keys and memories are all embeddings, so vector-searchable. I'm sure it's naive and dumb, but it works for the tiny agents I write.
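A minimal sketch of that kind of store, assuming toy hand-written 2-D vectors in place of real model embeddings (`MemoryStore` and `cosine` are hypothetical names, not from the comment's actual code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Topic -> list of memories, with embeddings kept alongside the text
    so both the topic lookup and the memory lookup are vector searches."""

    def __init__(self):
        # Each entry: (topic_text, topic_vec, [(memory_text, memory_vec), ...])
        self.topics = []

    def add(self, topic, topic_vec, memory, memory_vec):
        for t, _, mems in self.topics:
            if t == topic:
                mems.append((memory, memory_vec))
                return
        self.topics.append((topic, topic_vec, [(memory, memory_vec)]))

    def search(self, query_vec, k=1):
        """Pick the topic closest to the query, then rank its memories."""
        if not self.topics:
            return []
        _, _, mems = max(self.topics, key=lambda t: cosine(t[1], query_vec))
        ranked = sorted(mems, key=lambda m: cosine(m[1], query_vec), reverse=True)
        return [text for text, _ in ranked[:k]]
```

In a real agent the vectors would come from an embedding model and the linear scans would be replaced by a vector index, but for a handful of memories the naive version is fine.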

niuzeta 7 hours ago | parent | prev | next [-]

Just reading through the readme, I was fairly sure this was well-written satire by the time I hit "Smart Frames".

Honestly, part of me still thinks this is a satire project, but who knows.

DiffTheEnder 7 hours ago | parent | prev | next [-]

Is this... just one file acting as memory?

msikora an hour ago | parent | prev | next [-]

This has been a thing since way before AI. Does anyone remember Yo, the single-button social media app that raised $1M in 2014?

combobyte 6 hours ago | parent | prev | next [-]

> most people get the joke

I hope you're right, but from my own personal experience I think you're being way too generous.

dakolli 5 hours ago | parent | prev | next [-]

It's the same as the crypto/NFT hype cycles, except this time one of the joke projects is going to crash the economy.

imiric 7 hours ago | parent | prev [-]

A major reason for that is that there's no way to objectively evaluate the performance of LLMs. So the meme projects are just as valid as the serious ones, since the merits of both are based entirely on anecdata.

It also doesn't help that projects and practices are promoted and adopted based on influencer clout. Karpathy's takes will drown out ones from "lesser" personas, whether they have any value or not.