snaking0776 3 hours ago

I wonder if our difference in view could be an instance of the jagged nature of AI’s intelligence. I do computational research in a basic science, so I write code or build models basically all day, and that work is (occasionally) novel. I would say that I’ve noticed exponential improvements in parts of my job but certainly not all. For example, if I’m trying to visualize a concept from a paper I now go straight to Codex, give it the paper, and describe a webapp which allows me to play with the model in a way that wasn’t possible one year ago (this is great for teaching btw). If I have a script that I want to generalize, add better metrics to, or set up for running on a cluster, I use Codex and it does great.

Where it fails me, though, is exactly when I’m doing something novel, like developing a new model or a new method to process data. I’ve tried many times to one-shot these ideas with detailed descriptions of what I want, how I’d like to generate abstractions, etc., and it almost always ends up changing what I want into what I can only describe as something which better matches its training data. It often quietly changes key details, which means that I have to delete the whole thing and start over. Just today this happened. On this level of task I’ve found that my workflow and pace of iteration haven’t really changed at all in the last year. I still have to go and explain in detail, on a function-by-function level, what I want in much the same way I did a year ago. While that’s obviously a harder task, it seems to me like the task this whole long-term exponential argument hinges on. I obviously could be wrong, and maybe an LLM with an eval loop will do all of this for us, but it still seems quite bad at anything without a clear definition of “good”.

I’m personally much more concerned about autonomous weapons, surveillance, and people plugging these things into places they don’t belong to avoid responsibility than I am about the general possibility of these models being smarter than me in every way. But obviously I could be wrong on this and just be using it incorrectly, hence the question.