mindwok 2 days ago

I'm not yet convinced (though I remain open to the idea) that AI agents are going to be a widely adopted pattern in the way people on LinkedIn suggest.

The way I use AI today is by keeping a pretty tight leash on it, a la Claude Code and Cursor. Not because the models aren't good enough, but because I like to weigh in frequently to provide taste and direction. Giving the AI more agency isn't necessarily desirable, because I want to provide that taste.

Maybe that'll change as I do more and new ergonomics reveal themselves, but right now I don't really want AI that's too agentic. Otherwise, I kind of lose connection to it.

thimabi 2 days ago | parent | next [-]

Do you think that, over time, as you learn how the models behave, simply providing more and better context and instructions can fill this gap of wanting to give taste and direction to the models’ outputs and actions?

My experience is that, for many workflows, well-done “prompt engineering” is more than enough to make AI models behave the way we’d like without constantly needing us to weigh in.

mindwok 2 days ago | parent | next [-]

I suppose it's possible, although the models would have to have a really nuanced understanding of my tastes, and even then it seems doubtful.

If we use a real world analogy, think of someone like an architect designing your house. I'm still going to be heavily involved in the design of my house, regardless of how skilled and tasteful the architect is. It's fundamentally an expression of myself - delegating that basically destroys the point of the exercise. I feel the same for a lot of the stuff I'm building with AI now.

thimabi 2 days ago | parent [-]

Can you share some examples of things you’ve been building with AI?

From your comments, I’d venture a guess that you see your AI-assisted work as a creative endeavor — an expression of your creativity.

I certainly wouldn’t get my hopes up for AI to make innovative jokes, poems and the like. Yet for things that can converge on specific guidelines for matters of taste and preferences, like coding, I’ve been increasingly impressed by how well AI models adapt to our human wishes, even when expressed in ever longer prompts.

QuadmasterXLII 2 days ago | parent | next [-]

One example: as a trial, I wanted to work out how frequently a 1400-rated chess player gets caught by a particular opening trap. I intended to check this for all the traps, so it needed to be fast. With a surprising amount of handholding, Claude Code downloaded the relevant file from Lichess. Its method of computing the probability was wrong, so I told it the formula to use and it got the right answer, but incredibly slowly. I asked it to precompute and cache a data structure to accelerate these queries, and it splashed around ineffectually with SQLite for a long time while I made dinner. I came back and clarified that just sorting all the games in the rating range and pickling that list of strings was a fine data structure, and that binary search could then answer the probability query in O(log n) time. It managed to use binary search in O(n) time, so I folded and wrote the hot loop myself. This got the query down to ~1 ms.

In the end the agentic coding bit was garbage, but I appreciated Claude's help with writing the boilerplate to interface with Stockfish.
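
For the curious, here is a minimal sketch of the sort/pickle/bisect approach described above. The names and the exact probability definition are illustrative assumptions, not the commenter's actual code:

    import bisect
    import pickle

    def build_index(games: list[str], path: str) -> None:
        # Precompute once: sort the games' move strings and pickle the list.
        games.sort()
        with open(path, "wb") as f:
            pickle.dump(games, f)

    def count_prefix(sorted_games: list[str], prefix: str) -> int:
        # Count games whose move string starts with `prefix` using two
        # binary searches; "\uffff" sorts after any character that appears
        # in standard move notation, so it bounds the half-open range.
        lo = bisect.bisect_left(sorted_games, prefix)
        hi = bisect.bisect_left(sorted_games, prefix + "\uffff")
        return hi - lo

    def trap_frequency(sorted_games: list[str], opening: str, trap_line: str) -> float:
        # Fraction of games reaching `opening` that continue into `trap_line`
        # (assumes `trap_line` extends `opening`).
        reached = count_prefix(sorted_games, opening)
        return count_prefix(sorted_games, trap_line) / reached if reached else 0.0

Each query is just two O(log n) bisects over the sorted list, which is consistent with the ~1 ms figure above.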

mindwok 2 days ago | parent | prev [-]

I use AI for coding - most of the projects I've built have been fun toys (chore tracking apps, Flutter apps to help my parents), but I've also built one commercial money making app.

I do agree - the models have good taste and often do things that delight me, but there's always room for me to inject my taste. For example, I don't want the AI to choose what state management solution I use for my Flutter app because I have strong opinions about that.

aabaker99 2 days ago | parent [-]

What’s the best state management in Flutter?

mindwok 2 days ago | parent [-]

Oh no we've wandered into a flamewar...

I like Bloc the most!

heavyset_go 2 days ago | parent | prev | next [-]

Look at what happens whenever models are updated or new models come out: previous "good" prompts might not return the expected results.

What's good prompting for one model can be bad for another.

apwell23 2 days ago | parent | prev | next [-]

Taste cannot be reduced to a bunch of instructions.

troupo 2 days ago | parent | prev [-]

> knowing how the models behave, simply providing more/better context and instructions can fill this gap

No.

--- start quote ---

prompt engineering is nothing but an attempt to reverse-engineer a non-deterministic black box for which any of the parameters below are unknown:

- training set

- weights

- constraints on the model

- layers between you and the model that transform both your input and the model's output that can change at any time

- availability of compute for your specific query

- and definitely some more details I haven't thought of

https://dmitriid.com/prompting-llms-is-not-engineering

--- end quote ---

A4ET8a8uTh0_v2 a day ago | parent | prev [-]

I think you are being unfairly downvoted, as you raise a valid point. The real question is whether 'prompt engineering' has an edge over 'human resource management' (as this is the obvious end goal here). At this time, the answer is relatively simple, but I am not certain it will remain so.

prmph 2 days ago | parent | prev | next [-]

Exactly. I made a similar comment as this elsewhere on this discussion:

The old adage still applies: there is no free lunch. It makes sense that LLMs are not going to be able to take humans entirely out of the loop.

Think about what it would mean if that were the case: if people, on the basis of a few simple prompts, could let the agents loose and create sophisticated systems without any further input, then there would be nothing to differentiate those systems, and thus they would lose their meaning and value.

If prompting is indeed the new level of abstraction we are working at, then what value is added by asking Claude: make me a note-taking app? A million other people could also issue this same low-effort prompt; thus what is the value added here by the prompter?

chamomeal 2 days ago | parent [-]

I’ve been thinking about that too! If you can only make an app by “vibe coding” it, then anybody else in the world with internet access can make it, too!

Although sometimes the difficult part is knowing what to make, and LLMs are great for people who actually know what they want but don’t know how to do it.

afc 2 days ago | parent | prev [-]

My thinking is that over time I can incrementally codify many of these individual "taste" components as prompts that each review a change and propose suggestions.

For example, a single prompt could tell an LLM to make sure a code change doesn't introduce mutability when the same functionality can be achieved with immutable expressions. Another one could tell it to avoid useless log statements (with my specific description of what that means).

When I want to evaluate a code change, I run all these prompts separately against it, collecting their structured output (via MCP). Of course, I incorporate this into my code agent to provide automated review iterations.

If something slips through and I feel the need to “manually” provide context, I add a new prompt (or figure out how to extend whichever one failed).
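
A minimal sketch of what such a review loop could look like; the prompt texts and the call_llm helper are illustrative placeholders for whatever LLM client is actually used:

    import json

    # One independent "taste" rule per prompt, so rules can be added,
    # refined, or removed without touching the others.
    REVIEW_PROMPTS = {
        "immutability": "Flag changes that introduce mutable state where an "
                        "immutable expression would achieve the same result.",
        "logging": "Flag log statements that add no diagnostic value.",
    }

    def call_llm(system: str, user: str) -> str:
        # Placeholder for a real LLM client call; expected to return JSON.
        raise NotImplementedError

    def review_change(diff: str) -> dict[str, list[str]]:
        # Run each taste prompt separately against the change and collect
        # its structured suggestions for the code agent to iterate on.
        findings = {}
        for name, rule in REVIEW_PROMPTS.items():
            raw = call_llm(
                system=f'You review code changes. Rule: {rule} '
                       f'Respond with JSON: {{"suggestions": [...]}}.',
                user=diff,
            )
            findings[name] = json.loads(raw)["suggestions"]
        return findings

Running the prompts separately, rather than as one mega-prompt, is what lets each failure map back to a single rule that can be extended on its own.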