Building your AI agent "toolkit" is becoming the equivalent of the perfect "productivity" setup where you spend your time reading blog posts, watching YouTube videos telling you how to be productive and creating habits and rituals...only to be overtaken by a person with a simple paper list of tasks that they work through.

Plain Claude, ask it to write a plan, review plan, then tell it to execute still works the best in my experience.

▲

cornholio 43 minutes ago | parent | next [-]

Let me give you a counterexample. I'm working on a product for the national market, and i need to do all financial tasks, invoicing, submit to national fiscal databse etc. through a local accounting firm. So i integrate their API in the backend; this is a 100% custom API developed by this small european firm, with a few dozen restful enpoints supporting various accounting operations, and I need to use it programmatically to maintain sync for legal compliance. No LLM ever heard of it. It has a few hundred KB of HTML documentation that Claude can ingest perfectly fine and generate a curl command for, but i don't want to blow my token use and context on every interaction.

So I naturally felt the need to (tell Claude to) build a MCP for this accounting API, and now I ask it to do accounting tasks, and then it just does them. It's really ducking sweet.

Another thing I did was, after a particularly grueling accounting month close out, I've told Claude to extract the general tasks that we accomplished, and build a skill that does it at the end of the month, and now it's like having a junior accountant in at my disposal - it just DOES the things a professional would charge me thousands for.

So both custom project MCPs and skills are super useful in my experience.

	▲	endofreach 11 minutes ago \| parent [-]
		What exactly does it do that a professional would charge you thousands for? (I'm genuinely asking)

▲

lanthissa an hour ago | parent | prev | next [-]

its not though if you're working in a massive codebase or on a distributed system that has many interconnected parts.

skills that teach the agent how to pipe data, build requests, trace them through a system and datasources, then update code based on those results are a step function improvement in development.

ai has fundamentally changed how productive i am working on a 10m line codebase, and i'd guess less than 5% of that is due to code gen thats intended to go to prod. Nearly all of it is the ability to rapidly build tools and toolchains to test and verify what i'm doing.

▲

sillysaurusx an hour ago | parent [-]

But... plain Claude does that. At least for my codebase, which is nowhere close to your 10m line. But we do processing on lots of data (~100TB) and Claude definitely builds one-off tools and scripts to analyze it, which works pretty great in my experience.

What sort of skills are you referring to?

	▲	jmalicki 20 minutes ago \| parent [-]
		If you build up and save some of those scripts, skills help Claude remember how and when to use them. Skills are crazy useful to tell Claude how to debug your particular project, especially when you have a library of useful scripts for doing so.

▲

thisrobot an hour ago | parent | prev | next [-]

This resonates with me. Sometimes I build up some artifacts within the context of a task, but these almost always get thrown away. There are primarily three reason I prefer a vanilla setup.

1. I have many and sometimes contradictory workflows: exploration, prototyping, bug fixing debugging, feature work, pr management, etc. When I'm prototyping, I want reward hacking, I don't care about tests or lint's, and it's the exact opposite when I manage prs.

2. I see hard to explain and quantify problems with over configuration. The quality goes down, it loses track faster, it gets caught in loops. This is totally anecdotal, but I've seen it across a number of projects. My hypothesis is that is related to attention, specifically since these get added to the system prompt, they pull the distribution by constantly being attended to.

3. The models keep getting better. Similar to 2, sometime model gains are canceled out by previously necessary instructions. I hear the anthropic folks clear their claude.md every 30 days or so to alleviate this.

	▲	adshotco 22 minutes ago \| parent [-]
		[dead]

▲

obsidianbases1 3 hours ago | parent | prev | next [-]

Lots of money being made by luring people into this trap.

The reality is that if you actually know what you want, and can communicate it well (where the productivity app can be helpful), then you can do a lot with AI.

My experience is that most people don't actually know what they want. Or they don't understand what goes into what they want. Asking for a plan is a shortcut to gaining that understanding.

▲

pfortuny 3 hours ago | parent | prev | next [-]

Emacs init file bikeshedding comes to mind…

▲

wilkystyle 3 hours ago | parent [-]

but now you can build your AI agent toolkit to work on your init file for you

	▲	dotancohen an hour ago \| parent [-]
		My init.el file went from some 300 lines to under 50 with Claude's assistance. Some of that had to do with updating Emacs, but I really only use Emacs for Org mode so that contribution was minimal.

▲

ctoth 2 hours ago | parent | prev | next [-]

> Plain Claude, ask it to write a plan, review plan, then tell it to execute still works the best in my experience.

Working on an unspecified codebase of unknown size using unconfigured tooling with unstated goals found that less configuration worked better than more.

▲

sockgrant 2 hours ago | parent | prev | next [-]

if you work on platforms, frameworks, tools that are public knowledge, then yeah. If there’s nothing unique to your project or how to write code in it, build it, deploy it, operate it, yeah.

But for some projects there will be things Claude doesn’t know about, or things that you repeatedly want done a specific way and don’t want to type it in every prompt.

▲

dominotw 2 hours ago | parent | prev [-]

example https://news.ycombinator.com/item?id=47501214