jmward01 a day ago

Programmatic Tool Calling has been an obvious next step for a while. It is clear we are heading towards code as a language for LLMs, so defining that language is very important. But I'm not convinced of tool search. Good context engineering leaves only the tools you will need in context, so adding a search step when you are going to use all of them is just more overhead. What is needed is a more compact tool definition language like, I don't know, every programming language ever in how they define functions. We also need objects (which hopefully Programmatic Tool Calling solves, or the next version will solve). In the end I want to drop objects into context with exposed methods, and it knows the type and what is callable on the type.

fny a day ago | parent | next [-]

Why exactly do we need a new language? The agents I write get access to a subset of the Python SDK (i.e. non-destructive), packages, and custom functions. All this ceremony around tools and pseudo-RPC seems pointless given LLMs are extremely capable of assembling code by themselves.
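A minimal sketch of the curated-namespace approach described above: LLM-generated code runs with only vetted, non-destructive callables in scope. `fetch_orders` and the whitelist are made up for illustration, and restricting the namespace is not a real sandbox (dunder-based escapes still exist) — it just shows the shape of the idea.

```python
def fetch_orders(customer_id: str) -> list:
    # Hypothetical stand-in for a read-only SDK call.
    return [{"id": "o-1", "customer": customer_id, "total": 42.0}]

SAFE_GLOBALS = {
    # Minimal builtins whitelist: no open(), no __import__, no exec.
    "__builtins__": {"len": len, "sum": sum, "sorted": sorted, "print": print},
    "fetch_orders": fetch_orders,
}

def run_agent_code(code: str) -> dict:
    """Execute model-written code and return the names it defined."""
    local_ns = {}
    exec(code, SAFE_GLOBALS, local_ns)
    return local_ns

ns = run_agent_code(
    "orders = fetch_orders('c-7')\n"
    "total = sum(o['total'] for o in orders)"
)
print(ns["total"])  # 42.0
```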

delaminator 14 hours ago | parent | next [-]

I'm imagining something more like Rexx with quite high level commands. But that certainly blurs the line between programming language and shell.

The reason for choosing higher-level constructs is token use. We certainly reduce the number of tokens by using a shell-like command language, but of course that also reduces expressiveness.

I've been meaning to get round to a Plan 9 style where the LLM reads and writes from files rather than running commands. I'm not sure whether that's going to be more useful than just running commands. It might be, for an end user, because they only have to think about one paradigm - reading/writing files.

never_inline 18 hours ago | parent | prev | next [-]

Does this "non destructive subset of python SDK" exist today, without needing to bring, say, a whole webassembly runtime?

I am hoping something like CEL (with verifiable runtime guarantees) but the syntax being a subset of Python.
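For expressions specifically, a rough approximation can be built from the stdlib: parse with `ast` and reject every node type outside a whitelist before evaluating. This is a sketch of the idea, not CEL's actual guarantees (no cost or termination bounds, expressions only), and the whitelist here is illustrative:

```python
import ast

# Allow only side-effect-free expression syntax: literals, names,
# arithmetic/boolean/comparison operators, and subscripting.
ALLOWED = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare,
           ast.Name, ast.Load, ast.Constant, ast.Subscript,
           ast.operator, ast.unaryop, ast.boolop, ast.cmpop)

def safe_eval(expr: str, variables: dict):
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
    return eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}}, variables)

print(safe_eval("price * qty + 1", {"price": 3, "qty": 4}))  # 13
# safe_eval("__import__('os')", {}) raises ValueError: Call is not allowed
```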

FridgeSeal a day ago | parent | prev [-]

Woah woah woah, you’re ignoring a whole revenue stream caused by deliberately complicating the ecosystem, and then selling tools and consulting to “make it simpler”!

Think of all the new yachts our mega-rich tech-bros could have by doing this!

zeroq a day ago | parent | next [-]

my VS fork brings all the boys to the yard and they're like it's better than yours, damn right, it's better than yours

rekttrader 19 hours ago | parent | next [-]

I can teach you, but I’ll have to charge

checker659 21 hours ago | parent | prev [-]

This is the most creative comment I've read on HN as of late.

DANmode 19 hours ago | parent | next [-]

…don’t read many comments?

CSSer 17 hours ago | parent [-]

A social lesson: don't yuck other people's yum.

DANmode 15 minutes ago | parent [-]

If your “yum” lowers the quality of a community I’m in, welp,

zeroq 20 hours ago | parent | prev [-]

<3

Thanks, most of the time when I do that people tell me to stop being silly and stop saying nonsense.

¯\_(ツ)_/¯

ElasticBottle 2 hours ago | parent [-]

don't listen to the naysayers! Had a chuckle as well LoL

DANmode 14 minutes ago | parent [-]

The humor wasn't what was naysaid.

Someone said it was a creative comment.

I blew air through my nose, too,

but let’s be real.

dalemhurley 16 hours ago | parent | prev [-]

Tool search is formalising what a lot of teams have been working towards. I had previously called it a tool caller: the LLM knew there were tools for certain domains, and when a domain was mentioned, the tools for that domain would be loaded. This looks a bit smarter.

mirekrusin a day ago | parent | prev | next [-]

Exactly, instead of this mess, you could just give it something like .d.ts.

Easy to maintain, test etc. - like any other library/code.

You want structure? Just export * as Foo from '@foo/foo' and let it read .d.ts for '@foo/foo' if it needs to.

But wait, it's also good at writing code. Give it write access to it then.

Now it can talk to sql server, grpc, graphql, rest, jsonrpc over websocket, or whatever ie. your usb.

If it needs some tool, it can import or write it itself.

The next realisation may be that a jupyter/pluto/mathematica/observable-style but more book-like ai<->human interaction platform works best for the communication itself (too much raw text - it'd take you days to comprehend what it spat out in 5 minutes; better to have summary pictures, interactive charts, whatever).

With voice-to-text because poking at flat squares in all of this feels primitive.

For improved performance you can peer it with other sessions (within your team, or global/public) - surely others have solved problems similar to yours, and you can grab their ready solutions.

It already has the ability to create a tool that copies itself and can talk to the copy, so it's fair to call this system "skynet".

cjmcqueen 13 hours ago | parent [-]

Skynet is exactly where I thought this was heading...

zbowling 5 hours ago | parent | prev | next [-]

I specifically built this as an MCP server. It works like an MCP server that proxies to other MCP servers, converts the tool definitions into TypeScript annotations, and asks your LLM to generate TypeScript that runs in a restricted VM to make tool calls that way. It's based on the Apple white paper on this topic from last year. https://github.com/zbowling/mcpcodeserver

menix a day ago | parent | prev | next [-]

The latest MCP specifications (2025-06-18+) introduced crucial enhancements like support for Structured Content and the Output Schema.

Smolagents makes use of this and handles tool output as objects (e.g. dict). Is this what you are thinking about?

Details in a blog post here: https://huggingface.co/blog/llchahn/ai-agents-output-schema
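Concretely, under the 2025-06-18 revision a tool can declare an `outputSchema` and return a matching `structuredContent` object alongside the usual text content, so a client can hand the model a real dict instead of a string. The field names below follow the spec; the tool and values are made up:

```python
# Hypothetical tool definition carrying an output schema (JSON Schema).
tool = {
    "name": "get_weather",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    "outputSchema": {
        "type": "object",
        "properties": {
            "temperature": {"type": "number"},
            "conditions": {"type": "string"},
        },
        "required": ["temperature", "conditions"],
    },
}

# A conforming result carries the object itself, not just stringified text.
result = {
    "content": [{"type": "text",
                 "text": '{"temperature": 19.5, "conditions": "cloudy"}'}],
    "structuredContent": {"temperature": 19.5, "conditions": "cloudy"},
}

# The client can now treat the tool output as a plain dict:
print(result["structuredContent"]["temperature"])  # 19.5
```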

jmward01 a day ago | parent [-]

We just need simple language syntax like Python, and for models to be trained on it (which they already mostly are):

    class MyClass(SomeOtherClass):
        def my_func(a: str, b: int) -> int:
            # Put the description (if needed) in the body for the LLM.

That is way more compact than the JSON schema out there. Then you can have 'available objects' listed like o1 (MyClass), o2 (SomeOtherClass) as the starting context. Combine this with programmatic tool calling and there you go. Much, much more compact. It binds well to actual code and is very flexible. This is the obvious direction things are going. I just wish Anthropic and OpenAI would realize it and define it/train models to it sooner rather than later.

edit: I should also add that inline responses should be part of this too: the model should be able to emit ```<code here>``` and keep generating, with only blocking calls requiring it to stop generating until the block frees up. So, for instance, the model could run ```r = start_task(some task)```, generate other things, then ```print(r.value())``` (probably with various awaits and the like here, but you all get the point).
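One way to see how compact this gets: the listing for an 'available object' can be generated mechanically from a live instance with `inspect`. The classes and render format below are illustrative, not any vendor's actual protocol:

```python
import inspect

class SomeOtherClass:
    def close(self) -> None:
        """Release resources."""

class MyClass(SomeOtherClass):
    def my_func(self, a: str, b: int) -> int:
        """Put the description (if needed) in the body for the LLM."""
        return b

def render(name: str, obj) -> str:
    """Render an object as a compact, Python-like context entry."""
    cls = type(obj)
    lines = [f"{name} ({cls.__name__}):"]
    for meth_name, fn in inspect.getmembers(cls, inspect.isfunction):
        doc = (fn.__doc__ or "").strip()
        lines.append(f"  def {meth_name}{inspect.signature(fn)}  # {doc}")
    return "\n".join(lines)

print(render("o1", MyClass()))
# o1 (MyClass):
#   def close(self) -> None  # Release resources.
#   def my_func(self, a: str, b: int) -> int  # Put the description (if needed) in the body for the LLM.
```

A handful of lines per object versus hundreds of tokens of JSON schema for the same information.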

schmuhblaster 12 hours ago | parent | prev | next [-]

I've been experimenting with giving the LLM a Prolog-based DSL, used in a CodeAct style pattern similar to Huggingface's smolagents. The DSL can be used to orchestrate several tools (MCP or built in) and LLM prompts. It's still very experimental, but a lot of fun to work with. See here: https://github.com/deepclause/deepclause-desktop.

ctoth a day ago | parent | prev | next [-]

I'm not sure that we need a new language so much as just primitives from AI gamedev, like behavior trees along with the core agentic loop.

sandbags 16 hours ago | parent [-]

After implementing a behaviour tree library and realising the power of select & sequence, I found myself wondering why they aren't used more widely.

I’ve never done anything in crypto but watched in horror as people created immutable contracts with essentially Javascript programs. Surely it would be much easier to reason about/verify scripts written as a behaviour tree with a library of queries and actions. Even being able to limit the scope of modifications would be a win.
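For reference, select and sequence are small enough to sketch in a few lines. The combinator API here is made up, but the semantics (sequence = every child must succeed in order, select = first success wins) are the standard gamedev ones:

```python
def sequence(*children):
    # Succeeds only if every child succeeds, in order (short-circuits).
    def run(state):
        return all(child(state) for child in children)
    return run

def select(*children):
    # Tries children in order; succeeds on the first success.
    def run(state):
        return any(child(state) for child in children)
    return run

# Toy leaves: callables that read/mutate shared state and report success.
def has_key(state): return state.get("key", False)
def unlock(state): state["door"] = "open"; return True
def smash(state): state["door"] = "broken"; return True

# "Unlock if you have the key, otherwise smash."
open_door = select(sequence(has_key, unlock), smash)

state = {"key": False}
open_door(state)
print(state["door"])  # broken
```

The appeal for contracts is that the tree is data: you can enumerate exactly which queries and actions a given tree can ever reach before deploying it.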

delaminator 14 hours ago | parent [-]

Seeing as it was your inspiration, here is a summary of a discussion with Claude on this topic. (not the crypto part.)

https://claude.ai/public/artifacts/2b23f156-c9b5-42df-9a83-f...

vendiddy 15 hours ago | parent | prev | next [-]

Giving the AI an actual programming language (functions + objects) genuinely does seem like a good alternative to the MCP mess we have right now.

stingraycharles a day ago | parent | prev | next [-]

Reminds me a bit of the problem that GraphQL solves for the frontend, which avoids a lot of round-trips between client and server and enables more processing to be done on the server before returning the result.

politelemon 21 hours ago | parent [-]

And introduce a new set of problems in doing so.

malnourish 12 hours ago | parent [-]

Complexity doesn't go away, it just moves somewhere else.

knowsuchagency a day ago | parent | prev | next [-]

I completely agree. I wrote an implementation of this exact idea a couple weeks ago https://github.com/Orange-County-AI/MCP-DSL

user3939382 a day ago | parent | prev [-]

Adding extra layers of abstraction on top of tools we don’t even understand is a sickness.