jdauriemma 18 hours ago

> they're performing at at least graduate student level across most tasks

I strongly disagree with this characterization. I have yet to find an application that can reliably execute this prompt:

"Find 90 minutes on my calendar in the next four weeks and book a table at my favorite Thai restaurant for two, outside if available."

Forget "graduate-level work"; that's the stuff I actually want to engage with myself. What many people really need help with is basic administrative assistance, and LLMs are way too unpredictable for those use cases.

DanMcInerney 16 hours ago

This is absolutely doable right now. Just hook Claude Code up to your calendar MCP server and any one of these restaurant or web browser MCP servers, and it'll do this for you.

https://apify.com/canadesk/opentable/api/mcp
https://github.com/BrowserMCP/mcp
https://github.com/samwang0723/mcp-booking

babelfish 17 hours ago

OpenAI Operator can do that task easily, assuming you've configured it with your calendar and Yelp login.

jdauriemma 17 hours ago

That's great to hear - do you know what success rate it might have? I've used scheduled tasks in ChatGPT and they fail regularly enough to fall into the "toy" category for me. But if Operator is operating significantly above that threshold, that would be remarkable and I'd gladly eat my words.