wxw 5 hours ago

> We’re releasing ten ready-to-run agent templates for the most time-consuming work in financial services

The templates being: pitch builder, meeting preparer, earnings reviewer, model builder, market researcher, valuation reviewer, general ledger reconciler, month-end closer, statement auditor, KYC (Know Your Customer) screener.

Seems pretty scattershot. Reminds me of GPT Store.

order-matters 5 hours ago | parent | next [-]

the details are key here. there is plenty of automatable financial work, sure, but when it comes to reporting finances/costs (formally or informally) and having a real human being be accountable for them, you REALLY need to trust that nothing is hallucinated.

Any idea how they ensure this doesn't happen? As in, how can a user verify that the model did not touch any of the numbers and only built pipelines for them?

what I've been telling my CFO, who wants to get AI involved in things, is that for a lot of accounting and finance work "trust but verify" doesn't work, because verifying is often the same process as doing the work.

infecto 4 hours ago | parent | next [-]

To be honest, I am having a hard time remembering the last time an LLM hallucinated in our pipelines. It makes mistakes, sure, but it doesn't make things up. For a daily recon process this is a solved problem imo.

fnordpiglet 2 hours ago | parent | next [-]

I see it hallucinate quite often in development, but mostly in small details that are automatically corrected by lint processes. Large-scale hallucination seems better guarded against, though I suspect that's because latitude is constrained by context and by harnesses (lint, type systems, fine-tuned tool flows in coding models) that control for divergence. I would still classify mistakes like wrong variable names, package names, or signatures as hallucinations.

KellyCriterion 3 hours ago | parent | prev [-]

Curious! Could you elaborate a little on your pipeline? We are currently looking to solve this for our internal processes, which deal with lots of financial information from outside sources: masses of numbers in annual reports, bank statements, balance sheets, etc.

tyre 2 hours ago | parent [-]

Not who you’re replying to, but I can give some thoughts.

For anything involving math, it’s much more reliable to give agents tools. So if you want to verify that your real estate offer is in the 90–95th percentile of offerings in the past three months, don’t hand Claude the data and ask it to calculate; offload that to a tool that can query Postgres.
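A minimal sketch of that pattern: the "tool" is an ordinary deterministic function over SQL, and the agent only decides when to call it. Everything here is hypothetical (the `offers` schema, the synthetic amounts), and sqlite3 stands in for Postgres so the example is self-contained.

```python
import sqlite3

# Hypothetical schema: offers(amount REAL, created_at TEXT).
# In production this would be a Postgres table of recent offers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offers (amount REAL, created_at TEXT)")
conn.executemany(
    "INSERT INTO offers VALUES (?, ?)",
    [(100_000 + 10_000 * i, "2024-05-01") for i in range(100)],
)

def percentile_rank(amount: float) -> float:
    """Exact percentile rank of `amount` among stored offers.

    This is the deterministic tool the agent calls instead of
    doing arithmetic inside its own context window.
    """
    (below,) = conn.execute(
        "SELECT COUNT(*) FROM offers WHERE amount <= ?", (amount,)
    ).fetchone()
    (total,) = conn.execute("SELECT COUNT(*) FROM offers").fetchone()
    return 100.0 * below / total

rank = percentile_rank(1_040_000)
print(f"offer sits at the {rank:.0f}th percentile")  # computed by SQL, not the LLM
```

The LLM's output is then just a restatement of the tool's result, so there is nothing numeric for it to hallucinate.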

The same goes for anything needing data from an external source of truth. For example, what payers (insurance companies) reimburse for a specific CPT code (medical procedure) can change at any time and may differ between today and when the service was provided two months ago. Have a tool that farms out the calculation, and that tool in turn uses a database or similar to pull the rate data.
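The key detail in that example is the effective date: the tool must return the rate in force on the date of service, not today's rate. A sketch under invented data (the payer, CPT code, and rates are all made up; a real tool would query a rate table):

```python
import bisect
from datetime import date

# Hypothetical effective-dated rates for one payer/CPT pair:
# (effective_from, rate), sorted by date. Real data lives in a DB.
RATES = [
    (date(2024, 1, 1), 92.50),
    (date(2024, 6, 1), 97.25),
]

def rate_on(service_date: date) -> float:
    """Rate in effect on the date of service, not today's rate."""
    effective_dates = [d for d, _ in RATES]
    i = bisect.bisect_right(effective_dates, service_date) - 1
    if i < 0:
        raise ValueError("no rate effective on that date")
    return RATES[i][1]

print(rate_on(date(2024, 5, 15)))  # 92.5  -- rate when the service was provided
print(rate_on(date(2024, 7, 1)))   # 97.25 -- today's rate differs
```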

The LLM can orchestrate and figure out what needs to be done, like a human would, but anything beyond that is either scary (doing the math itself) or expensive (burning context to constantly pull documentation).

tomrod 4 hours ago | parent | prev [-]

> Any idea how they ensure this doesnt happen?

Build a deterministic query set and automate it for monthly or daily reporting reconciliation.

Leave AI out of it.
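That kind of deterministic recon is a short script: fixed queries, exact comparison, flag the breaks. A toy sketch with invented ledger/bank rows (real inputs would come from your reporting queries):

```python
from collections import defaultdict

# Hypothetical inputs: (account, amount) rows pulled by fixed queries
# from the general ledger and the bank statement feed.
ledger = [("acct-1", 120.00), ("acct-1", -20.00), ("acct-2", 50.00)]
bank   = [("acct-1", 100.00), ("acct-2", 55.00)]

def totals(rows):
    out = defaultdict(float)
    for acct, amt in rows:
        out[acct] += amt
    return out

def reconcile(ledger_rows, bank_rows, tol=0.005):
    """Return only the accounts whose totals disagree beyond `tol`."""
    l, b = totals(ledger_rows), totals(bank_rows)
    return {
        acct: (l.get(acct, 0.0), b.get(acct, 0.0))
        for acct in set(l) | set(b)
        if abs(l.get(acct, 0.0) - b.get(acct, 0.0)) > tol
    }

print(reconcile(ledger, bank))  # {'acct-2': (50.0, 55.0)} -- the one break
```

Same queries, same answer, every run; no model in the loop to second-guess.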

GCUMstlyHarmls 5 hours ago | parent | prev | next [-]

I'll be honest, I thought the first few items on your list of time-consuming work were sarcasm.

moregrist 3 hours ago | parent [-]

A recent episode of Matt Levine’s podcast (Money Stuff) covered this: apparently investment bankers spend a huge amount of time preparing pitch decks for companies that don’t want them. And Claude, it turns out, is quite good at making a pitch deck that no one but your boss wants or cares about.

I feel like there’s a metaphor in there... maybe I’ll ask Claude about it.

infecto 5 hours ago | parent | prev | next [-]

Reads differently to me. These are examples to go run with and build your own. They cover cases from the investment side and then the obvious ones from an accounting perspective. It would be highly surprising if any of these were used in production without modification. I am sure it will happen, but to me the intent is to take these and adapt them to your own process.

rubenflamshep 5 hours ago | parent | prev [-]

I find all of these .md files released by the labs to be AI-generated slop. The only exception is maybe the /simplify command.

subscribed 10 minutes ago | parent | next [-]

"Claude, build me 50 skills an Account Analyst would find useful, then run them through the agent at maxxxx thinking and ship the top 10 of them"

My money's on that.

sothatsit 26 minutes ago | parent | prev | next [-]

It still surprises me how effective the /simplify skill is.

I’ve also had some great results with a /reflect skill that asks the agent to look at the work in the broader context of the project. But those are the only two skills I use regularly that aren’t specific to our company, codebase, or tools.

tantalor 4 hours ago | parent | prev [-]

No surprise there. Of course the skill files are not human written.

The AI is an expert in both following and generating prompts.

sumeno 34 minutes ago | parent [-]

Why do you think it is an expert in generating prompts? It has no more insight into how it works internally than anyone else.