ahamilton454 6 hours ago

Yeah exactly, I literally don’t know how to change my spec until I’ve gathered more data.

I was building a transaction classifier recently and initially thought it would be a trivial “solved” problem: throw transactions into a tiny local LLM and let it classify. But that approach was too slow and not accurate enough. I didn’t know that until I tried it, though, and then I needed to change the spec.

woadwarrior01 4 hours ago

You'd probably get much further by fine-tuning a small BERT-style encoder as the classifier. IMO, even something as simple as training a linear classifier on the [CLS] token embeddings from a frozen encoder might work.
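A minimal sketch of the "linear classifier on frozen embeddings" idea, assuming NumPy only. The `embed` function is a hypothetical stand-in (a hashed bag of words) so the example runs offline; in practice it would be replaced by the [CLS] embedding from a real frozen encoder. The labels and transaction strings are made up for illustration.

```python
import zlib
import numpy as np

DIM = 256  # size of the stand-in embedding (hypothetical choice)

def embed(text: str) -> np.ndarray:
    """Stand-in for a frozen encoder: a hashed bag-of-words vector.
    Swap this for real [CLS] embeddings; it is never trained here."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def train_linear_classifier(X, y, n_classes, lr=0.5, epochs=300):
    """Softmax regression trained with full-batch gradient descent.
    Only W and b are learned; the embeddings stay fixed."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - Y) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

def predict(texts, W, b):
    """Return the predicted class index for each text."""
    X = np.stack([embed(t) for t in texts])
    return (X @ W + b).argmax(axis=1)

# Toy, made-up data: (transaction description, label index)
LABELS = ["groceries", "transport", "dining"]
TRAIN = [
    ("whole foods market purchase", 0),
    ("trader joes grocery store", 0),
    ("safeway market grocery", 0),
    ("uber trip ride", 1),
    ("lyft ride fare", 1),
    ("metro transit fare", 1),
    ("pizza restaurant dinner", 2),
    ("sushi restaurant lunch", 2),
    ("cafe lunch order", 2),
]

X_train = np.stack([embed(text) for text, _ in TRAIN])
y_train = np.array([label for _, label in TRAIN])
W, b = train_linear_classifier(X_train, y_train, n_classes=len(LABELS))
train_preds = predict([text for text, _ in TRAIN], W, b)
print("train accuracy:", (train_preds == y_train).mean())
```

Because the encoder is frozen, embeddings can be computed once and cached, so trying different heads or label sets only costs a quick retrain of `W` and `b`.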

ahamilton454 3 hours ago

Yeah, I’ve tried a bi-encoder, a cross-encoder, and some small LLMs so far. I think I’ll try BERT soon too.

spwa4 6 hours ago

So wait ... you're not even going to train based on what you want, just "throw into"? Did you actually put in work on a very clear and accurate prompt with a full manual on what to do?

ahamilton454 3 hours ago

Throwing a tiny little LLM at it helped me see that it was far too slow to reasonably use at the scale I needed, so it didn’t really matter how accurate the prompt was. I was mostly pointing out that I didn’t know if it would be too slow without trying it. In retrospect I maybe could have done some simple math, but trying it out was easy enough.
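A rough sketch of what that kind of back-of-envelope check could look like. The function name, the item count, and the per-item latency are all placeholders for illustration, not numbers from this thread.

```python
def estimated_hours(n_items: int, seconds_per_item: float, workers: int = 1) -> float:
    """Wall-clock hours to process n_items at a fixed per-item latency,
    assuming work splits evenly across `workers` with no overhead."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return n_items * seconds_per_item / workers / 3600

# Hypothetical inputs: 1M transactions at 0.5 s per LLM call, one worker.
hours = estimated_hours(n_items=1_000_000, seconds_per_item=0.5)
print(f"{hours:.1f} hours")  # prints "138.9 hours"
```

Measuring `seconds_per_item` on a few dozen real calls and plugging it in gives a rough feasibility estimate before committing to a full run.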

Exoristos 6 hours ago

Not every lottery winner has a detailed strategy.