perching_aix 6 days ago

Is it time for me to finally package a language model into my Lambda deployment zips and cut through the corporate red tape at my place around AI use?

Update #1:

Tried it. Well, dreams dashed - it would now fit space-wise (<250 MB despite the name), but it sadly really doesn't seem to work for my specific prospective workload.

I'd have wanted it to perform natural-language to command-invocation translation (or better, emit me some JSON), but it's really not willing to do that, at least not with the lame approach I'm taking (literally just prompting it to). Oh well.
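For what it's worth, even with plain prompting, small models often wrap the JSON in chatter ("Sure! Here you go: {...}"), so `json.loads` on the whole reply fails. A stdlib-only sketch of salvaging the embedded object (the example reply is hypothetical):

```python
import json

def extract_json(text):
    """Scan a model reply for the first parseable JSON object.

    raw_decode at each candidate brace recovers an embedded object
    even when the reply contains surrounding chatter.
    """
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch == "{":
            try:
                obj, _ = decoder.raw_decode(text, i)
                return obj
            except json.JSONDecodeError:
                continue
    return None  # no valid JSON object found anywhere in the reply

# Hypothetical chatty model reply:
reply = 'Sure! Here is the command: {"command": "ls", "args": ["-la"]}'
print(extract_json(reply))  # -> {'command': 'ls', 'args': ['-la']}
```

This only papers over the problem, of course - it does nothing if the model never emits valid JSON at all.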

Update #2:

Just found out about grammar-constrained decoding, so maybe there's still hope for me in the end. I don't think I can amend this comment with any more updates today, but we'll see.
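The core idea of grammar-constrained decoding: at each step, only sample from tokens the grammar allows, so the output is valid by construction no matter what the model prefers. A stdlib-only toy sketch (the character-level "grammar" and the chatty "model" are both made up for illustration):

```python
import string

def constrained_decode(rank_tokens, allowed_next, max_steps=50):
    """Greedy grammar-constrained decoding sketch.

    rank_tokens(prefix)  -> candidate tokens, best first (stand-in
                            for the model's logits at this step).
    allowed_next(prefix) -> tokens the grammar permits here, or an
                            empty set when the string is complete.
    """
    out = ""
    for _ in range(max_steps):
        allowed = allowed_next(out)
        if not allowed:
            break  # grammar says the output is complete
        pick = next((t for t in rank_tokens(out) if t in allowed), None)
        if pick is None:               # model offers nothing legal,
            pick = sorted(allowed)[0]  # so force a grammar-legal token
        out += pick
    return out

# Toy "grammar": output must match {"cmd": "<lowercase letters>"}
TEMPLATE = '{"cmd": "'

def allowed_next(prefix):
    if len(prefix) < len(TEMPLATE):   # still inside the fixed skeleton
        return {TEMPLATE[len(prefix)]}
    if prefix.endswith('"}'):         # object closed: stop
        return set()
    # inside the value: more letters, or close the string and object
    return set(string.ascii_lowercase) | {'"}'}

# Toy "model" that would rather chat than emit JSON:
def rank_tokens(prefix):
    body = prefix[len(TEMPLATE):] if len(prefix) >= len(TEMPLATE) else ""
    want = "reboot"
    if len(body) < len(want):
        return ["Sure", "!", want[len(body)]]  # chatter ranked first
    return ["Sure", '"}']

print(constrained_decode(rank_tokens, allowed_next))
# -> {"cmd": "reboot"}
```

Real implementations do the same thing at the logits level with a compiled grammar, but the intersect-then-pick loop is the whole trick.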

dmayle 6 days ago | parent [-]

Did you fine-tune it before trying? Docs here:

https://ai.google.dev/gemma/docs/core/huggingface_text_full_...
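Before fine-tuning you'd need a small dataset of (instruction, target) pairs for the translation task. A stdlib-only sketch of rendering such pairs as JSONL - the examples are hypothetical, and the exact record layout a given fine-tuning framework expects varies, so prompt/completion shown here is just one common shape:

```python
import json

# Hypothetical examples for the natural-language -> command-JSON task.
EXAMPLES = [
    ("list all files including hidden ones",
     {"command": "ls", "args": ["-la"]}),
    ("show disk usage in human readable form",
     {"command": "df", "args": ["-h"]}),
    ("print the current directory",
     {"command": "pwd", "args": []}),
]

def to_jsonl(examples):
    """Render (instruction, target) pairs as JSONL training records."""
    lines = []
    for instruction, target in examples:
        record = {
            "prompt": f"Translate to a command invocation: {instruction}",
            "completion": json.dumps(target),
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

print(to_jsonl(EXAMPLES))
```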

Workaccount2 6 days ago | parent | next [-]

How well does using a SOTA model for fine-tuning work? I'm sure people have tried.

perching_aix 6 days ago | parent | prev [-]

Thanks, will check that out as well tomorrow or during the weekend!

canyon289 6 days ago | parent [-]

If you know you want JSON for sure, constrained decoding in an inference framework will help. The model is just one part of an overall inference system. I hope this model, paired with other tools, helps you get done whatever it is you're looking to get done.
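As a concrete example of what an inference framework offers here, llama.cpp accepts GBNF grammars that constrain decoding. A sketch of one that would force the {"command": ..., "args": [...]} shape - the rule names and the exact command-object layout are illustrative, not from any official doc:

```
# Illustrative GBNF grammar (llama.cpp-style) forcing output of the
# shape {"command": "...", "args": ["...", ...]}
root   ::= "{" ws "\"command\"" ws ":" ws string ws "," ws "\"args\"" ws ":" ws array ws "}"
array  ::= "[" ws (string (ws "," ws string)*)? ws "]"
string ::= "\"" [a-zA-Z0-9_. -]* "\""
ws     ::= [ \t\n]*
```

Other stacks expose the same capability differently, e.g. as JSON-schema-guided generation.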