If you are going to go to the bother of fine-tuning for trivial problems like subject classification, then I think you'll find scikit-learn with an SGDClassifier on 2-grams will probably do just as well, and the trained classifier will be under 1MB.

You can train it in under a minute, and it will work perfectly well on embedded devices.
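
To make that concrete, here is a minimal sketch of the kind of pipeline I mean. The toy texts and labels are invented for illustration, and I'm assuming word unigrams plus bigrams (`ngram_range=(1, 2)`) with TF-IDF weighting and default SGDClassifier settings; a real setup would want tuning and a held-out test set.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Toy training data, invented for illustration only.
texts = [
    "the team won the championship game",
    "the striker scored a late goal",
    "the coach praised the defense after the match",
    "the new phone has a faster processor",
    "the laptop ships with more memory",
    "the software update fixes a security bug",
]
labels = ["sports", "sports", "sports", "tech", "tech", "tech"]


def build_classifier():
    """Word 1-gram and 2-gram TF-IDF features feeding a linear SGD classifier."""
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        SGDClassifier(random_state=0),  # default hinge loss, i.e. a linear SVM
    )


clf = build_classifier()
clf.fit(texts, labels)

# Predict subjects for unseen text.
predictions = clf.predict(["the goalkeeper saved a penalty", "the processor update"])

# Serialized size of the whole pipeline (vocabulary plus weights), in bytes.
model_bytes = len(pickle.dumps(clf))
print(list(predictions), model_bytes)
```

The pickled size grows with the vocabulary, so on a real corpus it is worth capping it with `max_features` or `min_df` on the vectorizer if you care about the footprint.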

Small LLMs are good choices for text classification in two cases:

- If you need to provide in-context examples and classify based on them.

- Your classification goes beyond simple subject-type classifiers. For example, multiple-choice question answering is a classification task where a small LLM will work but traditional ML methods won't.
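
For the in-context case, a rough sketch of what I have in mind: build a few-shot prompt from labeled examples and map the model's completion back onto the allowed labels. The function names `build_few_shot_prompt` and `parse_label` and the prompt layout are hypothetical, and the actual model call is left out since it depends on whatever runtime you use.

```python
def build_few_shot_prompt(examples, labels, query):
    """Build a few-shot classification prompt (hypothetical layout).

    examples: list of (text, label) pairs shown to the model as demonstrations.
    labels:   the allowed label names.
    query:    the text to classify.
    """
    lines = ["Classify the text into one of: " + ", ".join(labels) + ".", ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Label:")
    return "\n".join(lines)


def parse_label(completion, labels):
    """Map a raw completion to one of the allowed labels, or None if no match."""
    stripped = completion.strip()
    first = stripped.split()[0].strip(".,:;").lower() if stripped else ""
    for label in labels:
        if label.lower() == first:
            return label
    return None


prompt = build_few_shot_prompt(
    [("the striker scored a late goal", "sports"),
     ("the laptop ships with more memory", "tech")],
    ["sports", "tech"],
    "the goalkeeper saved a penalty",
)
print(prompt)
```

Multiple-choice question answering fits the same shape: the answer options play the role of the labels, and the question goes where the text does.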

djsjajah 5 hours ago | parent | next [-]

Not with 800 examples. If you are going to consider an n-gram model, I think you are better off getting a frontier LLM to write you an absurd regex.

	nl 2 hours ago | parent [-]
		Hmm maybe. Turns out the author trained a logistic-regression classifier on the embeddings too, but didn't report the results: https://github.com/thelgevold/fine-tuned-classifier/blob/mai...

brokensegue 4 hours ago | parent | prev [-]

there are models between 2-grams and 600m param models that would be good options. i don't expect a 2-gram to do very well here. also i'm not sure why this model isn't a fine choice if it solves their problem

	throwa356262 3 hours ago | parent [-]
		What would you suggest instead?