Remix clone Hacker News

	▲	minimaxir 6 days ago
		ModernBERT may be a better base model if finetuning a model for a specific use case: https://huggingface.co/blog/modernbert
	▲	diwank 6 days ago
		Also, Ettin is a new favorite and a solid alternative: https://huggingface.co/jhu-clsp/ettin-encoder-1b I'd encourage you to give SetFit a try: aggressively deduplicate your training set, find the top ~2500 clusters per label, and use SetFit to train a multilabel classifier on those. Either way, would love to know what worked for you! :)
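		For anyone wanting to try the dedup-and-cluster step, here is a rough stdlib-only sketch of one way it could look. Everything in it is an assumption for illustration, not the pipeline described above: the names `normalize`, `jaccard`, and `select_examples` are made up, clustering is a greedy token-overlap pass rather than embedding-based clustering, and the `threshold` value is arbitrary. Multilabel data is assumed to arrive as one `(text, label)` pair per label.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, strip punctuation, and collapse whitespace for duplicate detection."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", text.lower()).split())


def jaccard(a, b):
    """Token-set overlap between two sets, in [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_examples(rows, top_k=2500, threshold=0.5):
    """rows: iterable of (text, label) pairs. Returns {label: [representative texts]}."""
    # 1) Drop duplicates (after normalization) within each label.
    seen = set()
    by_label = defaultdict(list)
    for text, label in rows:
        key = (normalize(text), label)
        if key not in seen:
            seen.add(key)
            by_label[label].append(text)

    selected = {}
    for label, texts in by_label.items():
        # 2) Greedy clustering: a text joins the first cluster whose seed
        #    shares enough tokens with it, otherwise it starts a new cluster.
        clusters = []  # each entry: (seed_tokens, [member texts])
        for text in texts:
            tokens = set(normalize(text).split())
            for seed_tokens, members in clusters:
                if jaccard(tokens, seed_tokens) >= threshold:
                    members.append(text)
                    break
            else:
                clusters.append((tokens, [text]))
        # 3) Keep the top_k largest clusters, one representative (the seed) each.
        clusters.sort(key=lambda c: len(c[1]), reverse=True)
        selected[label] = [members[0] for _, members in clusters[:top_k]]
    return selected
```

		The per-label lists that come out of `select_examples` would then be the few-shot training set handed to SetFit. In practice you would probably swap the token-overlap clustering for clustering over sentence embeddings; the greedy pass here is quadratic in the worst case and only meant to show the shape of the step.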