mock-possum 7 hours ago
> we’re not there yet, in part because of how much more powerful connected frontier models are

Is that why though? You need a beast of a machine to run a functional local model in my experience. I think the big part is there’s significant sticker shock to buying capable hardware. That said,

> weekend. I chose to try fine-tuning on two models, Llama 3.1 8B Instruct and Qwen 2.5 7B Instruct. At their size (around 8B) they run comfortably on a MacBook Air

Perhaps I spoke too soon? Anyway,

> I chose the Microsoft collection as the source of training materials. The collection contains out-of-print docs published between 1977 and 2005: more than 37 million words, covering old systems and SDKs

this strikes me as a very specific brand of 1995’s prose, spanning about 30 years. It’s a cool article though, so maybe that’s a forgivably clickbaity title.
OJFord 7 hours ago
> this strikes me as a very specific brand of 1995’s prose, spanning about 30 years.

It's probably a fair approach to say that the significant influence (the training dataset) on writing at a particular time is the preceding 30 years' material? It's certainly not only what was already written that year (nor anything since).
mschild 7 hours ago
Running models locally is surprisingly easy and possible even on older hardware. Obviously not the largest, most up-to-date models, but for what I expect most people use them for, even on HN, there are some shockingly good models that don't require €4k machines. I have a desktop with an AMD 6900XT and 5600 with 32GB RAM. Obviously no slouch, but it's several years old at this point. I can comfortably run Qwen 3.5 9B and get a speedy 60 tokens/sec output with decent results.