| ▲ | xtiansimon 9 hours ago | |
Very good for a budget. And as text files, they rot at just about the same speed as the media. But what about basic Cost Of Goods Eaten? I have fading thermal tapes in boxes from grocery store purchases. They get scanned once a year into large PDFs: grocery, home goods, repairs (large purchases are kept separately so they're easier to find). I’m considering whether a personal AI subscription to manage the data interrogation is worth the cost (not excited about the $20/mo cost. NPR should get the next $5 of my monthly). Now here’s the funny part. The data sits in a box all year, or in PDFs for years, and gets little attention. What janky home server AI could I spin up to perform as badly as me (but no worse)? Maybe move the data in those text files and PDFs into SQLite? | ||
| ▲ | wongarsu 9 hours ago | |
If you just want to ingest varied data into a consistent format, qwen2.5vl:7b works well (in my use cases, better than qwen3vl). The ollama version is quantized, perfectly adequate, and runs on normal consumer hardware (even more so if you don't care about speeds that feel interactive). | ||
| ▲ | mrngm 7 hours ago | |
paperless-ngx and its built-in OCR might help here for the data transformation. | ||
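
For the SQLite idea raised at the top of the thread, here is a minimal sketch of what the landing table might look like once receipts have been OCR'd or extracted by a model. Everything here is an assumption for illustration: the `purchases` table, its columns, and the helper names `open_db`, `add_purchase`, and `yearly_totals` are hypothetical, and the store names in the usage note are made up. Amounts are stored as integer cents to avoid floating-point rounding.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the receipts database. Schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS purchases (
            id        INTEGER PRIMARY KEY,
            bought_on TEXT    NOT NULL,  -- ISO date, e.g. 2024-03-01
            store     TEXT    NOT NULL,
            category  TEXT    NOT NULL,  -- e.g. grocery, home goods, repairs
            item      TEXT    NOT NULL,
            cents     INTEGER NOT NULL   -- price in cents, not float dollars
        )
        """
    )
    return conn


def add_purchase(conn, bought_on, store, category, item, cents):
    """Insert one line item from a scanned receipt."""
    conn.execute(
        "INSERT INTO purchases (bought_on, store, category, item, cents) "
        "VALUES (?, ?, ?, ?, ?)",
        (bought_on, store, category, item, cents),
    )
    conn.commit()


def yearly_totals(conn, year):
    """Return {category: total_cents} for purchases dated in the given year."""
    rows = conn.execute(
        "SELECT category, SUM(cents) FROM purchases "
        "WHERE substr(bought_on, 1, 4) = ? "
        "GROUP BY category ORDER BY category",
        (str(year),),
    ).fetchall()
    return {category: total for category, total in rows}
```

Usage would be along the lines of `conn = open_db("receipts.db")`, then `add_purchase(conn, "2024-03-01", "Example Mart", "grocery", "milk", 399)` for each extracted line, and `yearly_totals(conn, 2024)` for the once-a-year look. A single flat table is about as janky as it gets, which seems to be the goal; splitting out receipts and line items could come later if it ever gets more attention than the box does.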