gnfargbl 8 hours ago
Deterministic scrapers are almost certainly the right answer for this task, because once those special snowflakes have paid for their bespoke IT system, they'll never change it. As for the grind, why not get an agent to help you build the long tail of deterministic scrapers? Claude etc. is really shockingly good at this kind of moderate-complexity iterative work; it will just keep going around the fetch/parse/understand loop until it has what you're looking for.
mebkorea 8 hours ago
Yeah, that's essentially what I'm doing. Claude handles most of the "look at the portal, work out the search form, write the config" loop. The actual bottleneck isn't code tbh, it's that every (snowflake) council needs like 30+ minutes of investigation before you can even get going, and a chunk of them dead-end because the portal's broken or migrated. I already hit three this morning: Worcester returns connection refused, Breckland's URL is dead, Rother migrated to a different platform. The grind is "is this portal even alive" more than the scraper itself.
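
A rough sketch of that "is this portal even alive" pass, in Python with only the standard library. The labels, the function names (`classify`, `probe`, `triage`), and the User-Agent string are all made up for illustration. It assumes that a refused connection, an HTTP error, and a redirect to a different host are the three failure modes worth separating, and it is not what either commenter actually runs.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse


def classify(requested_url, final_url=None, status=None, error=None):
    """Map the raw result of one probe onto a coarse triage label."""
    if error is not None:
        # DNS failure, connection refused, timeout, TLS failure, ...
        return "unreachable"
    if status is not None and status >= 400:
        # The server answered, but the page is gone or erroring.
        return "dead_url"
    if final_url and urlparse(final_url).hostname != urlparse(requested_url).hostname:
        # Redirected to another host: possibly a platform migration.
        return "moved"
    return "alive"


def probe(url, timeout=10):
    """Fetch url once and return (label, detail); network errors become labels."""
    req = urllib.request.Request(url, headers={"User-Agent": "portal-triage-sketch"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            final_url = resp.geturl()  # URL after any redirects were followed
            return classify(url, final_url=final_url, status=resp.status), final_url
    except urllib.error.HTTPError as exc:
        # Must come before OSError: HTTPError is a subclass of URLError/OSError.
        return classify(url, status=exc.code), f"HTTP {exc.code}"
    except OSError as exc:
        # URLError, timeouts and connection errors all derive from OSError.
        return classify(url, error=exc), str(exc)


def triage(portals):
    """portals: dict of council name -> search URL. Returns name -> (label, detail)."""
    return {name: probe(url) for name, url in portals.items()}
```

Running `triage` over the whole council list first would let the 30+ minutes of investigation go only to the portals labelled `alive`. The host comparison is deliberately crude: a redirect from a bare domain to its `www.` variant would also be labelled `moved`, so that label is a hint to go and look rather than a verdict.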