cuku0078 2 hours ago

What specific resources are we referring to here? Are AI vendors re-crawling the whole blog repeatedly, or do they use conditional requests (ETag/If-None-Match, Last-Modified/If-Modified-Since) or content hashes to avoid re-fetching unchanged posts? Also: is the scraping volume high enough to cause outages for smaller sites?
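For reference, the conditional-request mechanism I mean looks roughly like this: a polite crawler stores the ETag from its last fetch and sends it back as If-None-Match, so an unchanged post costs a 304 response instead of a full download. A minimal sketch (the URL and helper names are just illustrative):

```python
import urllib.request
import urllib.error

def build_request(url, etag=None):
    """Build a GET that asks the server to skip the body if unchanged."""
    req = urllib.request.Request(url)
    if etag:
        req.add_header("If-None-Match", etag)
    return req

def fetch_if_changed(url, etag=None):
    """Return (status, new_etag, body); body is None on a 304."""
    try:
        with urllib.request.urlopen(build_request(url, etag)) as resp:
            return resp.status, resp.headers.get("ETag"), resp.read()
    except urllib.error.HTTPError as e:
        if e.code == 304:  # Not Modified: the cached copy is still valid
            return 304, etag, None
        raise
```

If the big crawlers did even this much, re-crawls of static blog posts would be close to free for the origin server.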

Separately, I see a bigger issue: blog content gets paraphrased and reproduced by AIs without crediting the author or linking back to the original post. Often you have to explicitly ask the model for sources before it will surface the actual citations.