andrethegiant 5 days ago
I'm working on pure.md[1], which lets your scripts, APIs, apps, agents, etc. reliably access web content in markdown format. Simply prefix any URL with `pure.md/` and you get the unblocked markdown content of that webpage. It avoids bot detection and renders JavaScript-heavy websites, and can convert HTML, PDFs, images, and more into pure markdown. pure.md acts as a global caching layer between LLMs and web content. I like to think of it like a CDN for LLMs, similar to how Cloudinary is a CDN for images.
[1] https://pure.md
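To make the "prefix any URL" step concrete, here is a minimal Python sketch. It assumes the service is reached over HTTPS and that the full target URL, scheme included, goes straight after `pure.md/`; the post does not spell out either detail. The names `to_pure_md_url` and `fetch_markdown` are hypothetical helpers, not part of any pure.md client.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

# Assumption: the service is reachable over HTTPS at this host.
PURE_MD_BASE = "https://pure.md/"


def to_pure_md_url(url: str) -> str:
    """Return the pure.md-prefixed form of an absolute http(s) URL.

    Assumption: the target URL is appended unchanged, scheme included.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return PURE_MD_BASE + url


def fetch_markdown(url: str, timeout: float = 30.0) -> str:
    """Fetch the markdown rendering of `url` through pure.md.

    Hypothetical helper: auth, rate limits, and response encoding are
    not described in the post, so UTF-8 text is assumed here.
    """
    with urlopen(to_pure_md_url(url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `to_pure_md_url("https://example.com/docs")` only builds the prefixed address; `fetch_markdown` performs the actual request and is kept separate so the URL logic can be checked without network access.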
shoebham 5 days ago
Love the recursion redirect at pure.md/pure.md
WillAdams 4 days ago
It seems to miss URLs? At:
https://willadams.gitbook.io/design-into-3d/2d-drawing
the links for:
https://mathcs.clarku.edu/~djoyce/java/elements/elements.htm...
https://mathcs.clarku.edu/~djoyce/java/elements/bookI/bookI....
https://mathcs.clarku.edu/~djoyce/java/elements/bookI/defI1....
are rendered as:
_Elements_ _:_ _Book I_ _:_ _Definition 1_
Maybe detect when a page is on GitBook, or some other site whose .md source lives on GitHub or elsewhere, and grab the original instead?
metadat 4 days ago
Cool project! Recently discussed, too: https://news.ycombinator.com/item?id=43462894 (10 comments)
wild_egg 5 days ago
Thanks for sharing. I was planning on building something like this in April after hitting too many issues with Jina and Tavily, but it looks like you've already done the hard work!
wanderingbit 5 days ago
What a great idea, I will soon be a paying customer. This solves a problem of an app I'm using that I was hesitant to try to develop myself.
hardlyfun 5 days ago
Very nice. How did you manage to bypass sites that have Cloudflare Turnstile set up?
erekp 4 days ago
How exactly do you fall back to Common Crawl? Isn't the cost of even holding and querying Common Crawl insane?
m0rde 4 days ago
Is there an example we can see?
sharpshadow 3 days ago
Works great on mobile, thanks. It's a helpful tool for bypassing flaky websites, JS, and even some paywalls.
udev4096 5 days ago
[flagged]