rickette 8 hours ago

LLMs.txt is also nonsense since it isn't adopted by any of the major AI players.

networked 7 hours ago | parent | next [-]

Google has recently added `llms.txt` to Chrome's Lighthouse check for agentic browsing (https://searchengineland.com/google-llms-txt-chrome-lighthou...), so adoption may be coming. Admittedly, I put more faith in

  <link rel="alternate" type="text/markdown" href="https://example.com/foo.md" title="Markdown version of the &lt;Foo&gt; page">
that I copied from Gwern.net. This convention is discoverable (just read the HTML) and naturally adapts to any website size and structure.
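
To illustrate the "just read the HTML" part, here is a minimal sketch of how a client might discover that link, using only the Python standard library. The names `MarkdownAlternateFinder` and `find_markdown_alternates` are hypothetical, made up for this example; this is not how Gwern.net or any particular crawler does it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MarkdownAlternateFinder(HTMLParser):
    """Collects hrefs of <link rel="alternate" type="text/markdown"> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.alternates: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attributes without a value arrive as None; normalise them to "".
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_markdown = attr.get("type", "").lower() == "text/markdown"
        if "alternate" in rels and is_markdown and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.alternates.append(urljoin(self.base_url, attr["href"]))


def find_markdown_alternates(html: str, base_url: str) -> list[str]:
    """Return absolute URLs of all Markdown alternates declared in the page."""
    finder = MarkdownAlternateFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.alternates
```

Calling `find_markdown_alternates(page_html, "https://example.com/foo")` on a page containing the tag above would return the Markdown URL, and an empty list on a page that declares none.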

I have created an `llms.txt` for my website anyhow. I use a fixed LLM prompt to generate it from the internal links in `index.md`.

iamacyborg 6 hours ago | parent [-]

Giving a markdown version of a page seems like an interesting choice instead of just embedding a schema marked up one

vidarh 6 hours ago | parent | next [-]

Every page on code.claude.com has a markdown version available by just appending ".md", and Claude Code knows about it. E.g.:

https://code.claude.com/docs/en/overview and

https://code.claude.com/docs/en/overview.md

9dev 6 hours ago | parent | next [-]

After some consideration, I also applied this convention to every site I build - including content negotiation: clients can either send an Accept header with their preference, or append an explicit extension (.md|.markdown for Markdown, .json for JSON API responses, or .html for the human HTML page). Together with the content negotiation part, it feels very much like how HTTP was intended to work - especially the fact that API clients, AI agents, and humans all use the same URLs, but get the content in the shape they need.
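
A rough sketch of that selection logic in Python, for anyone curious what it might look like. Everything here is an assumption for illustration: the `EXTENSIONS` table, the media types chosen for each extension, the server preference order in `SUPPORTED`, and the function names are hypothetical and not the implementation described above. It also simplifies real Accept handling (it ignores specificity rules and treats `q=0` only as "skip this range").

```python
import posixpath

# Hypothetical mapping from explicit extension to media type.
EXTENSIONS = {
    ".md": "text/markdown",
    ".markdown": "text/markdown",
    ".json": "application/json",
    ".html": "text/html",
}
# Assumed server preference order; the first entry is the fallback.
SUPPORTED = ["text/html", "text/markdown", "application/json"]


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((media, q))
    # sorted() is stable, so equal weights keep the client's original order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def matches(media_range: str, media_type: str) -> bool:
    """Check a concrete media type against a range such as text/* or */*."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(path: str, accept: str = "") -> tuple[str, str]:
    """Return (resource_path, media_type); an explicit extension wins."""
    root, ext = posixpath.splitext(path)
    if ext.lower() in EXTENSIONS:
        return root, EXTENSIONS[ext.lower()]
    for media_range, q in parse_accept(accept):
        if q <= 0:
            continue
        for media_type in SUPPORTED:
            if matches(media_range, media_type):
                return path, media_type
    return path, SUPPORTED[0]
```

With this sketch, `negotiate("/docs/overview.md")` and `negotiate("/docs/overview", "text/markdown")` both resolve to the same resource path with `text/markdown`, which is the "same URL, different shape" property. A real server would also want to send `Vary: Accept` so caches keep the representations apart.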

vidarh 5 hours ago | parent [-]

I've done this off and on for various sites over the years too, and probably should be more consistent about it. A number of sites do or used to do some variation of this, and I wish it was more widespread. E.g. Reddit will serve up a JSON version of a subreddit if you do /r/subreddit.json

mceachen an hour ago | parent | prev | next [-]

Here's how to do it with more recent versions of Hugo:

https://photostructure.com/coding/hugo-markdown-output/

(It includes the grandparent's head link suggestion, but it's not just "change .html to .md" because I'm old skool and as a wee nerd was told that URLs ending in .html or .php or whatever were frowned on, so the above link's markdown is available by appending /index.md )

chrisweekly 2 hours ago | parent | prev [-]

It gets even more "interesting" for markdown-based systems like Astro or Obsidian Publish: author in md -> ship html && optionally serve md?

dspillett 7 hours ago | parent | prev | next [-]

The same could be said of robots.txt

And anything else that might tell them not to access something.

reddalo 4 hours ago | parent [-]

robots.txt predates the modern web though

dspillett 14 minutes ago | parent [-]

My point was that llms.txt not working is no different from them ignoring everything else that came before and probably everything that is yet to come.

If they want it, they will take it; polite directives in text files will have no effect.

pfannl 3 hours ago | parent | prev [-]

To be fair, "not adopted by any major AI player" is probably the most web-standard-compliant phase of a new web standard.