llms.txt Generator for AI Agents

Build the curated Markdown index that ChatGPT, Claude, Perplexity and Gemini read in one fetch, with best-practice checks.

This llms.txt generator builds the small Markdown file AI agents like ChatGPT, Claude, Perplexity and Gemini read to understand your site in a single fetch. It lives at the root of your domain, right next to robots.txt, but does the opposite job: instead of blocking crawlers, it hands an agent a curated shortlist of the pages worth reading. Fill in a project name, a one-line summary, a context paragraph, then group your real links under Docs, Examples and Optional. The file rebuilds as you type, with no submit button and nothing sent to a server. A Best-practice checks tab watches over your shoulder: it flags a missing H1 or summary, counts your links and sections, and estimates the token cost the way a model would see it. You also get the longer llms-full.txt scaffold for inlining content, a live Markdown preview, and copy or download buttons. Everything stays in your browser.

100% in your browser. Nothing you type ever leaves this page.

llms.txt builder for AI agents

Think of llms.txt as a note you leave for an AI agent before it wanders off and invents things about your site. Tiny Markdown file. Jeremy Howard floated the idea in 2024, and by 2026 the big agents (ChatGPT, Claude, Perplexity, Gemini) actually read it. Sitemaps and robots.txt get written for crawlers. This one's for the model. Fill in the form and I'll build your llms.txt right here in the browser, plus the longer llms-full.txt if you want it. You point at the pages that matter, in the order they matter, and the bot quits scraping your noisy homepage to guess the answer.

Drop the file you get at https://yourdomain.com/llms.txt and that's it, you're done. Skip llms-full.txt unless you actually want to paste in your docs instead of just linking to them. That's about the only time it earns its keep.

What an llms.txt generator does, and why it matters in 2026

An llms.txt generator builds the small, plain Markdown file an AI agent reads to understand your site in a single fetch. Jeremy Howard proposed the format in late 2024, and the AI crowd picked it up fast through 2025 and on into 2026. It lives at the root of your domain, right next to robots.txt, except it does the opposite job. Where robots.txt says "stay out of these paths," this one says "here are the pages worth reading if someone asks about us." A typical file is barely anything: your project name, a one-line summary, a paragraph or two of context, then a short list of links grouped under headings like Docs, Examples and Optional. Ask an assistant about your product and it can pull this one little file in a single fetch and know exactly where the real answer lives, instead of scraping your homepage or leaning on some stale Common Crawl snapshot from eight months back.

Here's why the shape of it matters: models read tokens, not your CSS. Thirty lines of Markdown with ten hand-picked links is dirt cheap to swallow. Crawling 200 pages and guessing which one's canonical, not so much. ChatGPT's browsing mode, Perplexity, Claude with web access and a pile of open-source agent frameworks can check for /llms.txt when they're building an answer, though not every agent does yet. And honestly, from what I've seen, the sites that ship a clean one tend to get quoted more, and quoted with the right context, than the ones that just leave the agent to fend for itself. Could be selection bias on my part. But that's the pattern.

How this llms.txt generator works

Everything happens as you type. The file rebuilds itself in the browser, no submit button, no round trip to a server. You need four things to start: a project name, the site URL, a one-line summary and a paragraph of context in your own words. Then come three optional link lists. Docs is your real documentation. Examples is the hands-on stuff, your tutorials and walkthroughs, the "here's how you actually do it" pages. Optional catches the nice-to-haves that don't fit either bucket. Every link gets a short title (that's the anchor text) and a one-line description the agent reads sitting right next to the URL. While you edit, the Best-practice checks tab is watching over your shoulder. It flags what's missing, tells you when a link has run too long, points out when two descriptions are basically the same sentence, and gives you a rough token cost for the whole thing.
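To make those checks concrete, here is a minimal Python sketch of the kind of linting described above. It is not this tool's actual code: the function name `check_llms_txt`, the 120-character URL threshold and the "about four characters per token" estimate are all assumptions picked for illustration.

```python
import re

# Matches "- [Title](url): description"; the description part is optional.
LINK_RE = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")
MAX_URL_LEN = 120  # assumed threshold, not taken from the tool


def check_llms_txt(text: str) -> dict:
    """Return warnings plus simple counts for an llms.txt draft."""
    lines = text.splitlines()
    warnings = []

    # The H1 has to be the very first line of the file.
    if not lines or not lines[0].startswith("# "):
        warnings.append("Missing H1 on the first line")
    # The one-line summary is a blockquote line starting with ">".
    if not any(ln.startswith("> ") for ln in lines):
        warnings.append("Missing blockquote summary")

    sections = [ln[3:].strip() for ln in lines if ln.startswith("## ")]
    links = [m for ln in lines if (m := LINK_RE.match(ln))]

    seen = set()
    for m in links:
        title, url, desc = m.group(1), m.group(2), (m.group(3) or "").strip()
        if len(url) > MAX_URL_LEN:
            warnings.append(f"Long link: {title}")
        key = desc.lower()
        if key and key in seen:
            warnings.append(f"Duplicate description: {desc}")
        seen.add(key)

    return {
        "warnings": warnings,
        "sections": len(sections),
        "links": len(links),
        # Rough heuristic (assumption): about four characters per token.
        "approx_tokens": len(text) // 4,
    }
```

A real tokenizer would give a tighter number than the character heuristic, but for a file this small the rough figure is usually enough to tell you whether you are in "cheap" or "bloated" territory.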

Start with the H1 line: one hash, then your project name. It has to be the very first line. No exceptions, the spec is strict about that.
Add the summary: a blockquote (the line starting with >). This is the one sentence an agent skims to decide whether your file's even worth reading, so make it count.
Add context: a paragraph or two, plain Markdown, saying what you do and what kind of questions this file's here to answer.
Group links under H2 headings: Docs, Examples, Optional, or whatever buckets fit you. Each entry's a bullet with a Markdown link and a description after a colon.
Optionally publish llms-full.txt: the long cousin, with your actual docs pasted in. Reach for it only when you'd rather the agent answer straight off without going back for more.
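The steps above can be sketched as a small Python function. This is an illustration under stated assumptions, not the generator's own implementation: `build_llms_txt` and its parameters are hypothetical names, and the URLs in the usage example are placeholders.

```python
def build_llms_txt(name, summary, context="", sections=None):
    """Assemble an llms.txt string: H1, blockquote summary, context, link groups.

    `sections` maps an H2 heading to a list of (title, url, description) tuples.
    """
    if not name.strip():
        raise ValueError("project name is required")

    parts = [f"# {name.strip()}"]          # H1 must be the first line
    if summary.strip():
        parts.append(f"> {summary.strip()}")  # one-line summary as a blockquote
    if context.strip():
        parts.append(context.strip())      # free-text context paragraph(s)

    for heading, links in (sections or {}).items():
        if not links:
            continue  # skip empty groups instead of emitting a bare heading
        bullets = [
            f"- [{title}]({url}): {desc}" if desc else f"- [{title}]({url})"
            for title, url, desc in links
        ]
        parts.append(f"## {heading}\n\n" + "\n".join(bullets))

    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    print(build_llms_txt(
        "Example Project",
        "A placeholder summary for illustration.",
        "Use this file to find the pages that answer common questions.",
        {"Docs": [("Getting started", "https://example.com/start", "Install and first run")]},
    ))
```

Keeping each block separated by a blank line means the output stays readable as plain text and parses cleanly as Markdown.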

Common use cases for publishing llms.txt

SaaS documentation site. Point at your "Getting started," "Reference," "Pricing" and "Support" pages, and an assistant can field "how do I do X with your product" without ever wading through your marketing copy.
API publisher. Hand the agent your OpenAPI spec, the SDK docs, the changelog. Anything that helps a developer wire up your API lands on the right page in one shot, no digging.
Open-source library. List the README, CONTRIBUTING, the API reference, the FAQ. Coding assistants like Claude Code and ChatGPT Code Interpreter lean on exactly these to answer "how do I install this" and "what's the current version".
Knowledge base or help centre. Flag the categories that actually matter (refunds, account, security) so your support bot grounds its answers in your real content instead of improvising.
Tools hub like PeopleAreGeek. Sort tools by hub (Network, SEO, Cyber) and an agent fielding "is my DNS propagated yet" sends the person straight to the propagation checker.
Personal portfolio or blog. Drop in your best-known articles and a short bio. Asked "who is [your name]?", the agent reads that instead of scraping some half-abandoned profile from years ago.

Limitations and adoption notes

Let me be straight with you: this thing is still young. Adoption's climbing fast in 2026, but plenty of agents don't bother checking yet, and nothing forces them to. An agent can ignore your llms.txt exactly the way it can ignore robots.txt. So treat it as a strong invitation, not a binding contract. The upside is that the cost is almost nothing. Most files come in under 100 lines, and publishing one is just dropping a single Markdown file at the root of your domain. The trap, the one that gets people, is going stale. Reshuffle your docs and forget to update this, and you'll have agents cheerfully pointing folks at dead links. As for llms-full.txt: it buys you fewer follow-up fetches at the price of a bigger file. So only ship it if you genuinely want agents chewing through long content in one go. And everything you enter stays on your machine. Your name, the summary, the paragraph, the whole link list, handled right here in the browser. Nothing gets sent to PeopleAreGeek. The full proposal lives at llmstxt.org if you want the source.
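One way to guard against the staleness problem is to re-check the links in your published file on a schedule. The sketch below is an assumption-laden illustration, not part of this tool: `extract_links`, `find_dead_links` and `http_is_alive` are hypothetical helper names, and the liveness check is passed in as a function so you can swap in whatever HTTP client you already use.

```python
import re
import urllib.error
import urllib.request
from typing import Callable

# Captures (title, url) from Markdown links that point at http(s) URLs.
MD_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def extract_links(llms_txt: str) -> list[tuple[str, str]]:
    """Return every (title, url) pair found in the file, in order."""
    return MD_LINK.findall(llms_txt)


def find_dead_links(llms_txt: str, is_alive: Callable[[str], bool]) -> list[str]:
    """Return the URLs for which the supplied liveness check fails."""
    return [url for _title, url in extract_links(llms_txt) if not is_alive(url)]


def http_is_alive(url: str, timeout: float = 10.0) -> bool:
    """Liveness check using a HEAD request from the standard library.

    Some servers reject HEAD; treating that as "dead" is a simplification.
    """
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, TimeoutError):
        return False
```

Running `find_dead_links(text, http_is_alive)` after every docs reshuffle is a cheap way to catch broken entries before an agent does.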

Frequently asked questions

How is llms.txt different from robots.txt and sitemap.xml?

Different jobs entirely. robots.txt tells crawlers where not to go. sitemap.xml dumps every URL you've got so search engines can index them. llms.txt is the hand-picked, human-written shortlist of the pages that actually matter to an AI agent. Short, opinionated, written to be read by a model instead of just parsed by a machine.

Do I need llms.txt if I already have a sitemap?

They're not interchangeable. A sitemap is for Google to index you for search. llms.txt is for an AI agent to answer a question about you without crawling the whole place first. Two different jobs. Honestly, most teams I've seen just publish both and move on.

Where do I upload the generated file?

Root of your domain, at https://yourdomain.com/llms.txt, same place robots.txt lives. Then hit that URL with curl to confirm it's really being served and not 404ing or redirecting somewhere weird. A few agents will also peek at https://yourdomain.com/llms-full.txt if you've published the longer one.
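If you would rather script that check than eyeball curl output, here is a minimal Python sketch using only the standard library. The function names are hypothetical, and the rules it applies (status 200, same path after redirects, body starting with an H1) are assumptions about what "really being served" should mean, not requirements from the llms.txt proposal.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def evaluate_response(requested_url, final_url, status, body):
    """Return a list of problems; an empty list means the file looks fine."""
    problems = []
    if status != 200:
        problems.append(f"unexpected status {status}")
    # A scheme or host change (http to https, adding www) is tolerated here;
    # a different path suggests a redirect to some other page.
    if urlsplit(final_url).path != urlsplit(requested_url).path:
        problems.append(f"redirected to {final_url}")
    if not body.lstrip().startswith("# "):
        problems.append("body does not start with an H1 line")
    return problems


def check_llms_txt_url(url, timeout=10.0):
    """Fetch the URL and run the checks above on the live response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate_response(url, resp.url, resp.status, body)
    except urllib.error.HTTPError as err:
        return [f"unexpected status {err.code}"]
    except urllib.error.URLError as err:
        return [f"request failed: {err.reason}"]
```

Call it as `check_llms_txt_url("https://yourdomain.com/llms.txt")`; an empty list back means the file is served from the expected path and starts the way it should.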

Should the file include every page on my site?

Please don't. The whole point is that it's curated, not a phone book. Put in the handful of entry points an agent needs to field the questions people actually ask about you, and let sitemap.xml carry the rest. A bloated llms.txt just drowns the links that mattered.

Will AI agents always honour what I put in llms.txt?

About as reliably as they honour robots.txt, which is to say the big names (Perplexity, ChatGPT, Claude, Gemini) read it and the random anonymous scraper probably won't. No enforcement here, no cop standing at the door. It's a strong invitation that the well-behaved agents accept, and honestly that's most of the traffic you actually care about.