llms.txt for Medical Websites: What It Is and How to Set It Up for Your Clinic

What llms.txt is and why it matters for clinics

Fewer than 8% of medical websites have an llms.txt file, according to Clingeo audit data covering 334 clinics. That gap is a concrete opportunity: in the same dataset, clinics with a correctly structured file show 20–35% higher citation rates in AI search results than those without one.

llms.txt is a plain-text file placed at the root of your domain (e.g., yourclinic.com/llms.txt). It describes your organisation in a structured, machine-readable format intended for AI systems. Think of the relationship this way: sitemap.xml tells search engine crawlers where your pages are; llms.txt tells AI systems who you are and what you do.

The convention was proposed by Answer.AI co-founder Jeremy Howard in September 2024 as a practical way for websites to communicate identity and context to large language models. It is not an official W3C or IETF standard, and adoption so far is informal: Anthropic is among the companies that publish an llms.txt for their own documentation, but no major AI provider has publicly documented how its crawler uses the file. The specification is maintained at llmstxt.org.

Medical websites benefit from this file more than most sectors. Healthcare content is classified as YMYL (Your Money or Your Life) — content where accuracy directly affects a person's health decisions. AI systems apply stricter selection criteria to YMYL sources. A well-formed llms.txt gives AI models explicit confidence signals: your clinic's name, specialties, credentials, and physician identities are all stated directly rather than having to be inferred from fragmented page content.

If you want a broader view of how technical setup connects to visibility, see our guide on technical GEO for medical websites.

How AI crawlers use llms.txt (vs robots.txt)

The two files serve different purposes and work best when used together. robots.txt controls access — it tells crawlers which paths they may or may not visit. llms.txt provides context — it tells AI systems what your organisation is, what it treats, and who works there. One manages permissions; the other manages understanding.

The main AI crawlers active today are:

| Crawler | Parent company | What it indexes | Typical re-index frequency |
|---|---|---|---|
| GPTBot | OpenAI | Training data for OpenAI models, including those behind ChatGPT | Every 4–8 weeks (active sites) |
| ClaudeBot | Anthropic | Training data for Claude models | Every 4–8 weeks (active sites) |
| PerplexityBot | Perplexity AI | Real-time retrieval for Perplexity search answers | Frequent; near-real-time on priority sources |
| Google-Extended | Google | Not a separate crawler: a robots.txt token that controls whether content fetched by Googlebot is used for Gemini training and grounding | Variable; aligns with Googlebot schedule |

GPTBot was launched by OpenAI in August 2023. To ensure these crawlers can access your llms.txt (and the rest of your site), add a dedicated AI bot block to your robots.txt. Here is a clean example:

# AI crawlers — allow full access
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Anthropic-AI
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: OAI-SearchBot
Allow: /

If your robots.txt currently has a catch-all disallow or blocks specific bots, check that none of the above are inadvertently blocked. This is one of the most common technical issues uncovered when you audit your AI visibility.
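The check described above can be sketched with Python's standard-library robots.txt parser. This is a minimal sketch, not a full audit. The bot list is copied from the example block above, and the sample robots.txt and the yourclinic.com URL are hypothetical. Python's parser also matches rules slightly differently from each real crawler, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the robots.txt example above.
AI_BOTS = ["GPTBot", "ClaudeBot", "Anthropic-AI", "PerplexityBot",
           "Google-Extended", "OAI-SearchBot"]


def blocked_ai_bots(robots_txt: str, url: str) -> list[str]:
    """Return the AI bot tokens that robots_txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt: a catch-all disallow with only GPTBot allowed.
sample = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

print(blocked_ai_bots(sample, "https://yourclinic.com/llms.txt"))
```

With this sample, every bot except GPTBot falls through to the catch-all rule and is reported as blocked. To check a live site, fetch its robots.txt and pass the text to the same function.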

What a clinic's llms.txt should contain: the 6 required sections

Keep the file focused. AI crawlers that read llms.txt as part of a RAG (Retrieval-Augmented Generation) pipeline work best with concise, structured content. Aim for 400–800 words total. The six sections below cover everything an AI system needs to understand your clinic and cite it correctly.

Section 1: Organisation name and one-line description

State your full legal or trading name and a single sentence that describes what your clinic does, who it serves, and where it is. This becomes the primary named entity that AI systems attach all subsequent facts to.

Section 2: Location(s) with full address

Include the full postal address for each location. AI systems use this for local queries such as "dermatologist near Kyiv city centre". Consistent formatting with your Schema.org MedicalOrganization markup helps AI models recognise the same entity across sources.
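To illustrate what consistent formatting means in practice, here is a rough sketch that builds a Schema.org MedicalOrganization object in Python and derives the llms.txt address line from the same data, so the two cannot drift apart. The clinic values are the placeholder ones from the Meridian Medical Centre example later in this article. The helper name and the country lookup table are assumptions made for illustration.

```python
import json

# Placeholder values from the Meridian Medical Centre example in this article.
organisation = {
    "@context": "https://schema.org",
    "@type": "MedicalOrganization",
    "name": "Meridian Medical Centre",
    "url": "https://meridianmed.ua",
    "telephone": "+380 44 123 4567",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "12 Khreshchatyk Street",
        "addressLocality": "Kyiv",
        "postalCode": "01001",
        "addressCountry": "UA",
    },
}

# Hypothetical lookup from ISO country code to the name used in llms.txt.
COUNTRY_NAMES = {"UA": "Ukraine"}


def llms_txt_address(org: dict) -> str:
    """Format the PostalAddress as a single llms.txt address line."""
    a = org["address"]
    country = COUNTRY_NAMES[a["addressCountry"]]
    return f'{a["streetAddress"]}, {a["addressLocality"]}, {a["postalCode"]}, {country}'


# JSON-LD text to embed in a <script type="application/ld+json"> tag.
json_ld = json.dumps(organisation, indent=2, ensure_ascii=False)

print(llms_txt_address(organisation))
```

Generating both outputs from one source record is a design choice, not a requirement. Any process that keeps the name and address identical in both places achieves the same goal.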

Section 3: Medical specialties offered

List each specialty as a separate line item. Use the same terminology as the medicalSpecialty values in your Schema.org markup and as your page headings. Inconsistent names (e.g., "orthopaedic surgery" in llms.txt vs "orthopaedics" in Schema) make it harder for AI systems to recognise both as the same named entity.

Section 4: Physician names, specialties, and profile URLs

This is the highest-value section for medical websites. When someone asks an AI system "who is the best cardiologist in [city]?", the AI needs a doctor's full name, specialty, and a URL that proves the claim. For each physician, provide: full name, specialty, and the URL of their profile page.

Section 5: Key content pages

List URLs for your most important procedure pages, condition pages, and FAQs. These are the pages you want AI systems to pull from when answering patient questions. Do not include booking URLs, pricing pages, or admin paths — these have no value for AI retrieval and add noise.
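If you build the key-pages list from a sitemap or CMS export, the exclusion rule above can be applied automatically. The sketch below assumes that booking, pricing, admin, and login pages live under the path prefixes shown. Those prefixes are hypothetical and need adjusting to your own URL structure.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes; adjust to your own site's URL structure.
EXCLUDED_PREFIXES = ("/booking", "/pricing", "/admin", "/login")


def key_content_urls(urls: list[str]) -> list[str]:
    """Keep only URLs whose path does not start with an excluded prefix."""
    keep = []
    for url in urls:
        path = urlparse(url).path.lower()
        if not path.startswith(EXCLUDED_PREFIXES):
            keep.append(url)
    return keep


candidates = [
    "https://meridianmed.ua/services/cardiology",
    "https://meridianmed.ua/booking/new",
    "https://meridianmed.ua/faq",
    "https://meridianmed.ua/admin/users",
    "https://meridianmed.ua/pricing",
]

for url in key_content_urls(candidates):
    print(url)
```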

Section 6: Accreditations, certifications, and licences

State any accreditations (JCI, ISO, national health ministry registration) and their issuing bodies. This is a direct E-E-A-T signal. AI models weigh accreditation data heavily when deciding whether a medical source is trustworthy enough to cite.

A complete llms.txt example for a multi-specialty clinic

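Below is a full example. Replace the placeholder values with your clinic's real data. The file uses Markdown-style hash headers with simple key: value lines and dash-prefixed lists; no special syntax is required. Note that the llmstxt.org proposal itself describes a stricter layout (a single H1 title, a blockquote summary, and H2 sections containing lists of Markdown links), so treat the structure below as this guide's six-section framework rather than the proposal's exact format.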

# Organisation
name: Meridian Medical Centre
description: Multi-specialty outpatient clinic in Kyiv, Ukraine, providing cardiology, orthopaedics, and dermatology services since 2009.
website: https://meridianmed.ua
language: uk, en

# Locations
- name: Kyiv Main
  address: 12 Khreshchatyk Street, Kyiv, 01001, Ukraine
  phone: +380 44 123 4567
  email: info@meridianmed.ua

# Medical Specialties
- Cardiology
- Orthopaedic Surgery
- Dermatology
- Internal Medicine
- Paediatrics

# Physicians
- name: Dr Olena Kovalenko
  specialty: Cardiology
  profile: https://meridianmed.ua/doctors/olena-kovalenko

- name: Dr Andriy Bondarenko
  specialty: Orthopaedic Surgery
  profile: https://meridianmed.ua/doctors/andriy-bondarenko

- name: Dr Iryna Savchenko
  specialty: Dermatology
  profile: https://meridianmed.ua/doctors/iryna-savchenko

# Key Content Pages
- url: https://meridianmed.ua/services/cardiology
  title: Cardiology Services

- url: https://meridianmed.ua/services/orthopaedics
  title: Orthopaedic Surgery

- url: https://meridianmed.ua/services/dermatology
  title: Dermatology

- url: https://meridianmed.ua/faq
  title: Patient FAQ

- url: https://meridianmed.ua/conditions/arrhythmia
  title: Arrhythmia Treatment

# Accreditations
- Ministry of Health of Ukraine — Medical Practice Licence ML-2009-0441
- ISO 9001:2015 — Quality Management System (certified 2021)
- Ukrainian Medical Association — Member since 2010

A few structural points worth noting: the file uses hash-prefixed section headers for readability. Each physician entry is a discrete block, with name, specialty, and URL on separate lines. The description in the Organisation section should mirror the meta description of your homepage; consistency across sources strengthens entity recognition.
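A quick completeness check for the six sections might look like the following. It is a minimal sketch that assumes the section headers are spelled exactly as in the example above. It checks only that each header is present, not that the section has content.

```python
# Section names as they appear in the example file above.
REQUIRED_SECTIONS = [
    "Organisation", "Locations", "Medical Specialties",
    "Physicians", "Key Content Pages", "Accreditations",
]


def missing_sections(llms_txt: str) -> list[str]:
    """Return required section names that have no matching '# ' header line."""
    found = {
        line[2:].strip().lower()
        for line in llms_txt.splitlines()
        if line.startswith("# ")
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]


# Hypothetical unfinished draft with only three of the six sections.
draft = """\
# Organisation
name: Meridian Medical Centre

# Locations
- name: Kyiv Main

# Medical Specialties
- Cardiology
"""

print(missing_sections(draft))
# ['Physicians', 'Key Content Pages', 'Accreditations']
```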

What to leave out: booking system URLs, patient portal login pages, pricing tables, and any admin or staff-only paths. These pages either require authentication (so AI crawlers cannot read them anyway) or contain pricing data that you may not want surfaced out of context.

Common mistakes that reduce llms.txt effectiveness

The file is simple to create, but easy to get wrong in ways that quietly reduce its value.

File too long. An llms.txt over 2,000 words risks being truncated. The systems that consume the file give each document a limited share of the model's context window, and content beyond that share is ignored. Keep the file between 400 and 800 words.
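The length guidance is easy to check automatically. The sketch below uses the thresholds quoted in this article (400 to 800 words as the target, 2,000 as the point where truncation becomes likely). Counting whitespace-separated tokens is an approximation, since markers such as "#" and "-" are counted as words.

```python
def word_count_status(llms_txt: str, low: int = 400, high: int = 800,
                      hard_limit: int = 2000) -> tuple[int, str]:
    """Count whitespace-separated tokens and classify the file length."""
    words = len(llms_txt.split())
    if words > hard_limit:
        status = "too long: likely to be truncated"
    elif words > high:
        status = "above target range"
    elif words < low:
        status = "below target range"
    else:
        status = "ok"
    return words, status


# Hypothetical stand-in text; in practice, read your real llms.txt file.
print(word_count_status("word " * 500))   # (500, 'ok')
print(word_count_status("word " * 2500))  # (2500, 'too long: likely to be truncated')
```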

Inconsistent entity names. If your llms.txt says "Meridian Clinic" but your Schema.org markup says "Meridian Medical Centre" and your Google Business Profile says "Meridian Med", AI systems treat these as potentially different entities. Pick one canonical name and use it everywhere.
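A simple way to catch this is to compare the name recorded in each source against your chosen canonical name. The sketch below uses the names from the example above. The source labels are illustrative, and in practice you would collect the names by hand or from each platform's export. Matching ignores only letter case and extra whitespace, which is a deliberately strict assumption.

```python
def normalise(name: str) -> str:
    """Collapse whitespace and ignore letter case."""
    return " ".join(name.split()).casefold()


def mismatched_sources(canonical: str, names_by_source: dict[str, str]) -> list[str]:
    """Return the sources whose recorded name differs from the canonical one."""
    target = normalise(canonical)
    return [source for source, name in names_by_source.items()
            if normalise(name) != target]


names = {
    "llms.txt": "Meridian Clinic",
    "Schema.org": "Meridian Medical Centre",
    "Google Business Profile": "Meridian Med",
}

print(mismatched_sources("Meridian Medical Centre", names))
# ['llms.txt', 'Google Business Profile']
```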

Missing physician profiles section. For medical websites, the physician list is the single most impactful section. Queries like "recommend a neurologist" return names, not clinic brands. If your doctors are not listed in llms.txt with profile URLs, they are unlikely to appear.

Not updating after changes. When you add a new physician, open a new location, or launch a new specialty, the llms.txt needs updating at the same time. Stale data erodes trust scores over time as AI systems compare llms.txt content against live page content.
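One way to catch stale physician data is to compare the profile URLs listed in llms.txt with the profile pages that currently exist on the site. This sketch assumes the "profile:" line format used in the example file in this article. The set of live URLs is a hypothetical input that you might take from your sitemap or CMS.

```python
def profile_urls(llms_txt: str) -> set[str]:
    """Collect the URLs from lines of the form 'profile: <url>'."""
    urls = set()
    for line in llms_txt.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "profile":
            urls.add(value.strip())
    return urls


def physician_drift(llms_txt: str, live_urls: set[str]) -> dict[str, list[str]]:
    """Report profile pages missing from llms.txt and entries no longer live."""
    listed = profile_urls(llms_txt)
    return {
        "missing_from_llms_txt": sorted(live_urls - listed),
        "no_longer_live": sorted(listed - live_urls),
    }


current_file = """\
# Physicians
- name: Dr Olena Kovalenko
  specialty: Cardiology
  profile: https://meridianmed.ua/doctors/olena-kovalenko
"""

# Hypothetical live profile pages, e.g. taken from the sitemap.
live = {
    "https://meridianmed.ua/doctors/olena-kovalenko",
    "https://meridianmed.ua/doctors/andriy-bondarenko",
}

print(physician_drift(current_file, live))
```

Here the second physician has a live profile page but no llms.txt entry, so that URL is reported under missing_from_llms_txt.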

Missing robots.txt rules for AI bots. An llms.txt file means nothing if GPTBot or ClaudeBot is blocked from accessing it. Always set the AI bot allow block in robots.txt alongside creating the llms.txt file. These two files are a pair, not alternatives.

How to verify your llms.txt is being read

Once the file is live, there are three practical ways to check it is working.

Manual query test. In ChatGPT with browsing enabled and in Perplexity, run the query "tell me about [your clinic name]". If the summary returned matches the description and specialties in your llms.txt, that is a good sign that your clinic is being understood correctly, though not proof that the file itself was read, since the same facts also appear on your pages. If the response is vague or incorrect, there is either an access issue or the file was recently created and has not yet been crawled.

Server log analysis. Filter your access logs for requests to /llms.txt. You should see hits from GPTBot and ClaudeBot user-agent strings within 4–8 weeks of publishing the file on an active medical website. If there are no hits after 8 weeks, suspect a robots.txt block or a crawl priority issue.
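The log filter can be sketched in a few lines of Python. This assumes access logs in the common "combined" format used by Apache and nginx, where the request line and user-agent string appear in double quotes. The sample lines are invented for illustration, and real user-agent strings are longer. Google-Extended is left out because it is a robots.txt token rather than a user-agent string that appears in requests.

```python
import re
from collections import Counter

# Tokens to look for inside the user-agent string of each request.
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot")

# Matches a quoted request line for /llms.txt, with an optional query string.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) /llms\.txt(?:\?\S*)? HTTP/[\d.]+"')


def llms_txt_hits(log_lines) -> Counter:
    """Count requests for /llms.txt per AI bot token."""
    hits = Counter()
    for line in log_lines:
        if not REQUEST_RE.search(line):
            continue
        lowered = line.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                hits[token] += 1
                break
    return hits


# Invented sample lines in combined log format.
sample_log = [
    '203.0.113.5 - - [01/Mar/2025:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 1843 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.9 - - [02/Mar/2025:11:30:00 +0000] "GET /llms.txt HTTP/1.1" 200 1843 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [02/Mar/2025:12:00:00 +0000] "GET /services/cardiology HTTP/1.1" 200 9120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.8 - - [03/Mar/2025:09:15:00 +0000] "GET /llms.txt HTTP/1.1" 200 1843 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

print(dict(llms_txt_hits(sample_log)))  # {'GPTBot': 1, 'ClaudeBot': 1}
```

To run it against a real log, pass an open file object instead of the sample list, since the function accepts any iterable of lines.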

Clingeo technical audit. Clingeo's audit tool checks AI crawler access, validates llms.txt structure against the six-section framework, and returns a completeness score. It also flags robots.txt conflicts that would prevent the file from being read. This is the fastest way to get a definitive answer, particularly if you are managing multiple clinic locations.

For a full picture of your clinic's current AI search performance, you can audit your AI visibility using a structured checklist.

Get your clinic indexed by AI search engines

Clingeo audits your clinic's AI search visibility across ChatGPT, Perplexity, Gemini, and Google AI Overviews (formerly SGE), checking llms.txt completeness, robots.txt AI bot rules, Schema.org markup, and citation frequency. Start with a free audit to see exactly where your clinic stands today.

FAQ

Is llms.txt required for ChatGPT to find my clinic?

No, it is not required — ChatGPT can index your site without it. But clinics with a correctly structured llms.txt show 20–35% higher citation rates, according to Clingeo data from 334 clinic audits. The file makes it significantly easier for AI systems to understand and cite you accurately.

How is llms.txt different from structured data (Schema.org)?

Schema.org markup is embedded in individual page HTML and tells search engines about the content of that specific page. llms.txt is a single file at the domain root that gives AI systems a high-level summary of your entire organisation. The two approaches complement each other — Schema provides page-level detail; llms.txt provides entity-level context.

How often should I update my llms.txt?

Update it any time you add a physician, open a new location, launch a new specialty, or change your accreditation status. AI crawlers typically re-index active medical sites every 4–8 weeks, so changes will be picked up within that window after you update the file.

Can llms.txt harm my traditional SEO rankings?

No. Traditional search engine crawlers such as Googlebot do not use llms.txt — it has no effect, positive or negative, on your standard Google rankings. The file is read only by AI-specific crawlers and browsing agents.

What if I have multiple clinic locations?

List each location as a separate entry in the Locations section, with its own full address and phone number. If the locations operate under different brand names, create a separate llms.txt file at each domain or subdomain. A single file covering all locations is fine when they share one domain and one brand name.