The Problem We’re Solving
The internet is undergoing its most profound transformation since the dawn of the search engine. The open web has become the primary dataset for a new generation of artificial intelligence, yet this new relationship operates without a clear framework for consent or control.
The Current State
Today’s AI landscape operates on an implicit contract:
- AI companies scrape content without explicit permission
- Publishers have no standardized way to communicate their preferences
- The robots.txt protocol only controls access, not usage
- Content creators lack control over how their work is used for AI training
The robots.txt protocol answers: “Can a bot access this URL?” But it was never designed to answer the critical question: “Once accessed, what are you permitted to do with the content?”
Our Solution: Explicit Over Implicit
llmtag moves the web from an ambiguous, implicit contract to an explicit, transparent one.
Core Philosophy
Separation of Concerns
- robots.txt handles Access Control
- llmtag.txt handles Usage Control
AI agents must first be allowed to access a URL by robots.txt before they can read the usage policies in llmtag.txt.
Explicit over Implicit
If a rule is not defined, the default policy (allow) applies. This encourages participation without breaking functionality. Publishers can opt for a stricter default (disallow) if they choose.
Granularity and Extensibility
The standard is designed to be powerful, allowing rules to be set per-agent, per-path, and even per-content-type.
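One way to picture how per-agent, per-path, and per-content-type rules could combine with a default policy is the minimal sketch below. The `Rule` fields, the `resolve` function, and the "most specific rule wins" precedence are all assumptions made for illustration; the text above does not define a data model or a precedence order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One hypothetical usage rule; field names are illustrative, not from the spec."""
    policy: str                         # "allow" or "disallow"
    agent: Optional[str] = None         # None matches any agent
    path_prefix: str = "/"              # rule applies to paths starting with this
    content_type: Optional[str] = None  # None matches any content type


def resolve(rules, agent, path, content_type, default="allow"):
    """Return the policy of the most specific matching rule, else the default."""
    best, best_score = None, None
    for rule in rules:
        if rule.agent is not None and rule.agent != agent:
            continue
        if not path.startswith(rule.path_prefix):
            continue
        if rule.content_type is not None and rule.content_type != content_type:
            continue
        # Assumed specificity order: named agent, longer path prefix, named content type.
        score = (rule.agent is not None, len(rule.path_prefix),
                 rule.content_type is not None)
        if best_score is None or score > best_score:
            best, best_score = rule, score
    return best.policy if best else default


rules = [
    Rule("disallow", path_prefix="/premium/"),
    Rule("allow", agent="ExampleBot", path_prefix="/premium/previews/"),
    Rule("disallow", content_type="image/png"),
]

print(resolve(rules, "ExampleBot", "/blog/post", "text/html"))           # allow (default)
print(resolve(rules, "OtherBot", "/premium/report", "text/html"))        # disallow
print(resolve(rules, "ExampleBot", "/premium/previews/1", "text/html"))  # allow
```

Passing `default="disallow"` models a publisher who opts for the stricter default.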
Machine-Readable
Policies are expressed in a structured, parseable format that AI agents can automatically understand and implement.
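To illustrate what "structured, parseable" could mean in practice, here is a minimal parsing sketch. It assumes a robots.txt-style layout of `key: value` lines with `#` comments; that layout, the `parse_policy` helper, and the directive names in the sample are hypothetical and are not taken from the specification.

```python
def parse_policy(text):
    """Parse hypothetical `key: value` lines into a list of (key, value) pairs."""
    directives = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and surrounding whitespace
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue                           # skip lines that are not `key: value`
        directives.append((key.strip().lower(), value.strip()))
    return directives


sample = """\
# hypothetical example, not taken from the specification
User-agent: ExampleBot
Training: disallow   # inline comment
this line is ignored
"""

print(parse_policy(sample))
# [('user-agent', 'ExampleBot'), ('training', 'disallow')]
```

Because the sketch keeps every well-formed pair and skips anything it cannot read, a directive it has never seen would not cause a failure.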
Design Principles
1. Simplicity First
The llmtag.txt format is intentionally simple and human-readable.
As with robots.txt, anyone can create and understand an llmtag.txt file without technical expertise.
2. Backward Compatibility
The protocol is designed to be:
- Non-breaking: Websites without llmtag.txt continue to function normally
- Progressive: Publishers can adopt the standard incrementally
- Future-proof: New directives can be added without breaking existing implementations
3. Publisher-Centric
The standard prioritizes publisher control and choice:
- Opt-in granularity: Publishers choose exactly what to allow or disallow
- Flexible defaults: Support both permissive and restrictive default policies
- No enforcement burden: Publishers aren’t responsible for enforcing compliance
4. AI Agent Friendly
The protocol is designed to be easily implementable by AI companies:
- Clear syntax: Unambiguous parsing rules
- Standardized format: Consistent structure across all implementations
- Discovery mechanism: Automatic detection via standard HTTP requests
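The discovery step, combined with the rule that robots.txt governs access before llmtag.txt governs usage, might look like the sketch below. It uses Python's standard `urllib.robotparser`; the root-level `/llmtag.txt` location and the `policy_url_if_allowed` helper are assumptions for illustration, and the robots.txt body is passed in as a string so the example runs without network access.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def policy_url_if_allowed(site, robots_txt, agent):
    """Return the assumed llmtag.txt URL if robots.txt lets `agent` fetch it, else None."""
    policy_url = urljoin(site, "/llmtag.txt")   # assumed location at the site root
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())       # robots.txt text already fetched by the caller
    return policy_url if robots.can_fetch(agent, policy_url) else None


robots_txt = """\
User-agent: BlockedBot
Disallow: /

User-agent: *
Disallow: /private/
"""

print(policy_url_if_allowed("https://example.com", robots_txt, "ExampleBot"))
# https://example.com/llmtag.txt
print(policy_url_if_allowed("https://example.com", robots_txt, "BlockedBot"))
# None
```

An agent that gets `None` back has been denied access by robots.txt and never reaches the usage policy.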
The Vision: A New Social Contract
Today’s Web
Tomorrow’s Web with LLMTAG
Why This Matters
For Publishers
- Control: Take back control over how your content is used by AI systems
- Transparency: Make your AI usage policies explicit and discoverable
- Flexibility: Set different policies for different content types and AI agents
- Future-Proofing: Establish clear boundaries before AI usage becomes even more widespread
For AI Companies
- Legal Clarity: Clear, machine-readable policies reduce legal uncertainty
- Ethical Compliance: Respect publisher preferences and build trust with content creators
- Implementation Simplicity: Standardized format makes compliance straightforward to implement
- Industry Leadership: Be part of establishing ethical AI practices from the ground up
For the Web Ecosystem
- Sustainable AI: Create a sustainable relationship between AI and content creation
- Innovation Protection: Protect content creators while enabling AI innovation
- Global Standard: Establish a universal protocol that works across all platforms and languages
- Trust Building: Build trust between AI companies and content creators
The Path Forward
Phase 1: Early Adoption
- Publishers implement llmtag.txt on their websites
- AI companies begin reading and respecting the standard
- Community builds tools and integrations
Phase 2: Industry Standard
- Major platforms adopt the standard (WordPress, Drupal, etc.)
- AI companies make compliance a standard practice
- Legal frameworks begin recognizing the protocol
Phase 3: Universal Protocol
- llmtag.txt becomes as ubiquitous as robots.txt
- AI agents universally respect publisher preferences
- New web standards emerge based on explicit consent
Join the Movement
Be part of the solution
Help us establish llmtag as the universal standard for AI content policies. Your participation shapes the future of the web.
The LLMTAG protocol is open source and community-driven. We believe that the future of AI and content should be shaped by the people who create and consume it, not just the companies that build the technology.