Philosophy & Design Principles

The Problem We’re Solving

The internet is undergoing its most profound transformation since the dawn of the search engine. The open web has become the primary dataset for a new generation of artificial intelligence, yet this new relationship operates without a clear framework for consent or control.

The Current State

Today’s AI landscape operates on an implicit contract:

AI companies scrape content without explicit permission
Publishers have no standardized way to communicate their preferences
The robots.txt protocol only controls access, not usage
Content creators lack control over how their work is used for AI training

The robots.txt protocol answers: “Can a bot access this URL?” But it was never designed to answer the critical question: “Once accessed, what are you permitted to do with the content?”

Our Solution: Explicit Over Implicit

LLMTAG moves the web from an ambiguous, implicit contract to an explicit, transparent one.

Core Philosophy

Separation of Concerns

robots.txt handles Access Control
llmtag.txt handles Usage Control

AI agents must first be allowed to access a URL by robots.txt before they can read the usage policies in llmtag.txt.
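The ordering described above (access first, usage second) can be sketched with Python's standard robots.txt parser. This is only an illustration, not a reference implementation: the agent name ExampleBot, the example.com URL, and the helper name may_read_usage_policy are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def may_read_usage_policy(robots_txt: str, user_agent: str, policy_url: str) -> bool:
    """Return True if robots.txt allows this agent to fetch the llmtag.txt URL."""
    parser = RobotFileParser()
    # Parse the robots.txt text directly so the sketch needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, policy_url)


# Hypothetical policy URL; where llmtag.txt lives is an assumption here.
POLICY_URL = "https://example.com/llmtag.txt"

# robots.txt that only blocks /private/, so the policy file is reachable.
allowed = may_read_usage_policy(
    "User-agent: ExampleBot\nDisallow: /private/\n", "ExampleBot", POLICY_URL
)

# robots.txt that blocks everything, so the agent never reads the usage policy.
blocked = may_read_usage_policy(
    "User-agent: *\nDisallow: /\n", "ExampleBot", POLICY_URL
)
```

In this sketch an agent would only go on to read usage rules when the access check succeeds.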

Explicit over Implicit

If a rule is not defined, the default policy (allow) applies. This encourages participation without breaking functionality. Publishers can opt for a stricter default (disallow) if they choose.
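A minimal sketch of that fallback, assuming the parsed rules are held in a plain dictionary. The function name resolve and the default parameter are hypothetical; how a publisher actually declares a stricter default is defined by the specification, not by this example.

```python
def resolve(rules: dict[str, str], directive: str, default: str = "allow") -> str:
    """Return the explicit rule if one is defined, otherwise the default policy."""
    return rules.get(directive, default)


rules = {"ai_training_data": "disallow"}

explicit = resolve(rules, "ai_training_data")                  # rule is defined
fallback = resolve(rules, "ai_use")                            # no rule, permissive default
strict = resolve(rules, "ai_use", default="disallow")          # publisher chose a stricter default
```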

Granularity and Extensibility

The standard supports fine-grained rules that can be set per agent, per path, and even per content type.

Machine-Readable

Policies are expressed in a structured, parseable format that AI agents can automatically understand and implement.

Design Principles

1. Simplicity First

The llmtag.txt format is intentionally simple and human-readable:

spec_version: 3.0
ai_training_data: disallow
ai_use: search_indexing, generative_synthesis

As with robots.txt, anyone can create and read an llmtag.txt file without technical expertise.
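To show how little machinery the format needs, here is a rough parsing sketch. It assumes only what the example above shows: one "key: value" pair per line, with commas separating multiple values. The function name parse_llmtag is hypothetical, and the real specification may define more syntax than this handles.

```python
def parse_llmtag(text: str) -> dict[str, list[str]]:
    """Parse 'key: value' lines into a mapping of directive name to values."""
    policy: dict[str, list[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and anything that is not a key/value pair.
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        # Comma-separated values become a list; single values a one-item list.
        policy[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return policy


sample = """spec_version: 3.0
ai_training_data: disallow
ai_use: search_indexing, generative_synthesis
"""

policy = parse_llmtag(sample)
```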

2. Backward Compatibility

The protocol is designed to be:

Non-breaking: Websites without llmtag.txt continue to function normally
Progressive: Publishers can adopt the standard incrementally
Future-proof: New directives can be added without breaking existing implementations
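One way an implementation might stay future-proof is to keep the directives it understands and set aside the rest instead of failing. The sketch below assumes that behavior; the set of known directives is taken only from the example in this document, and hypothetical_future_directive is an invented name.

```python
# Directives seen in this document's example; an assumption, not the
# complete list from the specification.
KNOWN_DIRECTIVES = {"spec_version", "ai_training_data", "ai_use"}


def split_directives(parsed: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Separate directives this implementation understands from newer ones."""
    known = {k: v for k, v in parsed.items() if k in KNOWN_DIRECTIVES}
    unknown = {k: v for k, v in parsed.items() if k not in KNOWN_DIRECTIVES}
    return known, unknown


parsed = {
    "spec_version": "3.0",
    "ai_training_data": "disallow",
    "hypothetical_future_directive": "some_value",  # invented for illustration
}

# Unknown directives are reported separately rather than raising an error.
known, unknown = split_directives(parsed)
```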

3. Publisher-Centric

The standard prioritizes publisher control and choice:

Opt-in granularity: Publishers choose exactly what to allow or disallow
Flexible defaults: Support both permissive and restrictive default policies
No enforcement burden: Publishers aren’t responsible for enforcing compliance

4. AI Agent Friendly

The protocol is designed to be easily implementable by AI companies:

Clear syntax: Unambiguous parsing rules
Standardized format: Consistent structure across all implementations
Discovery mechanism: Automatic detection via standard HTTP requests
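Discovery could look like the following sketch, which derives the policy URL from any page URL. It assumes the file sits at the site root, by analogy with robots.txt; this page does not state the location, so treat the path and the helper name llmtag_url as assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def llmtag_url(page_url: str) -> str:
    """Build the assumed llmtag.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llmtag.txt", "", ""))


policy_url = llmtag_url("https://example.com/blog/post?id=1")
```

An agent would then request that URL with an ordinary HTTP GET before applying any usage rules.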

The Vision: A New Social Contract

Today’s Web

Publisher → [Implicit Permission] → AI Agent

Tomorrow’s Web with LLMTAG

Publisher → [Explicit llmtag.txt] → AI Agent → [Respects Policies]

Why This Matters

For Publishers

Control

Take back control over how your content is used by AI systems

Transparency

Make your AI usage policies explicit and discoverable

Flexibility

Set different policies for different content types and AI agents

Future-Proofing

Establish clear boundaries before AI usage becomes even more widespread

For AI Companies

Legal Clarity

Clear, machine-readable policies reduce legal uncertainty

Ethical Compliance

Respect publisher preferences and build trust with content creators

Implementation Simplicity

Standardized format makes compliance straightforward to implement

Industry Leadership

Be part of establishing ethical AI practices from the ground up

For the Web Ecosystem

Sustainable AI

Create a sustainable relationship between AI and content creation

Innovation Protection

Protect content creators while enabling AI innovation

Global Standard

Establish a universal protocol that works across all platforms and languages

Trust Building

Build trust between AI companies and content creators

The Path Forward

Phase 1: Early Adoption

Publishers implement llmtag.txt on their websites
AI companies begin reading and respecting the standard
Community builds tools and integrations

Phase 2: Industry Standard

Major platforms adopt the standard (WordPress, Drupal, etc.)
AI companies make compliance a standard practice
Legal frameworks begin recognizing the protocol

Phase 3: Universal Protocol

llmtag.txt becomes as ubiquitous as robots.txt
AI agents universally respect publisher preferences
New web standards emerge based on explicit consent

Join the Movement

Be part of the solution

Help us establish LLMTAG as the universal standard for AI content policies. Your participation shapes the future of the web.

The LLMTAG protocol is open source and community-driven. We believe that the future of AI and content should be shaped by the people who create and consume it, not just the companies that build the technology.
