Introduction
The LLMTAG protocol is a declarative standard that allows website publishers to communicate their content usage policies to AI agents in a machine-readable format. This document provides the complete technical specification for version 3.0.
File Format
File Name and Location
The policy file must be named llmtag.txt and placed in the root directory of the website, so that it is served at the path /llmtag.txt on the site's host.
This follows the same convention as robots.txt, making it familiar and discoverable for both humans and automated systems.
File Structure
The llmtag.txt file is a plain text file. It consists of directives, optional scope blocks, and comments, each described in the sections below.
Core Directives
Required Directives
spec_version
Required. Declares the specification version being used.
Values:
- 3.0 - Current LLMTAG Protocol version (required)
All llmtag.txt files must include this directive. Files without it are considered invalid.
Content Usage Directives
ai_training_data
Controls whether content can be used as training data for machine learning models.
Values:
- allow - Content may be used for AI training
- disallow - Content may not be used for AI training
Default: allow (if not specified)
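As a minimal sketch of how an agent might evaluate this directive, assume the file has already been parsed into a mapping of directive names to string values. The function name training_allowed and the mapping shape are hypothetical; the specification defines only the two values and the default.

```python
DEFAULT_AI_TRAINING_DATA = "allow"  # default stated in the specification


def training_allowed(directives: dict) -> bool:
    """Return True if content may be used as AI training data.

    `directives` is a hypothetical mapping of already-parsed directive
    names to their raw string values.
    """
    value = directives.get("ai_training_data", DEFAULT_AI_TRAINING_DATA).strip()
    if value not in ("allow", "disallow"):
        # The spec lists only these two values; anything else is rejected here.
        raise ValueError(f"invalid ai_training_data value: {value!r}")
    return value == "allow"
```

An empty mapping falls back to the default, so training_allowed({}) returns True.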
ai_use
Controls specific AI applications and use cases.
Values: Comma-separated list of allowed use cases:
- search_indexing - Traditional search engine indexing
- generative_synthesis - Generating answers, summaries, or new content
- commercial_products - Use within paid AI features or products
- research - Academic or non-commercial research purposes
- personal_assistance - Personal AI assistants and chatbots
Default: search_indexing (if not specified)
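The comma-separated value could be parsed along the following lines. This is a sketch, not a reference implementation: the function name parse_ai_use is hypothetical, and rejecting unknown values with an error is an assumption, since the specification does not say how unrecognized use cases should be treated.

```python
from typing import Optional

KNOWN_USES = frozenset({
    "search_indexing",
    "generative_synthesis",
    "commercial_products",
    "research",
    "personal_assistance",
})
DEFAULT_USES = frozenset({"search_indexing"})  # default stated in the specification


def parse_ai_use(raw: Optional[str]) -> frozenset:
    """Turn a raw ai_use value into a set of allowed use cases.

    `raw` is None when the directive is absent, in which case the
    default applies.
    """
    if raw is None:
        return DEFAULT_USES
    # Split on commas and ignore surrounding whitespace and empty items.
    uses = frozenset(item.strip() for item in raw.split(",") if item.strip())
    unknown = uses - KNOWN_USES
    if unknown:
        # Assumption: unknown use cases make the value invalid.
        raise ValueError(f"unknown ai_use values: {sorted(unknown)}")
    return uses
```

An agent could then test a specific purpose with a membership check, for example "research" in parse_ai_use(value).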
Scope Blocks
User-agent Block
Allows setting different policies for specific AI agents or crawlers.
User-agent names are matched case-insensitively. Apart from letter case, use the user-agent string exactly as the AI agent reports it.
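A case-insensitive comparison can be sketched as below. The function name agent_matches and the agent names in the usage note are hypothetical, and comparing the whole string (rather than a token within it) is an assumption drawn from the instruction to use the exact user-agent string.

```python
def agent_matches(block_agent: str, reported_agent: str) -> bool:
    """Compare a User-agent block name with the name an agent reports.

    casefold() gives a case-insensitive comparison; the whole string
    must match, which is an assumption about the intended semantics.
    """
    return block_agent.strip().casefold() == reported_agent.strip().casefold()
```

With this sketch, agent_matches("ExampleBot", "examplebot") holds, while agent_matches("ExampleBot", "OtherBot") does not.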
Path Block
Allows setting different policies for specific URL paths or patterns.
- Use forward slashes (/) as path separators
- Trailing slashes are optional
- Wildcards are not supported in v3.0
- Path matching is prefix-based
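The matching rules above can be sketched as a plain string-prefix test after trailing slashes are removed. The function names are hypothetical. Note one assumption: because the specification says only that matching is prefix-based, this sketch does not enforce segment boundaries, so a rule for /blog also matches /blogger.

```python
def normalize(path: str) -> str:
    """Drop trailing slashes so '/blog' and '/blog/' compare equal.

    The root path '/' becomes the empty string, which is a prefix of
    every path.
    """
    return path.rstrip("/")


def path_matches(rule_path: str, request_path: str) -> bool:
    """Return True if request_path falls under rule_path (prefix match).

    No wildcard handling is attempted, since wildcards are not
    supported in v3.0.
    """
    return normalize(request_path).startswith(normalize(rule_path))
```

An implementation that wants segment-aware matching would additionally check that the character following the prefix is a slash or the end of the path.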
Advanced Features
Verification Challenge
The verification_challenge directive establishes a cryptographic handshake to verify that an AI agent has actually read and understood the rules.
This is an advanced feature for publishers who want to implement verification mechanisms. Most implementations can ignore this directive.
Comments
Use # to add comments to your llmtag.txt file.
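A parser might strip comments before reading directives, roughly as follows. Several things here are assumptions rather than part of the specification: the name: value line layout (inferred from how the defaults are written), the treatment of # as starting a comment anywhere on a line, and the function name parse_directives. Scope blocks are deliberately not handled.

```python
def parse_directives(text: str) -> dict:
    """Parse flat 'name: value' lines, ignoring comments and blank lines."""
    directives = {}
    for raw_line in text.splitlines():
        # Assumption: '#' begins a comment that runs to the end of the line.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"malformed line: {raw_line!r}")
        directives[name.strip()] = value.strip()
    return directives
```

Whole-line comments and trailing comments are both discarded, so a line such as "ai_training_data: disallow  # no training" yields the value disallow.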
Processing Rules
Precedence
Directives are processed in the following order of precedence:
- Path-specific directives (highest priority)
- User-agent-specific directives
- Global directives (lowest priority)
Inheritance
When a directive is not specified at a more specific level, it inherits from the global level.
Default Values
If no directive is specified at any level, these defaults apply:
ai_training_data: allow
ai_use: search_indexing
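Precedence, inheritance, and defaults can be modelled together as a layered lookup. The sketch below uses collections.ChainMap, which searches its mappings in order; the function name effective_policy and the idea of passing each level as a separate dict are illustrative assumptions.

```python
from collections import ChainMap

# Defaults stated in the specification.
DEFAULTS = {"ai_training_data": "allow", "ai_use": "search_indexing"}


def effective_policy(global_rules, agent_rules=None, path_rules=None):
    """Layer the three levels so the most specific value wins.

    Lookup order: path-specific, then user-agent-specific, then global,
    then the built-in defaults.
    """
    return ChainMap(path_rules or {}, agent_rules or {}, global_rules, DEFAULTS)


policy = effective_policy(
    {"ai_training_data": "disallow"},           # global level
    agent_rules={"ai_use": "research"},         # matching User-agent block
    path_rules={"ai_training_data": "allow"},   # matching Path block
)
```

Here the path-level value overrides the global one for ai_training_data, while ai_use comes from the user-agent level because no path-level value exists.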
Discovery Mechanism
Automatic Discovery
AI agents should automatically check for llmtag.txt by making a GET request to the /llmtag.txt path at the root of the website.
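An agent that starts from an arbitrary page URL needs to derive the policy file location first. The sketch below does this with the standard library; the function name llmtag_url is hypothetical, and keeping the scheme and host of the page being processed is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def llmtag_url(page_url: str) -> str:
    """Return the root-level llmtag.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    # Keep scheme and host; replace path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llmtag.txt", "", ""))
```

The resulting URL is the one the agent would request with GET before processing any content from that host.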
HTTP Headers
The server should respond with appropriate headers.
Error Handling
- 404 Not Found: No llmtag.txt file exists - apply default policies
- 403 Forbidden: File exists but access is denied - apply default policies
- 500 Server Error: Server error - apply default policies
- Invalid Format: Malformed file - apply default policies
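Since every error case resolves to the default policies, the fetch step can be wrapped in a single fallback. The following is a sketch under stated assumptions: load_policy, its parse callback (expected to raise ValueError on a malformed file), the injectable opener, and the 10-second timeout are all hypothetical choices, not part of the specification.

```python
import urllib.error
import urllib.request

# Defaults stated in the specification.
DEFAULT_POLICY = {"ai_training_data": "allow", "ai_use": "search_indexing"}


def load_policy(url, parse, opener=urllib.request.urlopen):
    """Fetch and parse llmtag.txt, falling back to the default policies.

    `parse` is a caller-supplied function that turns the file text into
    a policy dict and raises ValueError if the file is malformed.
    """
    try:
        with opener(url, timeout=10) as response:
            text = response.read().decode("utf-8")
        return parse(text)
    except (OSError, ValueError):
        # OSError covers HTTP errors (404, 403, 500) and network failures,
        # since urllib.error.HTTPError derives from it. ValueError covers
        # malformed files and undecodable bytes.
        return dict(DEFAULT_POLICY)
```

Passing the opener as a parameter keeps the function testable without network access; in normal use the default urllib.request.urlopen applies.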
Compliance Requirements
For AI Agents
AI agents that claim compliance with the LLMTAG protocol must:
- Check for llmtag.txt before processing any content
- Parse the file correctly according to this specification
- Respect all applicable directives based on the agent’s identity and content path
- Handle errors gracefully by applying default policies when files are inaccessible
For Publishers
Publishers implementing llmtag.txt should:
- Include required directives (spec_version)
- Use valid syntax as defined in this specification
- Test their implementation to ensure the file is accessible
- Keep policies up to date as their preferences change
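Publishers could automate part of this checklist with a small lint step. The sketch below assumes the file's global directives are already available as a dict of names to string values; the function name lint_policy and the wording of the messages are hypothetical, and treating any spec_version other than 3.0 as a problem is an assumption of this sketch.

```python
VALID_TRAINING = {"allow", "disallow"}
VALID_USES = {
    "search_indexing",
    "generative_synthesis",
    "commercial_products",
    "research",
    "personal_assistance",
}


def lint_policy(directives: dict) -> list:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    if "spec_version" not in directives:
        problems.append("missing required directive: spec_version")
    elif directives["spec_version"] != "3.0":
        problems.append(f"unexpected spec_version: {directives['spec_version']!r}")
    training = directives.get("ai_training_data")
    if training is not None and training not in VALID_TRAINING:
        problems.append(f"invalid ai_training_data value: {training!r}")
    uses = directives.get("ai_use")
    if uses is not None:
        for use in (item.strip() for item in uses.split(",")):
            if use not in VALID_USES:
                problems.append(f"unknown ai_use value: {use!r}")
    return problems
```

Running such a check whenever the file changes, together with a request to confirm the file is reachable, covers the syntax and accessibility items above.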
Version History
Version 3.0 (Current)
- Added verification_challenge directive
- Improved path matching rules
- Enhanced error handling specifications
- Added comprehensive compliance requirements
Version 2.0
- Added ai_use directive with granular control
- Introduced path-based policies
- Added user-agent-specific rules
Version 1.0
- Initial specification
- Basic ai_training_data directive
- Global policy support only