AI Agent Blocking Overview
The LLMTAG plugin maintains a comprehensive database of 60+ known AI agents and crawlers, allowing you to selectively block or allow them based on your content protection needs.
60+ AI Agents Blocked
Proactive Protection • Selective Blocking • Real-time Updates • Custom Agent Management
How AI Agent Blocking Works
Blocking Mechanism
The plugin uses multiple layers of protection to block AI agents:
1
User-Agent Detection
Analyze incoming requests to identify AI agents by their user-agent strings.
2
Database Lookup
Check the AI agent database to determine if the agent should be blocked.
3
Policy Application
Apply your configured blocking rules and exceptions.
4
Request Blocking
Block the request before it reaches your content, returning a 403 Forbidden response.
5
Logging and Analytics
Log the blocked request for monitoring and analysis.
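The five steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not the plugin's actual code: the `Blocker` class, the sample `AGENT_DB` entries, and the substring matching are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sample of the agent database: lowercase token -> status.
AGENT_DB = {
    "gptbot": "blocked",
    "claude-web": "blocked",
    "perplexitybot": "allowed",
}


@dataclass
class Blocker:
    agent_db: dict
    log: list = field(default_factory=list)

    def detect(self, user_agent: str) -> Optional[str]:
        """Step 1: look for a known agent token inside the user-agent string."""
        ua = user_agent.lower()
        return next((token for token in self.agent_db if token in ua), None)

    def handle(self, user_agent: str) -> int:
        """Return the HTTP status code the request would receive."""
        agent = self.detect(user_agent)
        if agent is None:
            return 200  # not a known AI agent: serve content normally
        status = self.agent_db[agent]  # steps 2-3: lookup and policy
        if status == "blocked":
            # Steps 4-5: refuse the request and record it.
            self.log.append({"agent": agent, "user_agent": user_agent})
            return 403
        return 200
```

Calling `Blocker(AGENT_DB).handle("Mozilla/5.0 (compatible; GPTBot/1.0)")` returns 403 and appends one log entry, while an ordinary browser user-agent returns 200.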
Blocking Methods
- Server-Level Blocking
- Application-Level Blocking
- Hybrid Blocking
Server-Level Blocking
Method: .htaccess rules and server configuration
Advantages: Fast, efficient, works at the server level
Best for: Most websites, high-traffic sites
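To make the server-level method concrete, here is a hedged Python sketch that generates Apache `mod_rewrite` rules returning 403 for a list of user-agent tokens. The directives themselves are standard Apache syntax, but the function name is hypothetical and the rules the plugin actually writes may look different. The sketch assumes tokens contain no spaces.

```python
import re


def build_htaccess_rules(agent_tokens):
    """Return an Apache mod_rewrite block that answers 403 to matching user agents."""
    if not agent_tokens:
        return ""
    # Escape each token so it is matched literally inside the regex.
    pattern = "|".join(re.escape(token) for token in agent_tokens)
    return "\n".join([
        "<IfModule mod_rewrite.c>",
        "RewriteEngine On",
        # [NC] makes the user-agent comparison case-insensitive.
        f"RewriteCond %{{HTTP_USER_AGENT}} ({pattern}) [NC]",
        # "-" means no substitution; [F] sends 403 Forbidden, [L] stops processing.
        "RewriteRule .* - [F,L]",
        "</IfModule>",
    ])


if __name__ == "__main__":
    print(build_htaccess_rules(["GPTBot", "PerplexityBot"]))
```

Because the web server evaluates these rules before the application runs, blocked requests never reach WordPress, which is why this method suits high-traffic sites.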
AI Agent Database
Agent Categories
The plugin organizes AI agents into logical categories for easy management:
OpenAI Agents
GPTBot
Purpose: OpenAI’s web crawler for training GPT models
Default: Blocked
User-Agent:
GPTBot
ChatGPT-User
Purpose: ChatGPT browsing feature
Default: Blocked
User-Agent:
ChatGPT-User
OpenAI-Web
Purpose: General OpenAI web crawling
Default: Blocked
User-Agent:
OpenAI-Web
Google AI Agents
Google-Extended
Purpose: Google’s AI training crawler
Default: Blocked
User-Agent:
Google-Extended
Bard-Web
Purpose: Google Bard web crawling
Default: Blocked
User-Agent:
Bard-Web
Gemini-Crawler
Purpose: Google Gemini model training
Default: Blocked
User-Agent:
Gemini-Crawler
Anthropic Agents
Claude-Web
Purpose: Anthropic’s Claude web crawler
Default: Blocked
User-Agent:
Claude-Web
Anthropic-Bot
Purpose: General Anthropic web crawling
Default: Blocked
User-Agent:
Anthropic-Bot
Other AI Services
PerplexityBot
Purpose: Perplexity AI search engine
Default: Blocked
User-Agent:
PerplexityBot
Copilot-Web
Purpose: GitHub Copilot web crawling
Default: Blocked
User-Agent:
Copilot-Web
AI-Writing-Tools
Purpose: Various AI writing and content generation tools
Default: Blocked
User-Agent: Various
Agent Management
Category-Based Management
1
Select Categories
Use category checkboxes to block or allow entire groups of AI agents at once.
2
Individual Agent Control
Fine-tune by selecting or deselecting specific agents within categories.
3
Custom Agent Addition
Add custom user-agent strings for agents not in the database.
4
Whitelist Exceptions
Create exceptions for specific agents you want to allow.
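One way to picture how these four layers combine is the following Python sketch. The order of precedence (categories first, then per-agent overrides, then custom agents, with whitelist exceptions applied last) is an assumption for illustration, as are the function and variable names.

```python
# Hypothetical subset of the category listing above.
CATEGORIES = {
    "OpenAI Agents": ["GPTBot", "ChatGPT-User"],
    "Anthropic Agents": ["Claude-Web", "Anthropic-Bot"],
}


def resolve_blocked(categories, blocked_categories, agent_overrides,
                    custom_agents, whitelist):
    """Compute the final set of blocked agent names."""
    blocked = set()
    # 1. Category checkboxes block whole groups at once.
    for name in blocked_categories:
        blocked.update(categories.get(name, []))
    # 2. Individual control: True forces a block, False lifts it.
    for agent, is_blocked in agent_overrides.items():
        if is_blocked:
            blocked.add(agent)
        else:
            blocked.discard(agent)
    # 3. Custom agents that are not in the database.
    blocked.update(custom_agents)
    # 4. Whitelist exceptions win over everything else.
    return blocked - set(whitelist)
```

For example, blocking both categories, deselecting `ChatGPT-User`, adding a custom `MyScraper`, and whitelisting `Anthropic-Bot` leaves `GPTBot`, `Claude-Web`, and `MyScraper` blocked.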
Agent Status
Each AI agent has a configurable status:
Blocked
Status: Agent is blocked from accessing your content
Response: 403 Forbidden
Logging: Blocked requests are logged
Allowed
Status: Agent can access your content
Response: Normal content delivery
Logging: Access is logged for monitoring
Monitored
Status: Agent is allowed but closely monitored
Response: Normal content delivery with enhanced logging
Logging: Detailed access logs maintained
Configuration Options
Global Blocking Settings
Default Policy
Set the default behavior for new or unknown AI agents:
- Block by Default
- Allow by Default
- Monitor by Default
Block by Default
Policy: Block all AI agents unless specifically allowed
Use when: You want maximum protection
Security: Highest
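The interaction between a per-agent status and the default policy can be sketched as follows. This is a minimal Python illustration under assumed names (`Status`, `handle`, and the `"basic"`/`"detailed"` log levels are hypothetical): an agent with an explicit status uses it, and anything else falls back to the default.

```python
from enum import Enum


class Status(Enum):
    BLOCKED = "blocked"
    ALLOWED = "allowed"
    MONITORED = "monitored"


def handle(agent, configured, default_policy=Status.BLOCKED):
    """Return (http_status, log_level) for an identified AI agent."""
    # Agents without an explicit status fall back to the default policy.
    status = configured.get(agent, default_policy)
    if status is Status.BLOCKED:
        return 403, "basic"
    if status is Status.MONITORED:
        return 200, "detailed"  # allowed, but with enhanced logging
    return 200, "basic"
```

With "Block by Default", `handle("NewBot", {"GPTBot": Status.ALLOWED})` yields `(403, "basic")`, while `GPTBot` itself is served normally.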
Blocking Response
Configure what happens when an AI agent is blocked:
403 Forbidden
Response: Standard HTTP 403 error
Message: “Access Denied”
Use for: Most websites
Custom Response
Response: Custom error page or message
Message: Your custom content
Use for: Branded error pages
Redirect
Response: Redirect to another page
Message: Redirect to robots.txt or policy page
Use for: Educational purposes
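The three response options could be modeled roughly like this in Python. The mode names, the `Response` type, and the choice of status codes for the custom page (403) and the redirect (302) are assumptions for the sketch; the plugin may use different values.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


def build_block_response(mode, custom_body="", redirect_url=""):
    """Build the response sent to a blocked agent for the configured mode."""
    if mode == "forbidden":
        return Response(403, {"Content-Type": "text/plain"}, "Access Denied")
    if mode == "custom":
        # Assumed: a branded page is still served with a 403 status.
        return Response(403, {"Content-Type": "text/html"}, custom_body)
    if mode == "redirect":
        # Assumed: a temporary redirect, e.g. to a policy page.
        return Response(302, {"Location": redirect_url})
    raise ValueError(f"unknown blocking response mode: {mode!r}")
```

For instance, `build_block_response("redirect", redirect_url="/robots.txt")` produces a 302 response whose `Location` header points at `/robots.txt`.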
Advanced Blocking Rules
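As a rough illustration of the three rule types covered in this section (time windows, IP ranges, and rate limits), the Python sketch below implements one check for each. All names and thresholds are hypothetical, and the sample addresses come from reserved documentation ranges; it is not the plugin's configuration format.

```python
import ipaddress
from collections import defaultdict, deque
from datetime import time


def in_block_window(now, start, end):
    """True if `now` falls in [start, end); handles windows crossing midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def ip_is_blocked(ip, blocked_ranges):
    """True if `ip` belongs to any of the CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in blocked_ranges)


class RateLimiter:
    """Sliding-window counter: flags an agent once it exceeds the limit."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)

    def exceeded(self, agent, now):
        hits = self._hits[agent]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)
        return len(hits) > self.max_requests
```

For example, `in_block_window(time(23, 0), time(22, 0), time(6, 0))` is true for an overnight window, and a `RateLimiter(2, 60)` flags the third request from the same agent within 60 seconds.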
Time-Based Blocking
Block AI agents during specific time periods.
IP-Based Blocking
Block AI agents from specific IP addresses or ranges.
Rate-Limited Blocking
Block AI agents that exceed request rate limits.
Custom Agent Management
Adding Custom Agents
1
Identify the Agent
Use browser developer tools or server logs to identify the user-agent string.
2
Add to Database
Add the agent to your custom agent database with appropriate metadata.
3
Set Blocking Policy
Configure whether to block, allow, or monitor the new agent.
4
Test Configuration
Verify that the new agent is properly handled.
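The workflow above might look like this in code. The sketch is in Python with invented names (`CustomAgent`, `CustomAgentRegistry`) and an assumed set of metadata fields; it only shows the shape of adding an agent, assigning a policy, and testing the match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomAgent:
    name: str
    user_agent_token: str      # substring expected in the User-Agent header
    policy: str = "blocked"    # "blocked", "allowed", or "monitored"
    notes: str = ""


class CustomAgentRegistry:
    VALID_POLICIES = {"blocked", "allowed", "monitored"}

    def __init__(self):
        self._agents = {}

    def add(self, agent):
        if agent.policy not in self.VALID_POLICIES:
            raise ValueError(f"invalid policy: {agent.policy!r}")
        if not agent.user_agent_token.strip():
            raise ValueError("user-agent token must not be empty")
        self._agents[agent.user_agent_token.lower()] = agent

    def match(self, user_agent):
        """Return the custom agent matching this user-agent string, if any."""
        ua = user_agent.lower()
        for token, agent in self._agents.items():
            if token in ua:
                return agent
        return None
```

After `registry.add(CustomAgent("Example Scraper", "ExampleScraper"))`, calling `registry.match(...)` with a string copied from your server logs is a quick way to carry out the final test step.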
Custom Agent Configuration
Agent Metadata
Each agent in the database is stored with associated metadata.
Monitoring and Analytics
Real-Time Monitoring
Track AI agent activity in real time:
Live Dashboard
Shows: Current AI agent activity
Updates: Real-time
Purpose: Immediate threat assessment
Blocked Requests
Shows: Recently blocked AI agent requests
Updates: Real-time
Purpose: Security monitoring
Analytics and Reporting
Daily Reports
1
Agent Activity Summary
Overview of all AI agent activity for the day
2
Blocked Requests Count
Number of blocked requests by agent type
3
Top Blocked Agents
Most frequently blocked AI agents
4
Geographic Distribution
Geographic distribution of blocked requests
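A daily report of this kind boils down to a few aggregations over the blocked-request log. The Python sketch below assumes each log entry is a dictionary with hypothetical `agent` and `country` keys; the plugin's real log schema is not documented here.

```python
from collections import Counter


def daily_summary(blocked_log, top_n=3):
    """Aggregate one day's blocked requests into report sections."""
    by_agent = Counter(entry["agent"] for entry in blocked_log)
    by_country = Counter(entry.get("country", "unknown") for entry in blocked_log)
    return {
        "total_blocked": len(blocked_log),
        "blocked_by_agent": dict(by_agent),
        "top_blocked_agents": by_agent.most_common(top_n),
        "geographic_distribution": dict(by_country),
    }


if __name__ == "__main__":
    sample = [
        {"agent": "GPTBot", "country": "US"},
        {"agent": "GPTBot", "country": "DE"},
        {"agent": "Claude-Web", "country": "US"},
    ]
    print(daily_summary(sample, top_n=1))
```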
Weekly Reports
Trend Analysis
Content: AI agent activity trends over time
Purpose: Identify patterns and changes
Threat Assessment
Content: Security threat level assessment
Purpose: Evaluate protection effectiveness
Alert System
Configure alerts for important events.
Performance Optimization
Blocking Performance
The plugin is optimized for high-performance blocking:
Fast Lookups
Method: In-memory database caching
Speed: < 1ms per request
Memory: Optimized for efficiency
Minimal Overhead
Impact: < 5ms added to page load
CPU: < 2% additional usage
Memory: < 10MB additional usage
Caching Strategy
1
Agent Database Caching
Cache the AI agent database in memory for fast lookups
2
Blocking Rules Caching
Cache blocking rules to avoid repeated processing
3
Response Caching
Cache blocking responses for common scenarios
4
Analytics Caching
Cache analytics data for faster reporting
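The idea behind caching lookups can be shown with a small Python sketch that memoizes the result per user-agent string using `functools.lru_cache`. The database contents and function name are placeholders, and the plugin's own cache may work differently.

```python
from functools import lru_cache

# Hypothetical in-memory copy of the agent database.
AGENT_DB = {"gptbot": "blocked", "claude-web": "blocked"}


@lru_cache(maxsize=4096)
def lookup_status(user_agent):
    """Scan the database once per distinct user-agent string, then reuse the result."""
    ua = user_agent.lower()
    for token, status in AGENT_DB.items():
        if token in ua:
            return status
    return "allowed"


def reload_database(new_db):
    """Replace the database and drop cached decisions so they cannot go stale."""
    AGENT_DB.clear()
    AGENT_DB.update(new_db)
    lookup_status.cache_clear()
```

The important design point is invalidation: any change to the agent list or blocking rules has to clear the cache, otherwise old decisions keep being served.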
Troubleshooting
Common Issues
AI agents not being blocked
Possible causes:
- Agent not in database
- Blocking not enabled
- Server configuration issues
- Caching problems
Solutions:
- Add agent to custom database
- Verify blocking is enabled
- Check .htaccess rules
- Clear all caches
False positives (legitimate users blocked)
Possible causes:
- Overly broad user-agent matching
- Incorrect agent identification
- Browser extensions mimicking AI agents
Solutions:
- Refine user-agent matching rules
- Add exceptions for legitimate users
- Review blocked request logs
- Adjust blocking sensitivity
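To illustrate what refining an overly broad match can mean, the Python sketch below matches agent tokens only when they are not embedded in a longer alphanumeric word. The helper name is hypothetical and this is one possible tightening, not the plugin's matching algorithm.

```python
import re


def compile_agent_pattern(tokens):
    """Match tokens case-insensitively, but not inside longer words."""
    alternatives = "|".join(re.escape(token) for token in tokens)
    # The lookarounds reject matches that have a letter or digit on either side.
    return re.compile(
        rf"(?<![A-Za-z0-9])(?:{alternatives})(?![A-Za-z0-9])",
        re.IGNORECASE,
    )


if __name__ == "__main__":
    pattern = compile_agent_pattern(["GPTBot", "Claude-Web"])
    for ua in ("Mozilla/5.0 (compatible; GPTBot/1.2)", "MyGPTBotanist/1.0"):
        print(ua, "->", bool(pattern.search(ua)))
```

A plain substring check would flag the second user-agent because it contains `GPTBot`; the bounded pattern does not.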
Performance issues with blocking
Possible causes:
- Large agent database
- Inefficient blocking rules
- Server resource limitations
Solutions:
- Optimize agent database
- Simplify blocking rules
- Enable caching
- Upgrade server resources
Debugging Tools
Blocking Test Tool
Use the built-in test tool to verify blocking:
1
Access Test Tool
Go to LLMTAG > Tools > Blocking Test
2
Enter User-Agent
Enter the user-agent string you want to test
3
Run Test
Click Test Blocking to see if the agent would be blocked
4
Review Results
Check the test results and adjust configuration if needed
Log Analysis
Analyze blocking logs to identify issues.
Best Practices
Agent Management
Regular Updates
Keep the AI agent database updated with the latest agents and threats
Selective Blocking
Block only the agents that pose a real threat to your content
Monitor Effectiveness
Regularly review analytics to ensure blocking is working effectively
Test Changes
Test blocking changes in a staging environment before applying to production
Performance Optimization
Follow these tips to optimize blocking performance:
- Enable caching for the AI agent database
- Use efficient blocking rules to minimize processing overhead
- Monitor resource usage and adjust as needed
- Clean up old analytics data regularly
- Optimize server configuration for blocking operations
Security Considerations
Always follow security best practices when configuring AI agent blocking:
- Apply security updates to the plugin and WordPress regularly
- Monitor for new threats and update blocking rules accordingly
- Use strong authentication for admin access
- Backup configurations before making changes
- Test blocking rules to ensure they work as expected