Bot Detection Configuration
The [bot] section configures how PRISM identifies bot user agents. In bot-only mode, only requests from detected bots are rendered; all other requests are proxied directly to the origin.
TOML Example
[bot]
patterns = [
# Search engines
"Googlebot",
"Googlebot-Image",
"Googlebot-Video",
"Googlebot-News",
"Storebot-Google",
"Google-InspectionTool",
"GoogleOther",
"bingbot",
"Baiduspider",
"YandexBot",
"DuckDuckBot",
"DuckAssistBot",
"Slurp",
"Applebot",
"Applebot-Extended",
"PetalBot",
"Sogou",
"SeznamBot",
"Amazonbot",
"Bravebot",
# AI/LLM crawlers
"GPTBot",
"ChatGPT-User",
"OAI-SearchBot",
"ClaudeBot",
"Claude-User",
"Claude-SearchBot",
"anthropic-ai",
"PerplexityBot",
"Bytespider",
"meta-externalagent",
"Meta-ExternalFetcher",
"FacebookBot",
"CCBot",
"DeepSeekBot",
"cohere-ai",
"Diffbot",
"YouBot",
"PhindBot",
"FirecrawlAgent",
"Timpibot",
"ImagesiftBot",
# Social / link preview
"facebookexternalhit",
"Twitterbot",
"LinkedInBot",
"Pinterestbot",
"Discordbot",
"WhatsApp",
"TelegramBot",
"Slackbot",
"redditbot",
"Snap URL Preview",
"Bluesky",
"Mastodon",
"Viber",
"kakaotalk-scrap",
"Iframely",
"FlipboardProxy",
# SEO tools
"AhrefsBot",
"SemrushBot",
"MJ12bot",
"DotBot",
"DataForSeoBot",
"ContentKingApp",
"Screaming Frog",
"Embedly",
"Quora Link Preview",
# Archive
"ia_archiver",
]
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| patterns | Array of Strings | (60+ patterns, see above) | User-agent substrings that identify bot traffic |
Detection Strategy
PRISM uses a two-layer bot detection approach:
- Primary: `isbot` crate -- PRISM first checks the `User-Agent` header against the `isbot` library, which maintains a comprehensive, regularly updated database of known bot signatures.
- Fallback: `patterns` list -- If `isbot` does not match, PRISM checks whether the `User-Agent` contains any of the configured pattern strings as a substring. Matching is case-insensitive.
This dual approach ensures reliable detection even for new or niche crawlers not yet in the isbot database.
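The two layers above can be sketched as follows. This is an illustrative Python sketch, not PRISM's actual Rust implementation; `isbot_matches` is a hypothetical stand-in for the `isbot` crate's signature database, and the handful of signatures inside it are placeholders.

```python
def isbot_matches(user_agent: str) -> bool:
    # Placeholder for the isbot library's signature database lookup.
    # The real database is far larger and regularly updated.
    known_signatures = ["googlebot", "bingbot", "crawler", "spider"]
    ua = user_agent.lower()
    return any(sig in ua for sig in known_signatures)

def detect_bot(user_agent: str, patterns: list[str]) -> bool:
    # Layer 1: consult the isbot-style signature database first.
    if isbot_matches(user_agent):
        return True
    # Layer 2: fall back to case-insensitive substring matching
    # against the configured patterns list.
    ua = user_agent.lower()
    return any(p.lower() in ua for p in patterns)
```

The fallback layer is what lets a custom entry in `patterns` catch a niche crawler the signature database has not yet indexed.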
Detailed Explanation
Pattern matching
Each entry in the patterns array is treated as a case-insensitive substring match against the full User-Agent header. For example, the pattern "Googlebot" matches:
- `Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)`
- `Googlebot-Image/1.0`
- `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1)`
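The matching rule reduces to a one-line check; the following sketch (hypothetical helper name, not a PRISM API) runs it against the three example user agents above:

```python
def pattern_matches(pattern: str, user_agent: str) -> bool:
    # A pattern matches if it appears anywhere in the
    # User-Agent header, ignoring case.
    return pattern.lower() in user_agent.lower()

agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Googlebot-Image/1.0",
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1)",
]
# All three contain "Googlebot" as a substring, so one pattern covers them.
assert all(pattern_matches("Googlebot", ua) for ua in agents)
```

Because matching is substring-based, a broad pattern like `"Googlebot"` already covers more specific agents such as `Googlebot-Image`; the narrower default entries mainly serve readability and analytics.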
Default pattern categories
The default list covers four major categories:
- Search engines (20 patterns): Google, Bing, Baidu, Yandex, DuckDuckGo, Apple, and regional search engines
- AI/LLM crawlers (21 patterns): GPTBot, ClaudeBot, PerplexityBot, DeepSeekBot, and other AI training/search crawlers
- Social/link preview (16 patterns): Facebook, Twitter/X, LinkedIn, Discord, WhatsApp, Telegram, Slack, Reddit, Bluesky, Mastodon
- SEO tools and archives (10 patterns): Ahrefs, Semrush, Screaming Frog, Internet Archive
Overriding the default list
Setting patterns in your config replaces the entire default list. If you only want to add a custom bot, you must include the defaults plus your additions:
[bot]
patterns = [
"Googlebot",
"bingbot",
# ... include defaults you need ...
"MyCustomCrawler",
]
Example Use Cases
Minimal bot list for testing
[bot]
patterns = ["Googlebot", "bingbot"]
Adding a custom internal crawler
[bot]
patterns = [
# Keep all defaults plus your custom bot
"Googlebot",
"bingbot",
"Baiduspider",
"YandexBot",
# ... other defaults ...
"InternalMonitorBot",
"MyCompanyCrawler",
]
Rendering for all traffic (no bot detection needed)
If you use render-all mode, the bot patterns list is not consulted for routing decisions, but it is still used for analytics and the X-Prism-Bot response header.
[server]
mode = "render-all"
# Bot patterns are still used for tagging, not routing
[bot]
patterns = ["Googlebot", "bingbot"]
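The routing-versus-tagging distinction can be sketched like this. This is a simplified model under stated assumptions, not PRISM's internals: `handle_request` is a hypothetical helper, and the header value format for `X-Prism-Bot` is assumed.

```python
def handle_request(user_agent: str, patterns: list[str], mode: str) -> dict:
    # Bot detection runs in every mode (shown here as the substring
    # fallback only, for brevity).
    is_bot = any(p.lower() in user_agent.lower() for p in patterns)
    # In render-all mode every request is rendered; in bot-only mode
    # the detection result drives the routing decision.
    rendered = True if mode == "render-all" else is_bot
    # The detection result still feeds tagging either way.
    return {"rendered": rendered, "X-Prism-Bot": "true" if is_bot else "false"}
```

In render-all mode a non-bot request is still rendered, but its `X-Prism-Bot` tag stays `"false"`, which is what keeps bot/human analytics meaningful.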