A living reference table of every known AI bot user-agent string on the web. Search, sort, and filter 80 crawlers across training, retrieval, search, and agentic categories - with the exact robots.txt directive to block or allow each one. Pairs perfectly with the Robots.txt AI Bot Checker.
Every entry includes the exact user-agent token, parent company, purpose, robots.txt directive, and the date it was first spotted. Last updated April 2026.
| Bot | Company | User-Agent | Purpose | Respects robots.txt | First Spotted |
|---|---|---|---|---|---|
| GPTBot | OpenAI | GPTBot | Training | Yes | 2023-08 |
| OAI-SearchBot | OpenAI | OAI-SearchBot | AI Search | Yes | 2024-05 |
| ChatGPT-User | OpenAI | ChatGPT-User | User-Triggered | Partial | 2023-06 |
| ChatGPT Atlas | OpenAI | (Chrome UA) | Agentic Browser | No | 2024-10 |
| Operator (Agent Mode) | OpenAI | (Chrome UA) | Agentic Browser | No | 2025-01 |
| ClaudeBot | Anthropic | ClaudeBot | Training | Yes | 2024-04 |
| Claude-SearchBot | Anthropic | Claude-SearchBot | AI Search | Yes | 2026-02 |
| Claude-User | Anthropic | Claude-User | User-Triggered | Yes | 2024-08 |
| anthropic-ai | Anthropic | anthropic-ai | Training (Legacy) | Yes | 2023-06 |
| claude-web | Anthropic | claude-web | Browsing (Legacy) | Yes | 2023-07 |
| Google-Extended | Google | Google-Extended | Training Control | Yes | 2023-09 |
| Gemini-AI | Google | Gemini | AI Search | Yes | 2024-02 |
| Google-CloudVertexBot | Google | Google-CloudVertexBot | AI Platform | Yes | 2024-04 |
| GoogleAgent-Mariner | Google | GoogleAgent-Mariner | Agentic Browser | Yes | 2025-05 |
| Google-NotebookLM | Google | Google-NotebookLM | User-Triggered | Yes | 2024-06 |
| Gemini-Deep-Research | Google | Gemini-Deep-Research | AI Search | Yes | 2025-03 |
| GoogleAgent-URLContext | Google | GoogleAgent-URLContext | User-Triggered | Yes | 2025-04 |
| PerplexityBot | Perplexity | PerplexityBot | AI Search | Disputed | 2022-12 |
| Perplexity-User | Perplexity | Perplexity-User | User-Triggered | No | 2024-06 |
| Bingbot | Microsoft | Bingbot | Search + AI | Yes | 2010-10 |
| Applebot | Apple | Applebot | Search + AI | Yes | 2015-05 |
| Applebot-Extended | Apple | Applebot-Extended | Training Control | Yes | 2024-06 |
| FacebookBot | Meta | FacebookBot | Social + AI | Yes | 2019-04 |
| meta-externalagent | Meta | meta-externalagent | AI Training | Yes | 2024-07 |
| Meta-ExternalFetcher | Meta | Meta-ExternalFetcher | AI Fetcher | Yes | 2024-09 |
| Bytespider | ByteDance | Bytespider | Training + AI | Disputed | 2021-07 |
| TikTokSpider | ByteDance | TikTokSpider | Social + AI | Unclear | 2023-11 |
| GrokBot | xAI | GrokBot | AI Search / Training | Unclear | 2024-02 |
| xAI-Grok | xAI | xAI-Grok | AI Search | Unclear | 2024-03 |
| Grok-DeepSearch | xAI | Grok-DeepSearch | AI Search | Unclear | 2024-11 |
| xAI-Bot | xAI | xAI-Bot | Training | Unclear | 2024-02 |
| DeepSeekBot | DeepSeek | DeepSeekBot | Training | Unclear | 2024-01 |
| MistralAI-User | Mistral | MistralAI-User | User-Triggered | Yes | 2024-10 |
| MistralBot | Mistral | MistralBot | Training | Yes | 2024-05 |
| Amazonbot | Amazon | Amazonbot | AI Assistant | Yes | 2018-11 |
| bedrockbot | Amazon | bedrockbot | AI Platform | Yes | 2024-07 |
| CCBot | Common Crawl | CCBot | Open Data / Training | Yes | 2008-01 |
| cohere-ai | Cohere | cohere-ai | Training | Yes | 2023-05 |
| DuckAssistBot | DuckDuckGo | DuckAssistBot | AI Search | Yes | 2023-03 |
| Bravebot | Brave | Bravebot | AI Search | Yes | 2021-06 |
| YouBot | You.com | YouBot | AI Search | Yes | 2022-09 |
| Diffbot | Diffbot | Diffbot | Data Extraction | Yes | 2015-03 |
| LinkedInBot | LinkedIn | LinkedInBot | Social + AI | Yes | 2015-01 |
| AI2Bot | Allen Institute | AI2Bot | Research / Training | Yes | 2024-07 |
| AI2Bot-Dolma | Allen Institute | AI2Bot-Dolma | Training Data | Yes | 2024-07 |
| HuggingFaceBot | Hugging Face | HuggingFaceBot | Training | Yes | 2024-08 |
| PetalBot | Huawei | PetalBot | Search + AI | Yes | 2020-02 |
| ChatGLM-Spider | Zhipu AI | ChatGLM-Spider | Training | Unclear | 2023-10 |
| Baidu-Spider-AI | Baidu | Baidu-Spider-AI | Training + AI | Yes | 2023-03 |
| TencentBot | Tencent | TencentBot | AI Training | Unclear | 2023-06 |
| 360Spider | Qihoo 360 | 360Spider | Search + AI | Unclear | 2013-04 |
| Sogou | Sogou | Sogou | Search + AI | Yes | 2010-06 |
| WRTNBot | WRTN | WRTNBot | AI Search | Unclear | 2024-05 |
| SBIntuitionsBot | SB Intuitions | SBIntuitionsBot | AI Training | Unclear | 2024-04 |
| Cloudflare-AI-Search | Cloudflare | Cloudflare-AI-Search | AI Search | Yes | 2024-09 |
| Cloudflare-AutoRAG | Cloudflare | Cloudflare-AutoRAG | AI RAG | Yes | 2025-02 |
| PhindBot | Phind | PhindBot | AI Search | Yes | 2023-02 |
| ExaBot | Exa | ExaBot | AI Search | Yes | 2023-08 |
| TavilyBot | Tavily | TavilyBot | AI Search API | Yes | 2024-01 |
| iaskspider | iAsk | iaskspider | AI Search | Yes | 2023-09 |
| AndiBot | Andi | AndiBot | AI Search | Yes | 2022-04 |
| kagi-fetcher | Kagi | kagi-fetcher | AI Search | Yes | 2023-10 |
| LinerBot | Liner | LinerBot | AI Search | Unclear | 2024-03 |
| Anomura | Anomura | Anomura | AI Search | Unclear | 2024-08 |
| Timpibot | Timpi | Timpibot | Decentralized Search | Yes | 2023-07 |
| Devin | Cognition | Devin | AI Agent | Unclear | 2024-03 |
| FirecrawlAgent | Firecrawl | FirecrawlAgent | Scraping API | Partial | 2024-04 |
| Crawl4AI | Crawl4AI | Crawl4AI | Scraping Tool | Partial | 2024-06 |
| ApifyBot | Apify | ApifyBot | Scraping Platform | Partial | 2017-03 |
| omgili | Omgili | omgili | Forum Indexing | Yes | 2008-04 |
| webzio-extended | Webz.io | webzio-extended | Data Broker | Yes | 2023-11 |
| ImagesiftBot | The Hive | ImagesiftBot | Image AI | Unclear | 2023-05 |
| Kangaroo Bot | Kangaroo | Kangaroo Bot | AI Search | Unclear | 2024-02 |
| Brightbot | Bright Data | Brightbot | Data Collection | Unclear | 2023-12 |
| SemrushBot-OCOB | Semrush | SemrushBot-OCOB | SEO + AI | Yes | 2024-05 |
| DataForSeoBot | DataForSEO | DataForSeoBot | SEO Data + AI | Yes | 2021-08 |
| TurnitinBot | Turnitin | TurnitinBot | AI Detection | Yes | 2002-09 |
| PanguBot | Huawei | PanguBot | Training | Unclear | 2023-08 |
| Sentibot | Sentibot | Sentibot | Sentiment Analysis | Unclear | 2023-04 |
| VelenPublicWebCrawler | Velen | VelenPublicWebCrawler | Data Collection | Unclear | 2023-09 |
Run your robots.txt through our free AI Bot Checker - it maps every rule against this database and grades your configuration.
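For readers who want a rough local approximation of that kind of check, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not how the AI Bot Checker itself works (the source does not describe its internals), and it does no grading. The `ROBOTS_TXT` sample and the `audit` helper are hypothetical; the tokens are copied from the User-Agent column of the table.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, used only for illustration.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""

# A few tokens copied from the User-Agent column of the table.
TOKENS = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "PerplexityBot"]


def audit(robots_txt: str, tokens: list[str], path: str = "/") -> dict[str, bool]:
    """Return, for each token, whether the rules allow fetching `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, path) for token in tokens}


result = audit(ROBOTS_TXT, TOKENS)
for token, allowed in result.items():
    print(f"{token}: {'allowed' if allowed else 'blocked'}")
```

Keep in mind that this only reports what the rules say. Whether a given crawler honors them is a separate question, which is what the compliance column in the table tracks.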
The AI crawler landscape changes every month. We track new bots, deprecate old ones, and keep the user-agent strings accurate so you don't have to piece them together from scattered blog posts.
80 AI crawlers across OpenAI, Anthropic, Google, Perplexity, Meta, ByteDance, xAI, Baidu, Huawei, and dozens more - all in one searchable table.
Every entry is clearly labeled. Training crawlers feed future models. Retrieval crawlers drive citations and referral traffic. The distinction drives very different decisions.
Expand any row to see the exact User-agent / Disallow directive for that bot. Copy it straight into your robots.txt with one click.
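The per-row directive text is not reproduced on this page, so the following is only a sketch of what a standard User-agent / Disallow group looks like when generated from table rows. The `directive` and `build_robots_txt` helpers and the `BLOCKED_PURPOSES` policy are hypothetical. The policy (block training crawlers, allow AI search crawlers) is one illustrative choice, not a recommendation.

```python
def directive(token: str, allow: bool) -> str:
    """Build one robots.txt group that allows or blocks the whole site."""
    rule = "Allow: /" if allow else "Disallow: /"
    return f"User-agent: {token}\n{rule}\n"


# (user-agent token, purpose) pairs copied from the table.
BOTS = [
    ("GPTBot", "Training"),
    ("OAI-SearchBot", "AI Search"),
    ("ClaudeBot", "Training"),
    ("Claude-SearchBot", "AI Search"),
]

# Assumed policy for this example: block anything labeled "Training".
BLOCKED_PURPOSES = {"Training"}


def build_robots_txt(bots, blocked_purposes) -> str:
    """Join one group per bot, separated by blank lines."""
    groups = [
        directive(token, purpose not in blocked_purposes)
        for token, purpose in bots
    ]
    return "\n".join(groups)


robots_txt = build_robots_txt(BOTS, BLOCKED_PURPOSES)
print(robots_txt)
```

Keeping one group per token makes it easy to flip a single bot later without touching the others.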
Full-text search across all columns. Sort by company, purpose, or first-spotted date. Filter by purpose (training, retrieval, search, agentic) or robots.txt compliance.
Every bot is tagged by whether it respects robots.txt - yes, partial, disputed, unclear, or no. Stealth crawlers and residential-IP scrapers are called out explicitly.
Every entry includes the month we first documented the user-agent in the wild, so you can tell brand-new crawlers from established ones at a glance.
New bots appear, existing bots change user-agent strings, companies deprecate old crawlers. We refresh the database monthly so your reference stays current.