AI Bot Directory
Every AI crawler, identified. Know what's visiting your site, who operates it, and whether it respects your rules.
GPTBot
OpenAI
AI model training data collection
ChatGPT-User
OpenAI
Real-time web browsing for ChatGPT users
OAI-SearchBot
OpenAI
AI-powered search indexing
ClaudeBot
Anthropic
AI model training data collection
PerplexityBot
Perplexity AI
AI search engine indexing and real-time retrieval
Google-Extended
Google
AI model training (Gemini, formerly Bard)
Gemini
Google
Real-time web browsing for Gemini users
Bytespider
ByteDance
AI model training and content indexing
CCBot
Common Crawl
Open web dataset for AI training and research
Amazonbot
Amazon
AI assistant answers and product indexing
Applebot-Extended
Apple
AI model training for Apple Intelligence
FacebookBot
Meta
Link previews and AI model training
Meta-ExternalAgent
Meta
AI model training for Llama and Meta AI
cohere-ai
Cohere
Enterprise AI model training
YouBot
You.com
AI search engine indexing
Diffbot
Diffbot
Structured data extraction and knowledge graph building
PetalBot
Huawei (Petal Search)
Search engine indexing for Petal Search
Barkrowler
Babbar
SEO analysis and backlink mapping
Timpibot
Timpi
Decentralized search indexing
Seekr
Seekr Technologies
AI-scored search and content evaluation
Kangaroo Bot
Kangaroo LTD
Content indexing and data analytics
Velenpublicwebcrawler
Velen
Web indexing and AI training
Omgili
Omgili (Webz.io)
Forum and discussion content crawling for AI datasets
ICC-Crawler
NICT (Japan)
Multilingual AI research and translation training
Brightbot
Bright Data
Large-scale data collection for AI and analytics
Scrapy
Open Source (Zyte)
General-purpose web scraping framework
xAI-Bot
xAI (Elon Musk)
AI model training for Grok; real-time knowledge and news indexing
DuckAssistBot
DuckDuckGo
Real-time page fetching for DuckAssist AI summaries
Bingbot
Microsoft
Search indexing for Bing + real-time grounding for Microsoft Copilot
Ai2Bot
Allen Institute for AI (Ai2)
Open research dataset collection for AI model training
MistralBot
Mistral AI
AI model training data collection
Googlebot
Google
Search indexing and Google AI Overviews data sourcing
Applebot
Apple
Siri answers, Spotlight search, and Safari web suggestions
AhrefsBot
Ahrefs
Backlink analysis and SEO intelligence platform data collection
SemrushBot
Semrush
SEO competitive intelligence and site audit data collection
YandexBot
Yandex
Yandex search indexing and YaGPT AI training
Baiduspider
Baidu
Baidu search indexing and ERNIE Bot AI training
BraveBot
Brave
Independent search indexing for Brave Search
NaverBot
Naver
Naver search indexing and HyperCLOVA X AI training
HuggingFaceBot
Hugging Face
AI dataset collection for Hugging Face Hub
DuckDuckBot
DuckDuckGo
Search indexing and DuckAssist AI answer support
LinkedInBot
LinkedIn (Microsoft)
Link preview metadata and LinkedIn AI features
archive.org_bot
Internet Archive
Web preservation and Wayback Machine archiving
TurnitinBot
Turnitin
Academic plagiarism detection and AI content indexing
iaskspider
iAsk.ai
AI-powered search engine indexing
Sogou
Tencent
Sogou search indexing and Tencent Hunyuan AI training
DataForSeoBot
DataForSEO
SEO data collection and AI-powered marketing research
MJ12bot
Majestic
Link intelligence database and AI SEO tool support
rogerbot
Moz
SEO metrics and AI content tool support
DeepSeekBot
DeepSeek
AI model training data collection
Firecrawl
Mendable / Firecrawl
Web scraping for AI agent applications
JinaBot
Jina AI
LLM-ready content extraction and AI embedding
ExaBot
Exa AI
Neural search indexing for AI applications
MojeekBot
Mojeek
Privacy-focused independent search index
ApifyBot
Apify
Web scraping and AI training data collection
QwenBot
Alibaba Cloud
AI model training data collection
YandexGPT
Yandex
AI search summaries and LLM training
img2dataset
Hugging Face / Community
Bulk image dataset collection for AI training
NewsPlease
Community / Various
News and media dataset collection for NLP/LLM training
Slurp
Yahoo / Verizon Media
Web indexing for Yahoo Search and partner properties
Twitterbot
X (formerly Twitter)
Link card previews, Open Graph metadata fetching, and AI training data collection for Grok
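The directory above can be put to work in two ways: recognizing a bot from the User-Agent header of an incoming request, and writing robots.txt rules for the bots you want to keep out. The following is a minimal Python sketch of both, not a definitive implementation. It assumes that a bot announces itself with the token listed in this directory somewhere in its User-Agent string, and it uses only a small subset of the entries. The function names `identify_bot` and `robots_txt_block` are hypothetical, chosen for illustration. User-Agent headers can be spoofed, so a match is a hint rather than proof of who is crawling.

```python
# Subset of the directory above: user-agent token -> operator.
# Extend this mapping with any other entries you care about.
AI_BOT_OPERATORS = {
    "GPTBot": "OpenAI",
    "ChatGPT-User": "OpenAI",
    "OAI-SearchBot": "OpenAI",
    "ClaudeBot": "Anthropic",
    "PerplexityBot": "Perplexity AI",
    "Bytespider": "ByteDance",
    "CCBot": "Common Crawl",
    "Amazonbot": "Amazon",
}


def identify_bot(user_agent):
    """Return (token, operator) for the first directory token found in the
    User-Agent header, or None if no token matches.

    Assumption: a case-insensitive substring test is enough to spot the token.
    """
    ua = user_agent.lower()
    for token, operator in AI_BOT_OPERATORS.items():
        if token.lower() in ua:
            return token, operator
    return None


def robots_txt_block(tokens):
    """Build robots.txt groups that disallow the whole site for each token."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Illustrative header value, not copied from any operator's documentation.
    print(identify_bot("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(robots_txt_block(["GPTBot", "CCBot"]))
```

Whether a given crawler honors the generated rules is up to its operator; robots.txt is a request, not an access control.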