Free tool · No account required

robots.txt Generator for AI Bots

Choose which AI crawlers can access your site. Get a copy-paste robots.txt snippet in seconds. Covers 49+ AI bots including GPTBot, ClaudeBot, PerplexityBot, and Bingbot.


GPTBot · OpenAI · AI Training
AI model training data collection

ChatGPT-User · OpenAI · AI Assistant
Real-time web browsing for ChatGPT users

OAI-SearchBot · OpenAI · AI Search
AI-powered search indexing

ClaudeBot · Anthropic · AI Training
AI model training data collection

PerplexityBot · Perplexity AI · AI Search
AI search engine indexing and real-time retrieval

Google-Extended · Google · AI Training
AI model training (Gemini, Bard)

Gemini · Google · AI Assistant
Real-time web browsing for Gemini users

Bytespider · ByteDance · AI Training · ignores robots.txt
AI model training and content indexing

CCBot · Common Crawl · AI Training
Open web dataset for AI training and research

Amazonbot · Amazon · AI Shopping
AI assistant answers and product indexing

AppleBot-Extended · Apple · AI Training
AI model training for Apple Intelligence

FacebookBot · Meta · Social
Link previews and AI model training

Meta-ExternalAgent · Meta · AI Training
AI model training for Llama and Meta AI

cohere-ai · Cohere · AI Training
Enterprise AI model training

YouBot · You.com · AI Search
AI search engine indexing

Diffbot · Diffbot · Other
Structured data extraction and knowledge graph building

PetalBot · Huawei (Petal Search) · SEO
Search engine indexing for Petal Search

Barkrowler · Babbar · SEO
SEO analysis and backlink mapping

Timpibot · Timpi · AI Search
Decentralized search indexing

Seekr · Seekr Technologies · AI Search
AI-scored search and content evaluation

Kangaroo Bot · Kangaroo LTD · Other
Content indexing and data analytics

Velenpublicwebcrawler · Velen · AI Training
Web indexing and AI training

Omgili · Omgili (Webz.io) · AI Training
Forum and discussion content crawling for AI datasets

ICC-Crawler · NICT (Japan) · AI Training
Multilingual AI research and translation training

Brightbot · Bright Data · Other
Large-scale data collection for AI and analytics

Scrapy · Open Source (Zyte) · Other · ignores robots.txt
General-purpose web scraping framework

xAI-Bot · xAI (Elon Musk) · AI Training
AI model training for Grok; real-time knowledge and news indexing

DuckAssistBot · DuckDuckGo · AI Search
Real-time page fetching for DuckAssist AI summaries

Bingbot · Microsoft · AI Search
Search indexing for Bing + real-time grounding for Microsoft Copilot

Ai2Bot · Allen Institute for AI (Ai2) · AI Training
Open research dataset collection for AI model training

MistralBot · Mistral AI · AI Training
AI model training data collection

Googlebot · Google · AI Search
Search indexing and Google AI Overviews data sourcing

Applebot · Apple · AI Assistant
Siri answers, Spotlight search, and Safari web suggestions

AhrefsBot · Ahrefs · SEO
Backlink analysis and SEO intelligence platform data collection

SemrushBot · Semrush · SEO
SEO competitive intelligence and site audit data collection

YandexBot · Yandex · AI Search
Yandex search indexing and YaGPT AI training

Baiduspider · Baidu · AI Search
Baidu search indexing and ERNIE Bot AI training

BraveBot · Brave · AI Search
Independent search indexing for Brave Search

NaverBot · Naver · AI Search
Naver search indexing and HyperCLOVA X AI training

HuggingFaceBot · Hugging Face · AI Training
AI dataset collection for Hugging Face Hub

DuckDuckBot · DuckDuckGo · AI Search
Search indexing and DuckAssist AI answer support

LinkedInBot · LinkedIn (Microsoft) · Social
Link preview metadata and LinkedIn AI features

archive.org_bot · Internet Archive · Other
Web preservation and Wayback Machine archiving

TurnitinBot · Turnitin · Other
Academic plagiarism detection and AI content indexing

iaskspider · iAsk.ai · AI Search
AI-powered search engine indexing

Sogou · Tencent · AI Search
Sogou search indexing and Tencent Hunyuan AI training

DataForSeoBot · DataForSEO · SEO
SEO data collection and AI-powered marketing research

MJ12bot · Majestic · SEO
Link intelligence database and AI SEO tool support

rogerbot · Moz · SEO
SEO metrics and AI content tool support

DeepSeekBot · DeepSeek · AI Training
AI model training data collection

Firecrawl · Mendable / Firecrawl · AI Assistant
Web scraping for AI agent applications

JinaBot · Jina AI · AI Assistant
LLM-ready content extraction and AI embedding

ExaBot · Exa AI · AI Search
Neural search indexing for AI applications

MojeekBot · Mojeek · Other
Privacy-focused independent search index

ApifyBot · Apify · AI Training
Web scraping and AI training data collection

QwenBot · Alibaba Cloud · AI Training
AI model training data collection

YandexGPT · Yandex · AI Search
AI search summaries and LLM training

img2dataset · HuggingFace / Community · AI Training · ignores robots.txt
Bulk image dataset collection for AI training

NewsPlease · Community / Various · AI Training
News and media dataset collection for NLP/LLM training

Slurp · Yahoo / Verizon Media · SEO
Web indexing for Yahoo Search and partner properties

Twitterbot · X (formerly Twitter) · AI Training · ignores robots.txt
Link card previews, Open Graph metadata fetching, and AI training data collection for Grok

Your robots.txt snippet

Configure rules above to generate output

# No AI bot rules configured yet.
# Use the controls above to allow or block bots, then copy your robots.txt snippet.

How to use: Add this snippet to your robots.txt file at the root of your domain (e.g. yourdomain.com/robots.txt). Note: bots that ignore robots.txt will crawl regardless — blocking those requires server-level firewall rules.

🤖

Why configure AI bots?

AI crawlers from OpenAI, Anthropic, Google, and dozens more are crawling your content right now for training datasets, search results, and live AI assistant queries. robots.txt lets you declare which of them may access your site, and compliant crawlers honor it.

⚠️

Does it always work?

Most reputable AI companies (OpenAI, Anthropic, Google, Meta) respect robots.txt. Some crawlers do not, notably Bytespider (ByteDance) and Scrapy deployments that leave robots.txt compliance switched off. For those, you'll need server-level IP blocking or firewall rules.
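As a rough illustration of enforcing a block on the server instead of relying on robots.txt, here is a minimal Python sketch of a WSGI middleware that answers 403 to requests whose User-Agent header contains a blocked token. The names `block_agents`, `hello_app`, `status_for`, and the token list are hypothetical. This is an application-level stand-in for the firewall or IP rules mentioned above, not a replacement for them: User-Agent strings can be spoofed, so a determined crawler would still need IP-level blocking.

```python
from wsgiref.util import setup_testing_defaults

# Assumption: lowercase substrings to match against the User-Agent header.
BLOCKED_AGENT_TOKENS = ("bytespider", "scrapy")


def block_agents(app, tokens=BLOCKED_AGENT_TOKENS):
    """Wrap a WSGI app and reject requests from matching user agents."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in tokens):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Hypothetical stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


def status_for(agent):
    """Send one fake request through the middleware and return the status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = agent
    seen = []
    block_agents(hello_app)(environ, lambda status, headers: seen.append(status))
    return seen[0]


print(status_for("Mozilla/5.0 (compatible; Bytespider)"))
print(status_for("Mozilla/5.0"))
```

Matching on substrings keeps the sketch short. A production setup would more likely do this in the web server or firewall, where the request can be dropped before it reaches the application.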

🔍

Block training, keep search

The most popular configuration: block AI Training bots (GPTBot, ClaudeBot, CCBot) to prevent your content from feeding LLM datasets, while keeping AI Search bots (PerplexityBot, Bingbot, Googlebot) so you still appear in AI-generated answers.
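A minimal Python sketch of what that configuration could look like as generated text. It uses only the six bot names given as examples in the paragraph above; the function name `build_snippet` and the one-group-per-bot layout are assumptions made for illustration, not necessarily what this tool emits.

```python
# Bot names taken from the examples above; extend the lists as needed.
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "CCBot"]
SEARCH_BOTS = ["PerplexityBot", "Bingbot", "Googlebot"]


def build_snippet(blocked, allowed):
    """Return robots.txt text with one User-agent group per bot."""
    groups = []
    for bot in blocked:
        # "Disallow: /" asks the named crawler to stay off the whole site.
        groups.append(f"User-agent: {bot}\nDisallow: /")
    for bot in allowed:
        # "Allow: /" states explicitly that the crawler is welcome.
        groups.append(f"User-agent: {bot}\nAllow: /")
    return "\n\n".join(groups) + "\n"


snippet = build_snippet(TRAINING_BOTS, SEARCH_BOTS)
print(snippet)
```

The explicit `Allow: /` groups are optional, since a crawler with no matching group is allowed by default. They are included here so the intent stays visible if a blanket `User-agent: *` rule is added later.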

📊

Check your current setup

Not sure what your robots.txt looks like right now? Run a free scan to see which AI bots you're allowing or blocking, your AI readiness score, and how your brand appears in AI search engines.
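For a quick local check of the allow/block part, the following Python sketch parses robots.txt text with the standard library's `urllib.robotparser` and reports whether each bot may fetch the site root. The `audit` helper and the sample rules are hypothetical, and the sketch does not reproduce the scan's readiness score or brand analysis.

```python
from urllib.robotparser import RobotFileParser


def audit(robots_txt, bots, path="/"):
    """Map each bot name to True (allowed) or False (blocked) for `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}


# Hypothetical robots.txt content: block GPTBot, allow everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

report = audit(sample, ["GPTBot", "ClaudeBot", "Googlebot"])
for bot, allowed in report.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

To check a live site, paste the contents of its robots.txt into `sample`, or load the file with `RobotFileParser.set_url()` followed by `read()`. The result only reflects what the file requests; it says nothing about crawlers that ignore it.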
