YouBot is You.com's AI search crawler — it indexes your site so the You.com AI assistant can cite you in search results. Blocking it isn't just an AI training decision: it's a search visibility tradeoff.
YouBot is the web crawler operated by You.com — an AI-native search engine founded in 2021 that has grown into one of the more significant players in the AI search space. You.com competes directly with Perplexity AI and ChatGPT Search: it answers queries with AI-generated responses that cite web sources inline, alongside a traditional search results view.
YouBot's role is analogous to Googlebot: it crawls the web to build and maintain the search index that powers You.com AI answers. When a You.com user asks a question, the AI assistant draws on content YouBot has indexed — and typically displays your site as a clickable citation in its answer. If your site isn't in You.com's index, it won't appear in those answers.
YouBot is not collecting training data for an LLM. You.com doesn't train foundation models — it builds an AI-powered search product on top of existing models. YouBot's purpose is search indexing and citation retrieval, not building training datasets. The decision to block it is a search visibility decision, not a training data privacy decision.
YouBotUse this exact casing in robots.txt.
The YouBot decision is nearly identical to the PerplexityBot decision. Both are AI-native search crawlers. Both cite sources. Both represent a growing segment of AI search users. If you've thought through PerplexityBot and decided to allow or block it, apply the same logic to YouBot.
robots.txtUser-agent: YouBot Disallow: /
# Block AI search crawlers (impacts search visibility in these engines) User-agent: YouBot Disallow: / User-agent: PerplexityBot Disallow: / User-agent: OAI-SearchBot Disallow: / User-agent: ChatGPT-User Disallow: / # Keep traditional search — these don't train AI on your content User-agent: Googlebot Allow: / User-agent: Bingbot Allow: /
This removes you from You.com, Perplexity, and ChatGPT Search results. High impact on AI search visibility.
# Block AI training crawlers User-agent: GPTBot Disallow: / User-agent: ClaudeBot Disallow: / User-agent: anthropic-ai Disallow: / User-agent: Google-Extended Disallow: / User-agent: CCBot Disallow: / User-agent: Amazonbot Disallow: / User-agent: meta-externalagent Disallow: / User-agent: DeepSeekBot Disallow: / User-agent: Bytespider Disallow: / # Allow AI search crawlers — these cite your content and send traffic User-agent: YouBot Allow: / User-agent: PerplexityBot Allow: / User-agent: OAI-SearchBot Allow: / # Allow traditional search User-agent: Googlebot Allow: / User-agent: Bingbot Allow: / User-agent: * Allow: /
Blocks training data collection while maximising AI search visibility. Most publishers benefit from this approach.
import { MetadataRoute } from 'next';
export default function robots(): MetadataRoute.Robots {
return {
rules: [
// Block AI training crawlers
{ userAgent: 'GPTBot', disallow: ['/'] },
{ userAgent: 'ClaudeBot', disallow: ['/'] },
{ userAgent: 'anthropic-ai', disallow: ['/'] },
{ userAgent: 'Google-Extended', disallow: ['/'] },
{ userAgent: 'CCBot', disallow: ['/'] },
{ userAgent: 'Amazonbot', disallow: ['/'] },
// Block YouBot if you don't want You.com search visibility
{ userAgent: 'YouBot', disallow: ['/'] },
// Allow traditional search + AI search (comment out YouBot above to allow)
{ userAgent: 'Googlebot', allow: ['/'] },
{ userAgent: 'PerplexityBot', allow: ['/'] },
{ userAgent: '*', allow: ['/'] },
],
sitemap: 'https://yoursite.com/sitemap.xml',
};
}# Hard 403 block — doesn't depend on robots.txt compliance
if ($http_user_agent ~* "YouBot") {
return 403;
}(http.user_agent contains "YouBot")
Set action to Block. Cloudflare Dashboard → Security → WAF → Custom Rules.
YouBot sits alongside a growing cohort of AI-native search crawlers that are changing how people discover content. Understanding the ecosystem helps you make the right allow/block decisions:
| Bot | Company | Purpose | Block impact |
|---|---|---|---|
| YouBot | You.com | AI search indexing | Lose You.com visibility |
| PerplexityBot | Perplexity AI | AI search indexing | Lose Perplexity visibility |
| OAI-SearchBot | OpenAI | ChatGPT Search indexing | Lose ChatGPT Search visibility |
| GPTBot | OpenAI | LLM training data | No search impact |
| ClaudeBot | Anthropic | LLM training data | No search impact |
Blocking AI search crawlers (YouBot, PerplexityBot, OAI-SearchBot) removes you from those engines' results. Blocking AI training crawlers (GPTBot, ClaudeBot, CCBot) has no search impact.
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.
Scan My Site Free →