How is this different from Google Analytics?

Google Analytics shows you traffic. Shadow shows you traffic, AI bot activity, what AI platforms say about your brand, AND tells you what to do about all of it. It's analytics + AI intelligence + action steps in one tool.

Do I need to install anything?

For basic monitoring (bot detection, AI perception, readiness score) — nope, just enter your URL. For full visitor analytics (clicks, behavior, sessions), add one script tag. One-click integrations for Vercel, Shopify, WordPress, and more.

Will it slow down my site?

No. The script is under 5KB and loads async. Zero impact on page speed or Core Web Vitals. External monitoring has literally no impact — it watches from the outside.

What AI bots does Shadow detect?

All of them. GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, Amazonbot, and dozens more. The Shadow Network means new bots get identified across all users instantly.

What do you mean by "actionable steps"?

Shadow doesn't just show you graphs. It says things like: "ChatGPT has your pricing wrong — add structured data to /pricing to fix it" or "Your bounce rate on /features is 68% — here's why and what to change." Specific, do-it-today recommendations.

Can Shadow block bots?

Shadow is a telescope, not a shield. It shows you who's visiting and what AI says about you. It generates block rules and robots.txt configs you can apply — but it doesn't intercept traffic.

Is Shadow GDPR compliant?

Yes. Shadow never collects PII. IP addresses are hashed after classification. No cookies on your visitors. All Shadow Network data is anonymized. GDPR compliant by design.

What is cohere-ai and what does it do?

cohere-ai is a web crawler operated by Cohere, the enterprise AI company known for Command R, Aya, and the Embed family of embedding models. Unlike most major AI crawlers, Cohere has not published official documentation explaining cohere-ai's purpose. Based on observed behavior and its operator's business, it likely collects web content for training or fine-tuning Cohere's language and embedding models, and may also function as a live browsing agent for Cohere's AI products. Because it's undocumented, the exact scope of data collection is unclear.

Is cohere-ai the same as a training crawler like GPTBot?

Possibly, but not definitively confirmed. GPTBot is openly documented by OpenAI as a training data crawler. cohere-ai lacks that official documentation. Based on its operator (Cohere) and observed crawling patterns, it's reasonable to treat it as a training-adjacent crawler. It may also function as a live browsing agent — similar to ChatGPT-User — used when Cohere's AI products need to retrieve web content in response to user queries. The lack of documentation means you can't be certain which use applies.

Who is Cohere and what models does it build?

Cohere is a Canadian-American AI company founded in 2019, primarily focused on enterprise AI applications. It's known for Command R and Command R+ (retrieval-augmented generation models), the Aya multilingual model family, and Embed (embedding models for semantic search and retrieval). Cohere's main customers are enterprises — banks, healthcare companies, tech firms — that deploy AI for internal use cases like document search, summarization, and customer support. Unlike OpenAI or Anthropic, Cohere is primarily B2B and doesn't have a major consumer-facing AI product.

Will blocking cohere-ai affect my SEO or search rankings?

No. cohere-ai is not a search engine crawler. Blocking it has zero effect on your rankings in Google, Bing, or any traditional search engine. The only effect is on Cohere's ability to access your content — whether for training, retrieval, or browsing-agent purposes.

What user agent string does cohere-ai use?

The primary token used in robots.txt is: cohere-ai (all lowercase). The full user agent string observed in server logs is: Mozilla/5.0 (compatible; cohere-ai/1.0; +http://www.cohere.ai/bot.html). In robots.txt, use: User-agent: cohere-ai followed by Disallow: /.
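One way to sanity-check the rule before deploying it is to run it through a robots.txt parser. The sketch below uses Python's standard-library urllib.robotparser; the helper name is_blocked and the example.com URL are illustrative assumptions, not part of any Cohere or Shadow tooling. It passes the robots.txt token (cohere-ai) rather than the full user agent header.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule described above.
ROBOTS_TXT = """\
User-agent: cohere-ai
Disallow: /
"""

def is_blocked(robots_txt: str, token: str, url: str) -> bool:
    """Return True if robots_txt disallows the given crawler token from url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(token, url)

# example.com is a placeholder domain.
print(is_blocked(ROBOTS_TXT, "cohere-ai", "https://example.com/pricing"))
print(is_blocked(ROBOTS_TXT, "Googlebot", "https://example.com/pricing"))
```

This only shows what a standards-following parser would conclude from the file. It says nothing about whether the crawler itself honors the rule.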

Cohere · Undocumented · Likely Training

How to Block cohere-ai: Cohere's Undocumented Web Crawler

cohere-ai crawls your site without any official documentation explaining what it collects or why. It's operated by Cohere — the enterprise AI lab behind Command R. Only ~13% of major websites block it.

Updated March 2026

Why "Undocumented" Matters

Most major AI companies publish documentation explaining their crawlers. OpenAI documents GPTBot, Anthropic documents ClaudeBot, Google documents Google-Extended. Cohere has published no official documentation for cohere-ai — no help page, no blog post, no developer docs explaining what it does or how it uses collected data.

This lack of transparency means publishers must infer the crawler's purpose from Cohere's business model and observed behavior. When in doubt, blocking is the conservative choice.

What We Know About cohere-ai

Cohere is a Canadian-American AI company founded in 2019, focused on enterprise AI. Its products include Command R and Command R+ (retrieval-augmented generation models), the Aya multilingual model family, and Embed (embedding models for semantic search). Cohere's customers are primarily enterprises — banks, healthcare companies, and tech firms.

The cohere-ai crawler has been identified through server log analysis by security researchers and bot tracking services. Based on Cohere's business (building language and embedding models), the crawler likely serves one or both of these purposes:

Training data collection: Crawling web content to train or fine-tune Cohere's language models (Command R, Aya)

Live retrieval / RAG: Fetching web pages in real-time when Cohere's AI products need web context for answers

The user agent string is: Mozilla/5.0 (compatible; cohere-ai/1.0; +http://www.cohere.ai/bot.html)
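To see whether the crawler is visiting your own site, you can look for that user agent in your access logs. The following is a minimal sketch that assumes logs in the common "combined" format; the regular expression, the function name cohere_hits, and the sample lines (which use documentation-reserved IP addresses) are all made up for illustration.

```python
import re
from collections import Counter

# Pulls the request path and the last quoted field (the user agent)
# out of a combined-log-format line. Assumes that format; adjust if
# your server logs differently.
LINE_RE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def cohere_hits(lines):
    """Count requested paths on lines whose user agent contains 'cohere-ai'."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and "cohere-ai" in match.group("ua").lower():
            hits[match.group("path")] += 1
    return hits

# Invented sample lines, not real traffic.
SAMPLE = [
    '203.0.113.7 - - [01/Mar/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; cohere-ai/1.0; +http://www.cohere.ai/bot.html)"',
    '203.0.113.7 - - [01/Mar/2026:10:00:05 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; cohere-ai/1.0; +http://www.cohere.ai/bot.html)"',
    '198.51.100.2 - - [01/Mar/2026:10:01:00 +0000] "GET /features HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
]

for path, count in cohere_hits(SAMPLE).items():
    print(path, count)
```

On a real server you would pass an open log file instead of SAMPLE, for example cohere_hits(open(path_to_log)), where the log path depends on your setup.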

How to Block cohere-ai

Add this to your robots.txt:

robots.txt: Block cohere-ai

User-agent: cohere-ai
Disallow: /

Because cohere-ai is undocumented, consider adding server-level enforcement:

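nginx: Block by user agent

# Place inside the server { } block for your site.
# ~* is a case-insensitive regex match on the User-Agent header.
if ($http_user_agent ~* "cohere-ai") {
    return 403;
}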

Cloudflare WAF: Custom rule

Field: User Agent
Operator: contains
Value: cohere-ai
Action: Block
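If you have no access to nginx or a WAF, the same user-agent check can live in the application layer. The sketch below is one possible approach for a Python WSGI app, using only the standard library. The names block_user_agents, hello_app, and status_for are hypothetical, and substring matching on the User-Agent header is the same assumption the nginx rule makes, so it will not catch a crawler that changes or hides its user agent.

```python
from wsgiref.util import setup_testing_defaults

BLOCKED_TOKENS = ("cohere-ai",)

def block_user_agents(app, tokens=BLOCKED_TOKENS):
    """Wrap a WSGI app so requests from matching user agents get a 403."""
    lowered = tuple(token.lower() for token in tokens)

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in lowered):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)

    return middleware

def hello_app(environ, start_response):
    """Stand-in for your real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]

def status_for(user_agent):
    """Send one fake request through the wrapped app and return the status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    seen = []
    wrapped = block_user_agents(hello_app)
    wrapped(environ, lambda status, headers: seen.append(status))
    return seen[0]

print(status_for("Mozilla/5.0 (compatible; cohere-ai/1.0; +http://www.cohere.ai/bot.html)"))
print(status_for("Mozilla/5.0 (X11; Linux x86_64)"))
```

Blocking at the web server or CDN is usually cheaper than blocking inside the app, since the request never reaches your code. Treat this as a fallback.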

Why Only 13% of Sites Block cohere-ai

The low blocking rate isn't because cohere-ai is safe — it's because most publishers don't know it exists.

No media coverage

GPTBot and ClaudeBot launched with press releases and blog posts. cohere-ai was discovered through server log analysis — no announcement, no documentation, no media coverage.

Enterprise focus obscures awareness

Cohere is primarily B2B. It doesn't have a consumer-facing AI product like ChatGPT or Claude, so publishers don't encounter it as a product they need to worry about.

Not in standard block lists

Many robots.txt templates for AI blocking focus on the well-known crawlers. cohere-ai is often missing from popular "block all AI bots" templates and guides.

What Blocking Does (and Doesn't) Do

What it stops

• Cohere from crawling your content going forward
• New content from entering Cohere's training pipeline
• Live retrieval of your pages for Cohere's AI products

What it doesn't stop

• Content Cohere has already crawled
• Other AI crawlers (GPTBot, ClaudeBot, etc.)
• Cohere accessing your content via Common Crawl or data brokers
• Google or Bing rankings (unaffected)

Frequently Asked Questions

Does cohere-ai respect robots.txt?

Based on available evidence, it appears to. Cohere is a Canadian-American, venture-backed company with major enterprise customers (including banks and healthcare firms) that expect compliance. However, because the crawler is undocumented, this cannot be officially confirmed. For enforcement that does not depend on the crawler's cooperation, add server-level blocking.

Is Cohere different from OpenAI and Anthropic?

Yes. Cohere is primarily B2B — it sells AI infrastructure to enterprises for internal use cases (document search, summarization, customer support). It doesn't have a major consumer product like ChatGPT or Claude. This enterprise focus means your content may end up powering internal corporate AI tools rather than a public chatbot.

Does blocking cohere-ai affect my SEO?

No. cohere-ai has no relationship with Google, Bing, or any search engine. Blocking it has zero effect on your search rankings or visibility.

Should I block cohere-ai if I already block GPTBot and ClaudeBot?

If your policy is to block AI training crawlers, then yes. cohere-ai likely serves a similar training purpose. The lack of documentation makes it a higher-risk crawler to leave unblocked — you don't know exactly what it's doing with your content.
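If you keep your blocked crawlers as a list, adding cohere-ai is a one-line change. Here is a minimal sketch that generates one Disallow-all group per token; the function name build_robots_txt and the three-item list are illustrative, and you would extend the list to match your own policy.

```python
def build_robots_txt(tokens):
    """Return robots.txt text with one 'Disallow: /' group per crawler token.

    Hypothetical helper for illustration only.
    """
    groups = [f"User-agent: {token}\nDisallow: /\n" for token in tokens]
    return "\n".join(groups)

# Tokens named in this guide; add others as needed.
AI_TRAINING_BOTS = ["GPTBot", "ClaudeBot", "cohere-ai"]

print(build_robots_txt(AI_TRAINING_BOTS))
```

Merge the generated groups into your existing robots.txt by hand rather than overwriting the file, so rules for search engine crawlers stay intact.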

Related Guides

How to Block CCBot

Common Crawl, a possible indirect data source for Cohere

How to Block AI2Bot

Research AI crawler with similar mission

How to Block ClaudeBot

Anthropic's training crawler

robots.txt for AI Bots (Complete Guide)

51+ crawlers, full reference table

Is your site protected from AI bots?

Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.