How to Block AI Bots on Docusaurus: Complete 2026 Guide
Docusaurus generates a static React site — like MkDocs, it has no server process. Bot blocking splits across the content layer (robots.txt in static/, noai meta via headTags in docusaurus.config.js) and the hosting platform layer (X-Robots-Tag headers, hard 403 via Edge Functions). The headTags config approach is the simplest — no theme swizzling required.
headTags is all you need for noai meta — no swizzling
Docusaurus v2/v3 supports a headTags array in docusaurus.config.js that injects arbitrary HTML tags into every page's <head>. This is simpler than swizzling the Root or Layout component. Reserve swizzling for per-page conditional logic that can't be done with a static config entry.
Methods at a glance
| Method | What it does | Where it lives |
|---|---|---|
| static/robots.txt | Signals bots which paths are off-limits | static/ → build/ |
| headTags in docusaurus.config.js | noai meta on every page | Config file |
| <head> in MDX front matter | noai meta on specific pages | Individual .md/.mdx files |
| netlify.toml / vercel.json / _headers | X-Robots-Tag on all responses | Hosting platform |
| Edge Function | Hard 403 on known AI User-Agents | Netlify / Cloudflare |
| Swizzled Root/Layout | Per-page conditional meta | src/theme/ (advanced) |
1. robots.txt — static/ directory
Place robots.txt in the static/ directory at the root of your Docusaurus project. Docusaurus copies the entire static/ directory into build/ unchanged. No config needed.
# static/robots.txt
# Copied to build/robots.txt by Docusaurus — no config needed
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: *
Allow: /

# Verify after build:
npm run build
ls build/robots.txt  # should exist

static/ vs docs/ vs src/
Only files in static/ are copied to the build output as-is. Files in docs/ are Markdown content processed by Docusaurus. Files in src/ are React components. robots.txt goes in static/ — not in docs/ or src/.
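A quick way to audit which agents a robots.txt fully disallows is a small awk pipeline. This is a sketch with a sample file inlined as a shell variable; in practice, point the same awk at build/robots.txt after a build:

```shell
# Sample robots.txt inlined for illustration; in practice run:
#   awk '/^User-agent:/ {ua=$2} /^Disallow: \/$/ {print ua}' build/robots.txt
robots='User-agent: GPTBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: *
Allow: /'

# Print each user agent whose record contains a blanket "Disallow: /"
printf '%s\n' "$robots" | awk '/^User-agent:/ {ua=$2} /^Disallow: \/$/ {print ua}'
# prints:
# GPTBot
# CCBot
```

Note this only surfaces blanket disallows; per-path rules would need a fuller parser.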
2. noai meta tag — headTags config
The headTags array in docusaurus.config.js (or docusaurus.config.ts) injects HTML tags into the <head> of every generated page. This is the cleanest approach — no component swizzling required.
// docusaurus.config.ts (for a plain docusaurus.config.js, drop the type import and annotations)
import type { Config } from '@docusaurus/types';
const config: Config = {
title: 'My Documentation',
url: 'https://docs.example.com',
baseUrl: '/',
// Inject meta tags into every page's <head>
headTags: [
{
tagName: 'meta',
attributes: {
name: 'robots',
content: 'noai, noimageai',
},
},
],
// ... rest of config
};
export default config;

headTags is available in Docusaurus v2.4+ and v3.x. For older versions, use the theme swizzling approach (Section 4).
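To confirm the tag actually reached every generated page, you can list any built HTML files that lack it. A sketch against a simulated build directory; in a real project, skip the setup lines and point grep at build/ after npm run build:

```shell
# Simulated build output: one page with the tag, one without.
# In a real project, replace "$dir" with build/ and drop this setup.
dir=$(mktemp -d)
mkdir -p "$dir/docs"
echo '<head><meta name="robots" content="noai, noimageai"></head>' > "$dir/index.html"
echo '<head><title>oops</title></head>' > "$dir/docs/unprotected.html"

# -r recurse, -L list files WITHOUT a match, i.e. pages missing the tag
grep -rL 'name="robots"' --include='*.html' "$dir"
# prints only the path of docs/unprotected.html
```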
3. Per-page meta — MDX head block
In any .md or .mdx file, add a <head> block directly in the file to inject page-specific meta tags. These override or supplement the global headTags config.
---
# docs/some-page.mdx
title: My Page
description: Page description
---
<head>
{/* Block AI indexing on this specific page */}
<meta name="robots" content="noindex, noai, noimageai" />
</head>
# My Page
Content here...

The <head> block in MDX uses JSX syntax — self-closing tags need />. It is processed by Docusaurus's MDX pipeline, not treated as raw HTML.
4. Theme swizzling — conditional per-page meta
For conditional logic (e.g. different robots values based on doc category or front matter), swizzle the Root component to wrap every page in a custom component that injects the correct meta tag.
# Eject the Root component (safe — wraps the whole app):
npx docusaurus swizzle @docusaurus/theme-classic Root --eject

// src/theme/Root.tsx — after swizzling
import React from 'react';
import Head from '@docusaurus/Head';
import { useLocation } from '@docusaurus/router';
import OriginalRoot from '@theme-original/Root';
// Pages that should not be indexed
const NOINDEX_PATHS = ['/internal/', '/draft/'];
export default function Root(props: { children: React.ReactNode }) {
const { pathname } = useLocation();
const isNoIndex = NOINDEX_PATHS.some((p) => pathname.startsWith(p));
return (
<>
<Head>
<meta
name="robots"
content={isNoIndex ? 'noindex, noai, noimageai' : 'noai, noimageai'}
/>
</Head>
<OriginalRoot {...props} />
</>
);
}

Note that the component above imports @theme-original/Root and delegates to it — the same pattern --wrap generates automatically, so wrapping also works for Root. --eject gives you a full copy and more control; whichever mode you use, keep the delegation so the original Root behavior is preserved.
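One subtlety in the prefix check above: startsWith('/internal/') does not match the bare path /internal, which can occur when trailingSlash is disabled. A shell sketch of the same matching semantics (the case patterns mirror startsWith):

```shell
# case-pattern equivalent of NOINDEX_PATHS.some((p) => pathname.startsWith(p))
matches_noindex() {
  case "$1" in
    /internal/*|/draft/*) return 0 ;;
    *) return 1 ;;
  esac
}

matches_noindex '/internal/runbook' && echo 'noindexed'
matches_noindex '/internal' || echo 'missed: no trailing slash'
# prints:
# noindexed
# missed: no trailing slash
```

If your site serves both forms, add the slash-free prefixes to NOINDEX_PATHS as well.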
5. X-Robots-Tag — hosting platform
Docusaurus produces a static site — HTTP headers come from your hosting platform.
Netlify — netlify.toml
# netlify.toml
[build]
command = "npm run build"
publish = "build"
[[headers]]
for = "/*"
[headers.values]
X-Robots-Tag = "noai, noimageai"

Vercel — vercel.json
{
"buildCommand": "npm run build",
"outputDirectory": "build",
"headers": [
{
"source": "/(.*)",
"headers": [
{ "key": "X-Robots-Tag", "value": "noai, noimageai" }
]
}
]
}

Cloudflare Pages — _headers file
# static/_headers — copied to build/_headers by Docusaurus
/*
X-Robots-Tag: noai, noimageai

GitHub Pages — no custom headers
GitHub Pages does not support custom HTTP headers. Use the headTags noai meta approach (Section 2) as your only option, or migrate to Cloudflare Pages for header + edge function support.
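On any of the header-capable platforms above, the post-deploy check is the same one-liner. Shown here against a canned response; in practice replace the printf with curl -sI https://your-docs-domain/ (the domain is a placeholder):

```shell
# Canned headers standing in for: curl -sI https://<your-domain>/
response='HTTP/2 200
content-type: text/html; charset=utf-8
x-robots-tag: noai, noimageai'

printf '%s\n' "$response" | grep -i '^x-robots-tag:'
# prints: x-robots-tag: noai, noimageai
```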
6. Hard 403 — edge functions
For hard User-Agent blocking before content is served:
Netlify Edge Function
// netlify/edge-functions/block-ai-bots.js
const BLOCKED_UA = /GPTBot|ChatGPT-User|ClaudeBot|Claude-Web|anthropic-ai|CCBot|Google-Extended|PerplexityBot|Amazonbot|Bytespider|YouBot|DuckAssistBot|meta-externalagent|MistralAI-Spider|oai-searchbot/i;
export default async function handler(request, context) {
const ua = request.headers.get("user-agent") || "";
const path = new URL(request.url).pathname;
if (path !== "/robots.txt" && BLOCKED_UA.test(ua)) {
return new Response("Forbidden", { status: 403 });
}
return context.next();
}
export const config = { path: "/*" };

# netlify.toml — declare edge function
[[edge_functions]]
path = "/*"
function = "block-ai-bots"
[build]
command = "npm run build"
publish = "build"

Cloudflare Pages — _middleware.js
// functions/_middleware.js — Cloudflare Pages picks up Functions from a functions/ directory at the project root, not from static/ or the build output
const BLOCKED_UA = /GPTBot|ClaudeBot|CCBot|PerplexityBot|Amazonbot|Bytespider/i;
export async function onRequest(context) {
const { request, next } = context;
const ua = request.headers.get("user-agent") || "";
const path = new URL(request.url).pathname;
if (path !== "/robots.txt" && BLOCKED_UA.test(ua)) {
return new Response("Forbidden", { status: 403 });
}
return next();
}

7. Full docusaurus.config.js example
Complete config with headTags for noai meta, robots.txt reference, and standard Docusaurus v3 structure.
// docusaurus.config.ts
import type { Config } from '@docusaurus/types';
import type * as Preset from '@docusaurus/preset-classic';
const config: Config = {
title: 'My Documentation',
tagline: 'Project docs',
url: 'https://docs.example.com',
baseUrl: '/',
onBrokenLinks: 'throw',
onBrokenMarkdownLinks: 'warn',
favicon: 'img/favicon.ico',
// ── AI training opt-out ────────────────────────────────────────────
// Injects into <head> of every generated page
headTags: [
{
tagName: 'meta',
attributes: {
name: 'robots',
content: 'noai, noimageai',
},
},
],
presets: [
[
'classic',
{
docs: {
sidebarPath: './sidebars.ts',
routeBasePath: '/',
},
blog: false,
theme: {
customCss: './src/css/custom.css',
},
} satisfies Preset.Options,
],
],
themeConfig: {
navbar: {
title: 'My Docs',
items: [
{ to: '/', label: 'Docs', position: 'left' },
],
},
} satisfies Preset.ThemeConfig,
};
export default config;

8. Deployment quick reference
| Platform | Build command | Publish dir | Headers |
|---|---|---|---|
| Netlify | npm run build | build | netlify.toml [[headers]] |
| Vercel | npm run build | build | vercel.json headers |
| Cloudflare Pages | npm run build | build | static/_headers file |
| GitHub Pages | npm run build | build | ❌ no custom headers |
| AWS S3 + CloudFront | npm run build | build | CloudFront response headers policy |
Frequently asked questions
How do I add robots.txt to a Docusaurus site?
Place robots.txt in the static/ directory — Docusaurus copies everything from static/ to build/ unchanged. Do not put it in docs/ or src/ — only static/ is copied as-is.
How do I add noai meta tags to Docusaurus?
Use headTags in docusaurus.config.js: headTags: [{ tagName: "meta", attributes: { name: "robots", content: "noai, noimageai" } }]. This injects the meta tag into every page with no swizzling. Available in Docusaurus v2.4+ and v3.
What is theme swizzling and when should I use it?
Swizzling copies a Docusaurus theme component into your project so you can modify it. For a global noai meta tag, use headTags in config — no swizzling needed. Swizzle only when you need per-page conditional logic (e.g. different robots values on /internal/ paths). Use --eject on the Root component for the safest swizzling target.
How do I add X-Robots-Tag headers to a Docusaurus site?
Headers come from your host, not Docusaurus. Netlify: [[headers]] in netlify.toml. Vercel: headers in vercel.json. Cloudflare Pages: _headers file in static/ (copied to build/). GitHub Pages: not supported — use noai meta tag.
Does Docusaurus support per-page robots meta tags?
Yes. In any MDX file, add a <head> block with <meta name="robots" content="noindex, noai, noimageai" />. For programmatic control across many pages, swizzle the Root component and conditionally set the robots value based on the current path.
Can I block AI bots on GitHub Pages with Docusaurus?
GitHub Pages doesn't support custom headers or edge functions. Use headTags for the noai meta tag as your primary defense. For hard 403 blocking, migrate to Netlify or Cloudflare Pages (both free, both support edge functions). Docusaurus deploys cleanly to both with no config changes beyond the build command and publish directory.
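The edge-function blocklist regex from Section 6 can be sanity-checked locally before a migration; for patterns this simple, grep -E's case-insensitive alternation behaves like the JavaScript /.../i regex (the User-Agent strings below are made up):

```shell
# Subset of the Section 6 blocklist, reused as an ERE pattern
BLOCKED='GPTBot|ClaudeBot|CCBot|PerplexityBot|Amazonbot|Bytespider'
is_blocked() { printf '%s' "$1" | grep -Eqi "$BLOCKED"; }

is_blocked 'Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2)' && echo blocked
is_blocked 'Mozilla/5.0 (Windows NT 10.0; rv:124.0) Gecko/20100101 Firefox/124.0' || echo allowed
# prints:
# blocked
# allowed
```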
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.