How to Block AI Bots on Gatsby
Gatsby's pre-rendered HTML and fast load times make it an easy target for AI crawlers. Here's how to lock them out with a plain robots.txt, global noai tags via gatsby-ssr.js, and edge-level blocking via Netlify or Cloudflare, all without additional build complexity.
Quick fix — create static/robots.txt
Same folder as gatsby-config.js. Gatsby copies it to public/robots.txt at build — no plugins needed.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
Available Methods
static/robots.txt (Recommended)
Easy · static/robots.txt (copied as-is to public/robots.txt at build)
Gatsby copies every file in static/ directly to public/ unchanged. A plain robots.txt here works on all Gatsby deployments with no plugins or configuration.
No JSX, no imports — just plain text. Gatsby never processes files in static/.
gatsby-plugin-robots-txt
Easy · gatsby-config.js → plugins array
npm plugin that generates robots.txt from your gatsby-config.js at build time. Useful for referencing siteUrl, environment-specific rules, and keeping config in one place.
Requires: npm install gatsby-plugin-robots-txt. Adds a build dependency — for simple blocking, static/ is cleaner.
gatsby-ssr.js — global noai tag
Easy · gatsby-ssr.js (project root)
Use the onRenderBody SSR hook to inject <meta name="robots" content="noai, noimageai"> into every page's <head> at build time. Works with Gatsby 4 and 5.
This is the most reliable global meta tag injection method in Gatsby — runs before React hydration.
Gatsby 5 Head API (per-page)
Easy · export function Head() in any page component
Gatsby 5 introduced the Head export — a React component for page-level <head> management. Use it in specific pages or a shared layout for per-page or global noai tags.
Gatsby 5+ only. For Gatsby 4, use gatsby-plugin-react-helmet with the Helmet component.
Netlify/Cloudflare headers + WAF
Intermediate · static/_headers or netlify.toml (or Cloudflare WAF)
Edge-level blocking via HTTP headers or WAF rules. The only method that stops robots.txt violators like Bytespider. static/_headers works natively on both Netlify and Cloudflare Pages.
static/_headers is copied to public/_headers by Gatsby and read natively by Netlify and Cloudflare Pages.
Method 1: static/robots.txt (Recommended)
Gatsby copies every file in static/ directly to public/ during the build — no processing, no plugins. Create static/robots.txt in your project root (same level as gatsby-config.js).
1. Create static/robots.txt in your Gatsby project root with the full AI bot block list:
User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /
2. Build and verify locally:

gatsby build
cat public/robots.txt | head -6
3. Commit and deploy. Netlify, Vercel, Cloudflare Pages, and Gatsby Cloud all serve public/ files automatically.
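After the build, you can sanity-check that a given bot is fully disallowed with a few lines of Node. This is a simplified sketch, not a full robots.txt parser; parseDisallowedAgents is an illustrative helper, not a Gatsby API:

```javascript
// Sketch: collect user agents that are fully disallowed (Disallow: /)
// from a robots.txt string. Simplified; ignores partial-path rules.
function parseDisallowedAgents(robotsTxt) {
  const blocked = new Set();
  let currentAgents = [];
  for (const line of robotsTxt.split('\n')) {
    const [field, ...rest] = line.split(':');
    const value = rest.join(':').trim();
    if (/^user-agent$/i.test(field.trim())) {
      currentAgents.push(value);
    } else if (/^disallow$/i.test(field.trim())) {
      if (value === '/') currentAgents.forEach((a) => blocked.add(a));
      currentAgents = [];
    }
  }
  return blocked;
}

const sample = 'User-agent: GPTBot\nDisallow: /\n\nUser-agent: ClaudeBot\nDisallow: /\n';
const blocked = parseDisallowedAgents(sample);
console.log(blocked.has('GPTBot')); // true
```

Feed it the contents of public/robots.txt (e.g. via fs.readFileSync) to confirm every bot in your list made it into the build output.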
Method 2: gatsby-plugin-robots-txt
For dynamic robots.txt generation tied to your Gatsby config — useful for staging environments or referencing siteUrl in the Sitemap directive.
# Install
npm install gatsby-plugin-robots-txt
// gatsby-config.js
module.exports = {
siteMetadata: {
siteUrl: 'https://yourdomain.com',
},
plugins: [
{
resolve: 'gatsby-plugin-robots-txt',
options: {
host: 'https://yourdomain.com',
sitemap: 'https://yourdomain.com/sitemap/sitemap-index.xml',
policy: [
{ userAgent: '*', allow: '/' },
{ userAgent: 'GPTBot', disallow: '/' },
{ userAgent: 'ChatGPT-User', disallow: '/' },
{ userAgent: 'OAI-SearchBot', disallow: '/' },
{ userAgent: 'ClaudeBot', disallow: '/' },
{ userAgent: 'anthropic-ai', disallow: '/' },
{ userAgent: 'Google-Extended', disallow: '/' },
{ userAgent: 'Bytespider', disallow: '/' },
{ userAgent: 'CCBot', disallow: '/' },
{ userAgent: 'PerplexityBot', disallow: '/' },
{ userAgent: 'meta-externalagent', disallow: '/' },
{ userAgent: 'Diffbot', disallow: '/' },
{ userAgent: 'DeepSeekBot', disallow: '/' },
{ userAgent: 'MistralBot', disallow: '/' },
{ userAgent: 'cohere-ai', disallow: '/' },
{ userAgent: 'AI2Bot', disallow: '/' },
// Add full list...
],
},
},
],
};

Method 3: noai Meta Tag via gatsby-ssr.js
gatsby-ssr.js runs at build time and injects elements into every generated page's <head>. This is the most reliable way to add a global noai meta tag — it runs before React hydration and applies to every page including dynamically generated ones.
Create gatsby-ssr.js in your project root:
// gatsby-ssr.js
const React = require('react');
exports.onRenderBody = ({ setHeadComponents }) => {
setHeadComponents([
React.createElement('meta', {
key: 'robots-noai',
name: 'robots',
content: 'noai, noimageai',
}),
]);
};

TypeScript version (gatsby-ssr.tsx):
// gatsby-ssr.tsx
import React from 'react';
import type { GatsbySSR } from 'gatsby';
export const onRenderBody: GatsbySSR['onRenderBody'] = ({
setHeadComponents,
}) => {
setHeadComponents([
<meta
key="robots-noai"
name="robots"
content="noai, noimageai"
/>,
]);
};

Changes to gatsby-ssr.js require a fresh development server start: gatsby develop hot reload does not pick up SSR file changes. Run gatsby build && gatsby serve to verify the tag appears in the generated HTML.

Method 4: Gatsby 5 Head Export (Per-Page)
Gatsby 5 introduced the Head export — a React component for page-level <head> management without react-helmet. Add it to your root layout or specific pages for global or per-page control.
// src/pages/index.tsx (or any page — Gatsby 5+)
import React from 'react';
// The Head export is rendered into the page's <head>
export function Head() {
return (
<>
<meta name="robots" content="noai, noimageai" />
</>
);
}
export default function IndexPage() {
return <main>...</main>;
}

For Gatsby 4, use gatsby-plugin-react-helmet:
// In your layout component (Gatsby 4)
import { Helmet } from 'react-helmet';
export function Layout({ children }) {
return (
<>
<Helmet>
<meta name="robots" content="noai, noimageai" />
</Helmet>
{children}
</>
);
}

Method 5: Netlify / Cloudflare Pages Headers
For edge-level blocking. Place a _headers file in your static/ directory — Gatsby copies it to public/_headers, which Netlify and Cloudflare Pages read natively to set HTTP response headers.
static/_headers (Netlify + Cloudflare Pages)
/*
  X-Robots-Tag: noai, noimageai
Sets the HTTP X-Robots-Tag header on every response. More authoritative than the HTML meta tag.
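The _headers format is simple: an unindented path pattern, then indented Header: value lines beneath it. As a sanity check, it can be parsed with a short Node script (a sketch; parseHeadersFile is an illustrative helper, not part of any Netlify or Gatsby API):

```javascript
// Sketch: parse a Netlify/Cloudflare Pages _headers file into rules of
// the form { pattern, headers }. Unindented lines start a new rule;
// indented lines add headers to the most recent rule.
function parseHeadersFile(text) {
  const rules = [];
  for (const raw of text.split('\n')) {
    if (!raw.trim() || raw.trim().startsWith('#')) continue;
    if (!/^\s/.test(raw)) {
      rules.push({ pattern: raw.trim(), headers: {} });
    } else if (rules.length) {
      const [name, ...rest] = raw.trim().split(':');
      rules[rules.length - 1].headers[name.trim()] = rest.join(':').trim();
    }
  }
  return rules;
}

const parsed = parseHeadersFile('/*\n  X-Robots-Tag: noai, noimageai\n');
console.log(parsed[0].headers['X-Robots-Tag']); // "noai, noimageai"
```

Running this against static/_headers before deploying catches indentation mistakes that would silently drop the header.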
netlify.toml
[[headers]]
for = "/*"
[headers.values]
  X-Robots-Tag = "noai, noimageai"

Cloudflare WAF for hard bot blocking:
(http.user_agent contains "GPTBot") or
(http.user_agent contains "ClaudeBot") or
(http.user_agent contains "CCBot") or
(http.user_agent contains "Bytespider") or
(http.user_agent contains "Google-Extended") or
(http.user_agent contains "Diffbot") or
(http.user_agent contains "meta-externalagent") or
(http.user_agent contains "DeepSeekBot")
Action: Block. Stops robots.txt violators before they reach your origin.
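The same substring match the WAF expression performs can be mirrored in JavaScript to test user-agent strings locally before enabling the rule (a sketch; the bot list simply copies the expression above, and Cloudflare's contains operator is case-sensitive, as is String.prototype.includes):

```javascript
// Mirrors the Cloudflare WAF expression above: block when the
// User-Agent contains any of these substrings (case-sensitive).
const WAF_BOTS = [
  'GPTBot', 'ClaudeBot', 'CCBot', 'Bytespider',
  'Google-Extended', 'Diffbot', 'meta-externalagent', 'DeepSeekBot',
];

function wafWouldBlock(userAgent) {
  return WAF_BOTS.some((bot) => (userAgent || '').includes(bot));
}

console.log(wafWouldBlock('Mozilla/5.0; GPTBot/1.2')); // true
console.log(wafWouldBlock('Mozilla/5.0 (Windows NT 10.0) Chrome/120.0')); // false
```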
Full AI Bot Reference
All 25 AI bots are covered by the robots.txt block list above.
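For build scripts or audits, the same 25 user-agent strings can be kept as one constant and used to regenerate the block section (a sketch; the array simply copies the names from the robots.txt in Method 1):

```javascript
// The 25 AI crawler user-agents from the robots.txt block list above.
const AI_BOT_USER_AGENTS = [
  'GPTBot', 'ChatGPT-User', 'OAI-SearchBot', 'ClaudeBot', 'anthropic-ai',
  'Google-Extended', 'Bytespider', 'CCBot', 'PerplexityBot', 'meta-externalagent',
  'Amazonbot', 'Applebot-Extended', 'xAI-Bot', 'DeepSeekBot', 'MistralBot',
  'Diffbot', 'cohere-ai', 'AI2Bot', 'Ai2Bot-Dolma', 'YouBot',
  'DuckAssistBot', 'omgili', 'omgilibot', 'webzio-extended', 'gemini-deep-research',
];

// Regenerate the robots.txt block section from the list.
const blockSection = AI_BOT_USER_AGENTS
  .map((bot) => `User-agent: ${bot}\nDisallow: /`)
  .join('\n\n');

console.log(AI_BOT_USER_AGENTS.length); // 25
```

Writing blockSection out to static/robots.txt (after a leading `User-agent: *` / `Allow: /` group) keeps the file and the list from drifting apart.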
Frequently Asked Questions
Where do I put robots.txt in a Gatsby site?
The simplest approach: put robots.txt in the static/ directory at your Gatsby project root. Gatsby copies everything in static/ to the public/ output directory unchanged during the build. So static/robots.txt becomes public/robots.txt and is served at yourdomain.com/robots.txt. Alternatively, use gatsby-plugin-robots-txt to generate robots.txt from your gatsby-config.js — this lets you use site metadata and environment variables in your rules. Both methods work with Gatsby 4 and 5.
How do I add a noai meta tag to every Gatsby page?
The recommended method for Gatsby is gatsby-ssr.js: export the onRenderBody function and use setHeadComponents to inject a <meta name='robots' content='noai, noimageai'> element into every page's <head>. This runs at build time (SSG) or on the server (SSR) and is the most reliable global approach. Alternatively, use gatsby-plugin-react-helmet (Gatsby 4) or the built-in Head export (Gatsby 5) in your base layout component.
What is gatsby-plugin-robots-txt and do I need it?
gatsby-plugin-robots-txt is an npm package that generates robots.txt from your gatsby-config.js configuration during the Gatsby build. It's useful if you want to: reference your siteUrl from gatsby-config.js in the Sitemap directive, use different rules per environment (e.g. block everything on staging), or manage rules alongside your other Gatsby config. For simple static AI bot blocking, putting a plain robots.txt in static/ is simpler and doesn't require an additional dependency.
How do I use gatsby-ssr.js to inject meta tags?
Create gatsby-ssr.js (or gatsby-ssr.ts for TypeScript) in your project root. Export the onRenderBody function: exports.onRenderBody = ({ setHeadComponents }) => { setHeadComponents([ React.createElement('meta', { name: 'robots', content: 'noai, noimageai', key: 'robots-noai' }) ]); }. This injects the meta tag into every generated page's <head> at build time. It works with both Gatsby 4 and Gatsby 5.
Does blocking AI bots affect Gatsby's built-in SEO features?
No. Blocking GPTBot, ClaudeBot, CCBot, and other AI training bots has zero effect on Googlebot or Bingbot. Gatsby's built-in sitemap plugin (gatsby-plugin-sitemap), canonical URLs, and meta robots settings are completely unaffected. Your Gatsby site's search engine visibility remains the same.
How do I block AI bots on Gatsby deployed to Netlify?
Two approaches work on Netlify: (1) static/robots.txt — Gatsby copies it to public/ and Netlify serves it automatically; (2) netlify.toml headers — add [[headers]] with X-Robots-Tag: noai, noimageai for all routes, or use Netlify's Edge Functions (similar to Vercel Middleware) to block specific user agents with a 403. The netlify.toml approach is most powerful for stopping bots that ignore robots.txt. Netlify also reads a _headers file in your static/ directory (copied to public/_headers) as an alternative to netlify.toml.
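The Edge Function approach mentioned above could be sketched as follows. This is an assumption-laden illustration: the handler name and the shortened bot list are hypothetical, and a real deployment exports the handler as the default export of a file under netlify/edge-functions/ with its route mapped in netlify.toml:

```javascript
// Sketch of the blocking logic a Netlify Edge Function could run.
// Bot list shortened for illustration; reuse the full list above.
const BLOCKED_BOTS = ['GPTBot', 'ClaudeBot', 'CCBot', 'Bytespider'];

async function blockAiBots(request) {
  const ua = request.headers.get('user-agent') || '';
  if (BLOCKED_BOTS.some((bot) => ua.includes(bot))) {
    return new Response('Forbidden', { status: 403 });
  }
  // Returning undefined lets the request fall through to the static site.
  return undefined;
}
```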