
How to Block AI Bots on Remix

Remix's loader-centric model means you can block AI bots at the root loader before any page renders, serve a dynamic robots[.]txt resource route, and set X-Robots-Tag headers via the headers() export — all without touching your deployment infrastructure.

Quick fix — create public/robots.txt

Place this file in your project's public/ folder (at the same level as app/). Vite copies it to the build output, where it is served as a static asset at /robots.txt — no loader required.

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

All Methods

public/robots.txt (Recommended)

Easy

All deployment targets

public/robots.txt

Remix's Vite setup copies the public/ directory to the build output. robots.txt here is served as a static asset before Remix handles the request — zero server overhead.

Plain text only. Works on Vercel, Cloudflare Workers, Fly.io, and all other Remix adapters.

robots[.]txt Resource Route

Easy

SSR deployments

app/routes/robots[.]txt.tsx

A Remix loader-only route that returns a text/plain Response. The [.] bracket syntax prevents the dot from being treated as a path separator — maps to /robots.txt not /robots/txt.

Useful for environment-based rules. Conflict: if public/robots.txt exists, it takes precedence — use one or the other.

noai via meta export in root.tsx

Easy

All deployment targets

app/root.tsx → meta export

Export a meta function from app/root.tsx returning an array that includes { name: 'robots', content: 'noai, noimageai' }. Since root.tsx wraps every route, this applies globally.

In Remix v2 the deepest route's meta export replaces parent meta outright — a child route that exports its own meta must merge root tags back in via the matches argument. React Router v7 uses the same pattern.

X-Robots-Tag via headers() export

Easy

SSR deployments

app/root.tsx → headers export

Export a headers function from root.tsx that sets X-Robots-Tag: noai, noimageai on every response. More authoritative than HTML meta — applies at the HTTP layer.

Only works when Remix is running as a server. Not applied to static exports.

Root loader — hard bot blocking

Easy

SSR deployments

app/root.tsx → loader

Check the User-Agent in app/root.tsx's loader and throw a 403 Response for matched bots. Since root loader runs before every render, blocked bots receive no page HTML.

Use throw new Response(...) not return — throwing a Response in Remix bypasses rendering entirely.

Cloudflare Worker entry — edge blocking

Intermediate

Cloudflare Workers only

app/entry.worker.ts (or workers/app.ts)

Check the User-Agent in the Cloudflare Worker entry point before calling the Remix handler. Bots are blocked before any Remix code runs — the most efficient method for CF Workers deployments.

Alternatively use Cloudflare WAF custom rules — zero code changes, evaluated before Worker execution.

Method 1: public/robots.txt

Remix (with Vite) copies everything in the public/ directory to the build output root. A file at public/robots.txt is served as /robots.txt directly by the hosting platform — before Remix processes any request. Plain text, no loader, no imports.

User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /

Method 2: robots[.]txt Resource Route

Remix maps filenames to URLs using a convention where dots become path separators. To create a route at /robots.txt (with a literal dot), you must use bracket escaping: app/routes/robots[.]txt.tsx. Without brackets, robots.txt.tsx would map to /robots/txt.

A resource route exports only a loader (no default component export). The loader returns a Response directly:

// app/routes/robots[.]txt.tsx
import type { LoaderFunctionArgs } from '@remix-run/node';

const AI_BOTS = [
  'GPTBot', 'ChatGPT-User', 'OAI-SearchBot',
  'ClaudeBot', 'anthropic-ai', 'Google-Extended',
  'Bytespider', 'CCBot', 'PerplexityBot',
  'meta-externalagent', 'Amazonbot', 'Applebot-Extended',
  'xAI-Bot', 'DeepSeekBot', 'MistralBot', 'Diffbot',
  'cohere-ai', 'AI2Bot', 'Ai2Bot-Dolma', 'YouBot',
  'DuckAssistBot', 'omgili', 'omgilibot',
  'webzio-extended', 'gemini-deep-research',
];

export async function loader({ request }: LoaderFunctionArgs) {
  const isProduction = process.env.NODE_ENV === 'production';

  const lines: string[] = [];

  if (!isProduction) {
    // Block all crawlers on staging/preview.
    // Emit only this group — a second `User-agent: *` group with
    // `Allow: /` would conflict and crawlers may pick the permissive rule.
    lines.push('User-agent: *', 'Disallow: /', '');
  } else {
    lines.push('User-agent: *', 'Allow: /', '');
    for (const bot of AI_BOTS) {
      lines.push(`User-agent: ${bot}`, 'Disallow: /', '');
    }
  }

  lines.push('Sitemap: https://yourdomain.com/sitemap.xml');

  return new Response(lines.join('\n'), {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Cache-Control': 'public, max-age=86400',
    },
  });
}
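Because the output is plain string-building, the generation logic can be factored into a pure helper and unit-tested without constructing a Request or running a server. A sketch — buildRobotsTxt is a hypothetical helper name, not part of the route above:

```typescript
// Hypothetical pure helper: builds the robots.txt body so the
// environment-dependent logic can be tested directly.
function buildRobotsTxt(isProduction: boolean, bots: string[]): string {
  const lines: string[] = [];
  if (!isProduction) {
    // Staging/preview: one group that blocks everything.
    lines.push('User-agent: *', 'Disallow: /', '');
  } else {
    lines.push('User-agent: *', 'Allow: /', '');
    for (const bot of bots) {
      lines.push(`User-agent: ${bot}`, 'Disallow: /', '');
    }
  }
  lines.push('Sitemap: https://yourdomain.com/sitemap.xml');
  return lines.join('\n');
}
```

The resource-route loader then reduces to wrapping this string in a Response with the text/plain headers.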

public/robots.txt takes precedence

If public/robots.txt exists, the hosting platform (Vercel, Cloudflare, Fly.io) serves it before Remix handles the request — the resource route loader never runs. Use one approach only. Remove public/robots.txt if you want the dynamic route to take effect.

Method 3: noai Meta Tag in root.tsx

In Remix v2 (Vite), export a meta function from app/root.tsx. Because root.tsx is the parent of every route, this tag appears on every page that does not export its own meta; in Remix v2 the deepest meta export wins outright, so child routes with their own meta need to merge the root tags back in. React Router v7 uses the same pattern.

// app/root.tsx
import type { MetaFunction } from '@remix-run/node';
import { Links, Meta, Outlet, Scripts } from '@remix-run/react';

export const meta: MetaFunction = () => [
  { charSet: 'utf-8' },
  { name: 'viewport', content: 'width=device-width, initial-scale=1' },
  // Block AI training crawlers on every page
  { name: 'robots', content: 'noai, noimageai' },
];

export default function App() {
  return (
    <html lang="en">
      <head>
        <Meta />
        <Links />
      </head>
      <body>
        <Outlet />
        <Scripts />
      </body>
    </html>
  );
}

Child routes can override the robots directive for specific pages, but note that Remix v2 does not merge meta automatically — the deepest route with a meta export replaces all parent meta. To override one tag while keeping the rest, merge the parent meta via the matches argument:

// app/routes/blog.$slug.tsx — allow AI to index blog posts
export const meta: MetaFunction = ({ data, matches }) => [
  // Carry over parent meta, dropping the inherited robots directive
  ...matches
    .flatMap((match) => match.meta ?? [])
    .filter((tag) => !('name' in tag && tag.name === 'robots')),
  { title: data?.title },
  // Replace root noai for this route only
  { name: 'robots', content: 'index, follow' },
];

Method 4: X-Robots-Tag via headers() Export

The headers export in Remix lets you set HTTP response headers on a route. Exporting it from app/root.tsx sets the header on every page response:

// app/root.tsx
import type { HeadersFunction } from '@remix-run/node';

export const headers: HeadersFunction = () => ({
  'X-Robots-Tag': 'noai, noimageai',
});
SSR only

The headers() export runs on the Remix server. For static exports or pre-rendered pages, the header is not set — use the meta tag approach for static deployments, or set headers via the hosting platform (Vercel vercel.json, Netlify _headers, Cloudflare Pages _headers).
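For example, on Netlify or Cloudflare Pages the platform reads a _headers file from the publish directory. A sketch — adjust the path pattern to your site:

```
/*
  X-Robots-Tag: noai, noimageai
```

On Vercel, the equivalent is a headers entry in vercel.json matching a `/(.*)` source pattern.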

Method 5: Root Loader — Hard Blocking

In Remix, throwing a Response from a loader stops rendering and returns that response directly. Adding bot detection to the root loader means matched AI bots receive a 403 before any route component renders — no page HTML, no data, nothing useful for training.

// app/root.tsx
import type { LoaderFunctionArgs } from '@remix-run/node';

const BLOCKED_UAS = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|webzio-extended|gemini-deep-research/i;

export async function loader({ request }: LoaderFunctionArgs) {
  const ua = request.headers.get('User-Agent') ?? '';

  if (BLOCKED_UAS.test(ua)) {
    // throw — not return — to bypass route rendering
    throw new Response('Forbidden', { status: 403 });
  }

  return null; // or return your normal root loader data
}

throw, not return

In Remix, return new Response(..., {status: 403}) from a loader is treated as loader data and passed to the component — the page still renders. You must use throw new Response(...) to short-circuit rendering and return the 403 directly to the client.
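The mechanics can be illustrated outside Remix. In this sketch, runLoader is a stand-in for the framework's loader-calling code, not a Remix API:

```typescript
// Stand-in for the framework (not a Remix API): a thrown Response is
// caught and sent to the client as-is, which is why `throw` short-circuits
// rendering while `return` becomes loader data for the component.
async function runLoader(loader: () => Promise<unknown>): Promise<Response> {
  try {
    const data = await loader();
    // Normal path: the framework would hand `data` to the route component.
    return new Response(JSON.stringify(data), { status: 200 });
  } catch (thrown) {
    if (thrown instanceof Response) {
      return thrown; // short-circuit: the 403 goes straight to the client
    }
    throw thrown; // real errors still propagate
  }
}
```

Here `runLoader(async () => { throw new Response('Forbidden', { status: 403 }); })` resolves to the 403 Response directly, with no component render in between.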

Method 6: Cloudflare Workers Entry (Edge Blocking)

For Remix apps deployed on Cloudflare Workers, you can intercept requests in the Worker entry point before they reach the Remix request handler. This is the most efficient method — zero Remix overhead for blocked bots.

With the Remix Vite + Cloudflare adapter, the worker entry is typically at workers/app.ts:

// workers/app.ts (Cloudflare Workers entry)
import { createRequestHandler } from '@remix-run/cloudflare';
import * as build from '../build/server';

const BLOCKED_UAS = /GPTBot|ClaudeBot|anthropic-ai|CCBot|Bytespider|Google-Extended|PerplexityBot|Diffbot|DeepSeekBot|MistralBot|cohere-ai|meta-externalagent|Amazonbot|xAI-Bot|AI2Bot|omgili|webzio-extended|gemini-deep-research/i;

const handleRequest = createRequestHandler(build);

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext) {
    const ua = request.headers.get('User-Agent') ?? '';

    if (BLOCKED_UAS.test(ua)) {
      return new Response('Forbidden', { status: 403 });
    }

    return handleRequest(request, { env, ctx });
  },
} satisfies ExportedHandler<Env>;

Alternatively, skip the code entirely and use a Cloudflare WAF custom rule — WAF rules are evaluated before your Worker runs. Go to Security → WAF → Custom rules and add a rule matching AI bot user agents with action Block.
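A sketch of such a rule expression in Cloudflare's rules language — extend the user-agent list as needed:

```
(http.user_agent contains "GPTBot") or
(http.user_agent contains "ClaudeBot") or
(http.user_agent contains "CCBot") or
(http.user_agent contains "Bytespider") or
(http.user_agent contains "PerplexityBot")
```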

React Router v7 (formerly Remix v3)

Remix merged into React Router in 2024. React Router v7 uses the same file conventions and loader patterns. The biggest change for bot blocking:

  • Official middleware — React Router v7 supports unstable_middleware export in route modules. You can create a root middleware that blocks bots without modifying the loader.
  • All other patterns (resource routes, meta export, headers export) remain identical.
  • Import paths change from @remix-run/node to react-router.

AI Bots to Block

25 user agents covering AI training crawlers and AI search bots. The robots.txt and loader patterns above include all of them.

GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, anthropic-ai, Google-Extended, Bytespider, CCBot, PerplexityBot, meta-externalagent, Amazonbot, Applebot-Extended, xAI-Bot, DeepSeekBot, MistralBot, Diffbot, cohere-ai, AI2Bot, Ai2Bot-Dolma, YouBot, DuckAssistBot, omgili, omgilibot, webzio-extended, gemini-deep-research
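To keep the robots.txt list and the blocking regexes from drifting apart, the matcher can be derived from one array. A sketch — the escaping step guards against regex metacharacters in bot names:

```typescript
// Single source of truth: derive the User-Agent matcher from the same
// array that drives robots.txt generation (abbreviated list shown).
const AI_BOTS = [
  'GPTBot', 'ChatGPT-User', 'ClaudeBot', 'CCBot', 'PerplexityBot',
  // ...the rest of the 25 user agents listed above
];

// Escape regex metacharacters so each name is matched literally.
const escaped = AI_BOTS.map((bot) => bot.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
const BLOCKED_UAS = new RegExp(escaped.join('|'), 'i');
```

BLOCKED_UAS can then be shared by the root loader and the worker entry instead of maintaining a hand-written regex in each.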

Frequently Asked Questions

How do I create a robots.txt route in Remix?

Create the file app/routes/robots[.]txt.tsx (note the brackets around the dot — this is Remix's escape syntax for literal dots in URLs). Export a loader function that returns new Response(content, { headers: { 'Content-Type': 'text/plain' } }). Remix will serve this at /robots.txt. The brackets prevent Remix from treating the dot as a path separator (which would route to /robots/txt instead of /robots.txt). For a simpler alternative with no routing logic, place a plain robots.txt file in public/ — Remix's Vite setup serves the public/ directory as static assets.

How do I add noai meta tags globally in Remix?

In Remix v2 with Vite, export a meta function from app/root.tsx that returns an array including { name: 'robots', content: 'noai, noimageai' }. This runs on every page since root.tsx is the root layout for all routes. In React Router v7, the pattern is identical — the meta export on the root route applies globally. Child routes can replace the directive with { name: 'robots', content: 'index, follow' } in their own meta export, but because the deepest meta export wins wholesale in Remix v2, merge the parent meta via the matches argument to keep the other root tags.

Can I block AI bots in a Remix loader function?

Yes. In the root loader in app/root.tsx, check the User-Agent request header and throw a Response with status 403 for matched bots. Since the root loader runs before every page render, no page HTML is generated for blocked bots. Use throw new Response('Forbidden', { status: 403 }) inside the loader — throwing a Response in Remix causes that response to be returned directly without rendering the route component.

Does Remix have middleware for bot blocking?

Remix v2 doesn't have Express-style middleware built in. The closest equivalent is the root loader approach — check the user agent in app/root.tsx's loader and throw a 403 Response before rendering. React Router v7 (the successor to Remix) introduced official middleware support via the unstable_middleware export. For Remix apps deployed on Cloudflare Workers, you can add bot blocking in the Cloudflare Worker script itself before the Remix handler is called — this is the most efficient approach.

How do I block AI bots on a Remix app deployed to Cloudflare Workers?

Cloudflare Workers deployments have a worker entry point (usually app/entry.worker.ts or workers/app.ts for Remix with Vite + Cloudflare adapter). You can intercept requests there before they reach the Remix handler: check the User-Agent header and return a new Response('Forbidden', { status: 403 }) for matched bots. Alternatively, use Cloudflare WAF custom rules — these are evaluated before your Worker executes and require zero code changes to your Remix app.

What is the difference between public/robots.txt and the robots[.]txt resource route in Remix?

public/robots.txt is served as a static file — it's copied verbatim to the build output and served by the hosting platform before Remix processes the request. This is the simplest approach and works on all Remix deployment targets. The robots[.]txt resource route is a Remix loader that generates robots.txt dynamically — it runs as part of your server and lets you vary the output by environment (block all crawlers on staging), read from a config file, or add logic. The trade-off: the resource route requires a running Remix server and has slightly higher latency. Use the static file unless you need dynamic generation.
