How to Block AI Bots on Svelte: Complete 2026 Guide
Svelte compiled with Vite produces a pure SPA: a static index.html shell and JavaScript bundles. There is no Node server, no request handler, no middleware. This guide covers every blocking method that works within that constraint — from public/robots.txt to Cloudflare Pages _worker.js to nginx hard blocking.
Svelte SPA vs SvelteKit
This guide covers Svelte + Vite (plain SPA — no server, no SSR). If you are using SvelteKit, see the SvelteKit guide — it has server-side rendering, hooks.server.ts middleware, and much more powerful blocking options.
Methods at a glance
| Method | What it does | Blocks JS-less bots? |
|---|---|---|
| public/robots.txt | Signals crawlers to stay out | Signal only |
| index.html noai tag | Opt out of AI training (all crawlers) | ✓ (pre-JS HTML) |
| <svelte:head> noai | Per-component noai tag | ✗ (JS-only) |
| X-Robots-Tag header | noai via HTTP header (all pages) | ✓ (header) |
| nginx map + return 403 | Hard block at reverse proxy | ✓ |
| Cloudflare Pages _worker.js | Hard block at edge (CF Pages only) | ✓ |
| Vercel Edge Middleware | Hard block at edge (Vercel only) | ✓ |
| Netlify Edge Functions | Hard block at edge (Netlify only) | ✓ |
1. robots.txt — public/robots.txt
Vite copies everything in public/ verbatim to the build output at the root level. A public/robots.txt file becomes dist/robots.txt and is served at /robots.txt with no extra config.
# public/robots.txt
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: meta-externalagent
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: cohere-ai
Disallow: /
User-agent: YouBot
Disallow: /
User-agent: DuckAssistBot
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: omgilibot
Disallow: /
User-agent: omgili
Disallow: /
User-agent: Webzio
Disallow: /
User-agent: AI2Bot
Disallow: /
User-agent: DeepSeekBot
Disallow: /
User-agent: MistralAI
Disallow: /
User-agent: xAI-Bot
Disallow: /
User-agent: gemini-deep-research
Disallow: /
User-agent: *
Allow: /
Vite public/ vs src/assets/
Only files in public/ are copied verbatim. Files in src/assets/ go through the Vite build pipeline and get content-hashed filenames — robots.txt must be in public/.
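To spot-check the file before deploying, the Disallow logic can be replicated in a few lines of Node. This is a rough sketch, not a full RFC 9309 parser: it only understands the plain User-agent / Disallow / Allow groups used above, with substring matching on the agent token.

```javascript
// Rough robots.txt check: is `userAgent` disallowed for `path`?
// Sketch only: handles plain User-agent / Disallow / Allow groups,
// not the full RFC 9309 matching rules.
function isDisallowed(robotsTxt, userAgent, path = '/') {
  const groups = [];
  let current = null;
  for (const raw of robotsTxt.split('\n')) {
    const line = raw.split('#')[0].trim();
    if (!line) continue;
    const colon = line.indexOf(':');
    if (colon === -1) continue;
    const key = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (key === 'user-agent') {
      // Consecutive User-agent lines share one group
      if (!current || current.rules.length > 0) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
    } else if ((key === 'disallow' || key === 'allow') && current) {
      current.rules.push({ allow: key === 'allow', prefix: value });
    }
  }
  const ua = userAgent.toLowerCase();
  // Prefer a group naming this agent; fall back to the `*` group
  const group =
    groups.find(g => g.agents.some(a => a !== '*' && ua.includes(a))) ||
    groups.find(g => g.agents.includes('*'));
  if (!group) return false;
  for (const rule of group.rules) {
    if (rule.prefix && path.startsWith(rule.prefix)) return !rule.allow;
  }
  return false;
}

const robots = 'User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /';
console.log(isDisallowed(robots, 'GPTBot'));    // true  (blocked)
console.log(isDisallowed(robots, 'Googlebot')); // false (allowed)
```

Feed it the deployed file (fetch https://example.com/robots.txt) and each bot name from your list to confirm nothing was lost in the build.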
2. noai meta tag in index.html
In a Vite project, index.html lives at the project root and is the entry point Vite processes. Unlike the <svelte:head> block in your components (which requires JavaScript to execute), tags you add directly to index.html are in the raw HTML response — crawlers see them immediately, before any JavaScript runs.
This is the most reliable way to deliver noai to non-JS crawlers from a Svelte SPA.
<!-- index.html (project root — Vite entry point) -->
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<!-- AI training opt-out — visible before JavaScript runs -->
<meta name="robots" content="noai, noimageai" />
<title>My App</title>
</head>
<body>
<div id="app"></div>
<script type="module" src="/src/main.ts"></script>
</body>
</html>
Why index.html beats <svelte:head> for global noai
Svelte SPA is client-rendered: the server sends a nearly-empty HTML shell and Svelte injects content after JavaScript loads. Any <svelte:head> tags — including a noai meta — are injected by JS at runtime. A bot that doesn't execute JavaScript sees <div id="app"></div> and nothing else. Adding noai directly to index.html ensures every HTML response carries it, regardless of JavaScript.
Per-page noai is not possible in a plain SPA
A Svelte SPA has a single index.html served for every route via the HTML5 History API fallback. You cannot serve a different <head> per route without a server layer. If per-route control matters, use SvelteKit.
3. <svelte:head> meta tags (JS-only)
<svelte:head> lets any Svelte component inject into <head>. In SPA mode this runs after JavaScript loads — useful for dynamic title/description updates, but invisible to non-JS crawlers. Use it for secondary coverage, not as your primary noai mechanism.
<!-- App.svelte — global noai via svelte:head (JS-only fallback) -->
<script lang="ts">
// Component logic
</script>
<svelte:head>
<meta name="robots" content="noai, noimageai" />
</svelte:head>
<main>
<!-- your app content -->
</main>
<!-- Per-route component — override if a specific page allows AI -->
<svelte:head>
<meta name="robots" content="index, follow" />
</svelte:head>
Caveat: SPA vs SSR behaviour
In SvelteKit with SSR enabled, <svelte:head> tags are rendered server-side and appear in the HTML source — bots see them. In a Svelte SPA (Vite, no SSR), they are injected at runtime by JavaScript — bots that don't run JS never see them. This is the fundamental difference between the two setups.
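To confirm the difference empirically, check the raw HTML a non-JS crawler actually receives. The helper below is a sketch that uses a regex rather than a real HTML parser; it answers one question: does the pre-JavaScript response already carry a noai robots meta?

```javascript
// Does the raw (pre-JavaScript) HTML already carry a noai robots meta?
// Regex-based sketch; a real audit should use an HTML parser.
function rawHtmlHasNoai(html) {
  const meta = html.match(/<meta[^>]*name=["']robots["'][^>]*>/i);
  return meta !== null && /noai/i.test(meta[0]);
}

// Shell with the tag baked into index.html
const taggedShell = '<!doctype html><html><head>' +
  '<meta name="robots" content="noai, noimageai" />' +
  '</head><body><div id="app"></div></body></html>';

// Shell that relies on <svelte:head>: the tag never reaches the raw HTML
const bareShell = '<!doctype html><html><head><title>App</title></head>' +
  '<body><div id="app"></div></body></html>';

console.log(rawHtmlHasNoai(taggedShell)); // true
console.log(rawHtmlHasNoai(bareShell));   // false
```

Run it against `curl -s https://example.com/` output: if it returns false there, no non-JS crawler will ever see your tag, whatever <svelte:head> does in the browser.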
4. X-Robots-Tag response header
The X-Robots-Tag HTTP header works like the <meta name="robots"> tag but is set at the server or CDN layer — no HTML changes required. It applies to every response, including non-HTML files.
Netlify (netlify.toml)
# netlify.toml
[[headers]]
for = "/*"
[headers.values]
X-Robots-Tag = "noai, noimageai"
Vercel (vercel.json)
{
"headers": [
{
"source": "/(.*)",
"headers": [
{ "key": "X-Robots-Tag", "value": "noai, noimageai" }
]
}
]
}
Cloudflare Pages (_headers)
# public/_headers (copied to dist/ by Vite)
/*
X-Robots-Tag: noai, noimageai
nginx
server {
listen 80;
server_name example.com;
root /var/www/html/dist;
add_header X-Robots-Tag "noai, noimageai" always;
location / {
try_files $uri $uri/ /index.html;
}
}
5. nginx hard blocking (return 403)
If you self-host your Svelte build behind nginx, a map block lets you match User-Agent strings and return 403 before any static file is served. Declare the bot list once in the map; nginx evaluates the variable lazily, once per request, at negligible cost.
# nginx.conf (http block)
map $http_user_agent $block_ai_bot {
default 0;
~*GPTBot 1;
~*ChatGPT-User 1;
~*OAI-SearchBot 1;
~*ClaudeBot 1;
~*Claude-Web 1;
~*anthropic-ai 1;
~*Google-Extended 1;
~*Bytespider 1;
~*CCBot 1;
~*meta-externalagent 1;
~*Amazonbot 1;
~*Applebot-Extended 1;
~*PerplexityBot 1;
~*cohere-ai 1;
~*YouBot 1;
~*DuckAssistBot 1;
~*Diffbot 1;
~*omgilibot 1;
~*omgili 1;
~*Webzio 1;
~*AI2Bot 1;
~*DeepSeekBot 1;
~*MistralAI 1;
~*xAI-Bot 1;
~*gemini-deep-research 1;
}
server {
listen 443 ssl;
server_name example.com;
root /var/www/html/dist;
# Always serve robots.txt — blocking it hides your rules from crawlers
location = /robots.txt {
try_files $uri =404;
}
# Block matched bots
location / {
if ($block_ai_bot) {
return 403 "Forbidden";
}
try_files $uri $uri/ /index.html;
}
}
nginx if() inside location
nginx's if directive is discouraged for complex logic inside location blocks, but the official "If is Evil" guidance lists return as one of the directives that are always safe inside if, so the pattern above is production-safe. If you prefer to drop bot requests without sending any response body, use return 444, nginx's special code that closes the connection silently.
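Each ~*pattern in the map is a case-insensitive substring match against the User-Agent header. To sanity-check which real-world UA strings the list catches without reloading nginx, the same matching logic is a few lines of JavaScript (the array mirrors the map entries above):

```javascript
// Case-insensitive substring matching, mirroring nginx's ~*pattern map entries.
const AI_BOTS = [
  'GPTBot', 'ChatGPT-User', 'OAI-SearchBot', 'ClaudeBot', 'Claude-Web',
  'anthropic-ai', 'Google-Extended', 'Bytespider', 'CCBot',
  'meta-externalagent', 'Amazonbot', 'Applebot-Extended', 'PerplexityBot',
  'cohere-ai', 'YouBot', 'DuckAssistBot', 'Diffbot', 'omgilibot', 'omgili',
  'Webzio', 'AI2Bot', 'DeepSeekBot', 'MistralAI', 'xAI-Bot',
  'gemini-deep-research',
];

function isBlockedUa(userAgent) {
  const ua = userAgent.toLowerCase();
  return AI_BOTS.some(bot => ua.includes(bot.toLowerCase()));
}

// Real-world UA strings embed the bot token in a longer string
console.log(isBlockedUa('Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)')); // true
console.log(isBlockedUa('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0'));           // false
```

The same substring semantics explain why ordinary browsers never match: no token in the list appears inside a normal browser UA.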
6. Cloudflare Pages — _worker.js
Cloudflare Pages supports a special _worker.js file that runs as a Cloudflare Worker before static assets are served. This gives you server-like middleware for a fully static Svelte SPA — no Node server required.
Place _worker.js in public/ and Vite will copy it to dist/. Cloudflare Pages picks it up automatically when it finds _worker.js at the root of the output directory.
// public/_worker.js
// Runs at the Cloudflare edge before static assets are served
const BLOCKED_UAS = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|meta-externalagent|Amazonbot|Applebot-Extended|PerplexityBot|cohere-ai|YouBot|DuckAssistBot|Diffbot|omgilibot|omgili|Webzio|AI2Bot|DeepSeekBot|MistralAI|xAI-Bot|gemini-deep-research/i;
export default {
async fetch(request, env) {
const url = new URL(request.url);
// Always serve robots.txt — never block it
if (url.pathname === '/robots.txt') {
return env.ASSETS.fetch(request);
}
const ua = request.headers.get('User-Agent') || '';
if (BLOCKED_UAS.test(ua)) {
return new Response('Forbidden', {
status: 403,
headers: { 'Content-Type': 'text/plain' },
});
}
// Pass through to Cloudflare Pages static asset serving
return env.ASSETS.fetch(request);
},
};
env.ASSETS — Cloudflare Pages asset binding
env.ASSETS is the Cloudflare Pages asset binding that serves your static files. It handles the SPA fallback (serving index.html for unknown paths) automatically. You must call env.ASSETS.fetch(request) — not a plain fetch(request) — to reach static files.
_worker.js vs Functions/ directory
Cloudflare Pages also supports a functions/ directory for route-specific handlers. Use _worker.js when you want a single handler for ALL requests (including the asset fallback). Use functions/ when you need per-route logic alongside static asset serving.
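Because _worker.js is plain JavaScript, the fetch handler can be unit-tested locally by stubbing the ASSETS binding. A sketch (abbreviated bot regex; Node 18+ provides Request/Response/URL as globals, and wrangler pages dev exercises the real binding):

```javascript
// Local sanity test for a _worker.js-style handler, with a stubbed
// env.ASSETS binding (Node 18+: Request/Response/URL are global).
const BLOCKED_UAS = /GPTBot|ClaudeBot|CCBot|Bytespider|PerplexityBot/i; // abbreviated

const worker = {
  async fetch(request, env) {
    const url = new URL(request.url);
    if (url.pathname === '/robots.txt') return env.ASSETS.fetch(request);
    const ua = request.headers.get('User-Agent') || '';
    if (BLOCKED_UAS.test(ua)) return new Response('Forbidden', { status: 403 });
    return env.ASSETS.fetch(request);
  },
};

// Stub: stands in for Cloudflare Pages' static asset serving
const env = { ASSETS: { fetch: () => new Response('static asset', { status: 200 }) } };

const bot = new Request('https://example.com/', { headers: { 'User-Agent': 'GPTBot/1.2' } });
const human = new Request('https://example.com/', { headers: { 'User-Agent': 'Mozilla/5.0' } });
const robotsTxt = new Request('https://example.com/robots.txt', { headers: { 'User-Agent': 'GPTBot/1.2' } });

Promise.all([worker.fetch(bot, env), worker.fetch(human, env), worker.fetch(robotsTxt, env)])
  .then(responses => console.log(responses.map(r => r.status))); // [ 403, 200, 200 ]
```

The three cases cover the contract that matters: bots get 403, humans fall through to assets, and robots.txt stays reachable even for blocked bots.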
7. Vercel — Edge Middleware
For Svelte SPAs deployed to Vercel (not SvelteKit), create a middleware.js (or middleware.ts) at the project root. Vercel runs it at the edge before serving static files.
// middleware.js (project root — NOT in src/ or public/)
// Vercel deploys this as Edge Middleware automatically
import { next } from '@vercel/edge';
const BLOCKED_UAS = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|meta-externalagent|Amazonbot|Applebot-Extended|PerplexityBot|cohere-ai|YouBot|DuckAssistBot|Diffbot|omgilibot|omgili|Webzio|AI2Bot|DeepSeekBot|MistralAI|xAI-Bot|gemini-deep-research/i;
export default function middleware(request) {
  const { pathname } = new URL(request.url);
  // Always serve robots.txt
  if (pathname === '/robots.txt') {
    return next();
  }
  const ua = request.headers.get('user-agent') || '';
  if (BLOCKED_UAS.test(ua)) {
    return new Response('Forbidden', { status: 403 });
  }
  // Fall through to Vercel's static asset serving
  return next();
}
Vite + Vercel note
Outside Next.js, Edge Middleware receives a standard Request and may return a standard Response; APIs like request.nextUrl and next/server are Next.js-specific and not available here. The next() helper used above comes from the @vercel/edge package (npm i @vercel/edge). With no exported config.matcher, the middleware runs on every request, which is why /robots.txt is exempted explicitly.
8. Netlify — Edge Functions
Netlify's _headers file only supports header injection — it cannot return a 403 based on User-Agent. For hard blocking on Netlify, use an Edge Function.
// netlify/edge-functions/block-ai-bots.js
const BLOCKED_UAS = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|meta-externalagent|Amazonbot|Applebot-Extended|PerplexityBot|cohere-ai|YouBot|DuckAssistBot|Diffbot|omgilibot|omgili|Webzio|AI2Bot|DeepSeekBot|MistralAI|xAI-Bot|gemini-deep-research/i;
export default async (request, context) => {
const url = new URL(request.url);
if (url.pathname === '/robots.txt') {
return context.next();
}
const ua = request.headers.get('user-agent') || '';
if (BLOCKED_UAS.test(ua)) {
return new Response('Forbidden', { status: 403 });
}
return context.next();
};
export const config = { path: '/*' };
# netlify.toml — alternative wiring if you omit the inline config.path export above
[[edge_functions]]
function = "block-ai-bots"
path = "/*"
Svelte SPA vs SvelteKit — blocking comparison
| Feature | Svelte SPA (Vite) | SvelteKit |
|---|---|---|
| robots.txt | public/robots.txt | static/robots.txt |
| Global noai tag | index.html (root) — pre-JS | app.html (src/) — pre-JS |
| Per-route noai | Not possible (single index.html) | <svelte:head> — SSR-rendered |
| svelte:head bots see? | ✗ (JS-only) | ✓ (SSR-rendered in HTML) |
| Hard 403 server-side | No (no server layer) | hooks.server.ts handle() |
| Hard 403 edge/CDN | Cloudflare _worker.js / Vercel Edge / Netlify Edge Fn | Adapter + hooks OR above |
| X-Robots-Tag | netlify.toml / vercel.json / nginx / _headers | hooks.server.ts or same hosting config |
| Middleware concept | None — pure static output | hooks.server.ts handle() |
Key takeaway: SvelteKit renders <svelte:head> server-side so bots see it in HTML; Svelte SPA does not. For per-route noai or server-side hard blocking, migrate to SvelteKit.
Hosting comparison
| Host | robots.txt | X-Robots-Tag | Hard 403 |
|---|---|---|---|
| Cloudflare Pages | public/robots.txt ✓ | _headers ✓ | _worker.js ✓ |
| Vercel | public/robots.txt ✓ | vercel.json ✓ | middleware.js ✓ |
| Netlify | public/robots.txt ✓ | netlify.toml ✓ | Edge Function ✓ |
| GitHub Pages | public/robots.txt ✓ | ✗ (no header config) | Cloudflare WAF proxy ✓ |
| AWS S3 + CloudFront | public/robots.txt ✓ | CF Response Headers ✓ | CF Function ✓ |
| nginx (self-hosted) | public/robots.txt ✓ | add_header ✓ | map + return 403 ✓ |
FAQ
Does robots.txt block AI bots on a Svelte SPA?
robots.txt signals which crawlers to disallow, but most AI training bots ignore it. For guaranteed blocking you need hard server-level enforcement via nginx, a Cloudflare Pages Worker, or a Vercel Edge Function.
Does <svelte:head> work for blocking AI bots?
In a Svelte SPA, <svelte:head> tags are injected by JavaScript at runtime — non-JS crawlers never see them. Use index.html (Vite project root) instead: the noai meta tag there is in every HTML response before JavaScript runs.
What is the difference between Svelte and SvelteKit for AI bot blocking?
SvelteKit has server-side rendering and a hooks.server.ts handle() hook that can return a 403 before any content is sent. Svelte SPA (Vite) has no server layer — hard blocking requires nginx, a Cloudflare Pages _worker.js, or a Vercel Edge Function. SvelteKit <svelte:head> is rendered server-side so bots see it; Svelte SPA <svelte:head> is client-side only.
What is the best way to block AI bots on Cloudflare Pages with Svelte?
Add a _worker.js file to your public/ directory. Cloudflare Pages runs it as an edge Worker before serving static assets. Check the User-Agent header in the fetch handler and return a 403 Response for matched bots — exempt /robots.txt so it stays accessible.
Can I use a Service Worker to block AI bots in a Svelte SPA?
No. Service Workers only intercept requests made by browsers that have already loaded the Service Worker registration script. A bot fetching your page for the first time will not have a Service Worker installed and will bypass it entirely.
Do I need to change anything in vite.config.ts to serve robots.txt?
No. Vite copies public/ to dist/ verbatim as part of every build. No plugin or config change is required for robots.txt to be served at /robots.txt after deploy.
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.