How to Block AI Bots on Hono: Complete 2026 Guide
Hono is a lightweight, runtime-agnostic web framework — the same application code runs on Cloudflare Workers, Bun, Deno, and Node.js. This guide covers every approach: serving robots.txt with serveStatic, hard-blocking bots with app.use() middleware, adding X-Robots-Tag response headers, embedding noai meta tags via Hono JSX, and runtime-specific deployment notes.
Hono v4
All examples target Hono v4 with TypeScript. The middleware API is stable across v3 and v4. Runtime-specific adapters (@hono/node-server, hono/cloudflare-workers) are shown where the import path differs.
Methods at a glance
| Method | What it does | Enforcement |
|---|---|---|
| serveStatic → robots.txt | Signals crawlers to stay out | Signal only |
| GET /robots.txt handler | Dynamic robots.txt via string constant | Signal only |
| noai meta in JSX template | Opt out of AI training per page | Signal (server-rendered) |
| X-Robots-Tag middleware | noai on all HTTP responses | Signal (header) |
| app.use("*") UA middleware | Hard 403 globally — before any route | Hard block |
| nginx map block | Hard 403 at reverse proxy layer | Hard block |
1. robots.txt
Hono does not auto-serve a public/ directory. Use serveStatic from the runtime adapter, or handle the route explicitly. Register robots.txt before bot-blocking middleware so crawlers can always read your disallow rules.
public/robots.txt
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: *
Allow: /
Node.js — serveStatic
import { Hono } from 'hono';
import { serveStatic } from '@hono/node-server/serve-static';
const app = new Hono();
// Serve robots.txt BEFORE bot-blocking middleware
app.use('/robots.txt', serveStatic({ path: './public/robots.txt' }));
// ... bot-blocking middleware and routes below
Cloudflare Workers — string constant
Workers isolates have no file system. Embed robots.txt as a string constant and handle the route explicitly.
import { Hono } from 'hono';
const ROBOTS = `User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: *
Allow: /`;
const app = new Hono();
// Serve robots.txt BEFORE bot-blocking middleware
app.get('/robots.txt', (c) => c.text(ROBOTS));
export default app;
2. Hard blocking with app.use()
app.use('*', handler) registers global middleware that runs before any route handler. Compile the User-Agent regex at module scope — not inside the handler — so the regex is built once at startup rather than on every request.
import { Hono } from 'hono';
// Compile once at module scope
const AI_BOT_UA = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended|DuckAssistBot|cohere-ai|Meta-ExternalAgent|Diffbot|YouBot|Amazonbot|AI2Bot|Timpibot|PetalBot|Kangaroo Bot/i;
const app = new Hono();
// 1. robots.txt — always accessible (before blocking middleware)
app.get('/robots.txt', (c) => c.text(ROBOTS));
// 2. Bot-blocking middleware — applies to all routes
app.use('*', async (c, next) => {
const ua = c.req.header('user-agent') ?? '';
if (AI_BOT_UA.test(ua)) {
return c.text('Forbidden', 403);
}
return next();
});
// Routes defined after middleware are protected
app.get('/', (c) => c.text('Hello, human!'));
app.get('/api/data', (c) => c.json({ ok: true }));
export default app;
Registration order matters
Hono processes middleware in the order it is registered. app.use('*') registered before any app.get() intercepts every request. If you register the middleware after a route, that route is not protected.
3. X-Robots-Tag response header
X-Robots-Tag: noai, noimageai asks AI training crawlers that honour the directive to skip the page even if they bypassed robots.txt. Add it as a separate middleware that calls await next() first, then sets the header on the outgoing response.
app.use('*', async (c, next) => {
await next();
// c.res is the Response — set headers after the handler runs
c.res.headers.set('X-Robots-Tag', 'noai, noimageai');
});
Register this middleware after the bot-blocking middleware so blocked requests never reach it.
4. noai meta tag with Hono JSX
Hono has a built-in JSX renderer. Add <meta name="robots" content="noai, noimageai" /> in your HTML <head> to opt pages out of AI training. The /** @jsxImportSource hono/jsx */ comment sets the JSX factory without a tsconfig change.
/** @jsxImportSource hono/jsx */
import { Hono } from 'hono';
const app = new Hono();
app.get('/', (c) => {
return c.html(
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
{/* Opt out of AI training crawlers */}
<meta name="robots" content="noai, noimageai" />
<title>My Hono App</title>
</head>
<body>
<h1>Hello, human!</h1>
</body>
</html>
);
});
export default app;
The noai directive is recognised by Common Crawl's CCBot and a growing list of AI training crawlers. It is separate from noindex — it does not affect search engine indexing.
5. Full example
Combining all layers: robots.txt route first, bot-blocking middleware second, X-Robots-Tag middleware third, then routes.
import { Hono } from 'hono';
// ─── constants ───────────────────────────────────────────────────────────────
const ROBOTS = `User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: *
Allow: /`;
// Compiled once at module scope — not per-request
const AI_BOT_UA = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended|DuckAssistBot|cohere-ai|Meta-ExternalAgent|Diffbot|YouBot|Amazonbot|AI2Bot/i;
// ─── app ─────────────────────────────────────────────────────────────────────
const app = new Hono();
// 1. robots.txt — before blocking so bots can read it
app.get('/robots.txt', (c) => c.text(ROBOTS));
// 2. Hard block known AI bots
app.use('*', async (c, next) => {
const ua = c.req.header('user-agent') ?? '';
if (AI_BOT_UA.test(ua)) {
return c.text('Forbidden', 403);
}
return next();
});
// 3. X-Robots-Tag on every response
app.use('*', async (c, next) => {
await next();
c.res.headers.set('X-Robots-Tag', 'noai, noimageai');
});
// 4. Routes
app.get('/', (c) => c.text('Hello, human!'));
export default app;
6. Runtime entrypoints
The middleware above is identical across all runtimes. Only the entrypoint changes.
Cloudflare Workers
// index.ts — workers entrypoint
export default app;
// wrangler.toml
// name = "my-worker"
// main = "src/index.ts"
// compatibility_date = "2024-01-01"
Node.js
import { serve } from '@hono/node-server';
import { serveStatic } from '@hono/node-server/serve-static';
// Override robots.txt for Node — use file system
app.use('/robots.txt', serveStatic({ path: './public/robots.txt' }));
serve({ fetch: app.fetch, port: 3000 });
// npm install @hono/node-server
Bun
// index.ts
export default {
port: 3000,
fetch: app.fetch,
};
// Run: bun run index.ts
Deno
// index.ts
Deno.serve({ port: 3000 }, app.fetch);
// Run: deno run --allow-net --allow-read index.ts
7. nginx — block at the proxy layer
For Node.js / Bun deployments behind nginx, block AI bots at the proxy before requests reach Hono. This is the most efficient approach — blocked bots never consume application resources.
# /etc/nginx/conf.d/hono-app.conf
map $http_user_agent $blocked_bot {
default 0;
"~*GPTBot" 1;
"~*ChatGPT-User" 1;
"~*OAI-SearchBot" 1;
"~*ClaudeBot" 1;
"~*anthropic-ai" 1;
"~*Google-Extended" 1;
"~*Bytespider" 1;
"~*CCBot" 1;
"~*PerplexityBot" 1;
"~*Applebot-Extended" 1;
}
server {
listen 80;
server_name example.com;
# Always allow robots.txt — bots can read crawl directives
location = /robots.txt {
proxy_pass http://127.0.0.1:3000;
}
location / {
if ($blocked_bot) {
return 403;
}
proxy_pass http://127.0.0.1:3000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
FAQ
How do I serve robots.txt in Hono?
Use serveStatic from the appropriate runtime adapter. For Node.js: import { serveStatic } from "@hono/node-server/serve-static". Cloudflare Workers have no file system — handle GET /robots.txt explicitly and return a string constant. Register robots.txt before bot-blocking middleware so crawlers can always read your directives.
Where do I register bot-blocking middleware in Hono?
Use app.use("*", ...) registered before your route definitions. Hono executes middleware in registration order — any app.use() before the first app.get()/app.post() intercepts all routes. Compile the User-Agent regex at module scope (not inside the handler) so it is built once and reused across requests.
Does Hono middleware work the same way on all runtimes?
Yes. Hono's middleware API is runtime-agnostic. The same bot-blocking middleware runs unchanged on Cloudflare Workers, Bun, Deno, Node.js, and AWS Lambda. Only the entrypoint differs: Workers uses export default app, Node.js wraps it with serve(app), Bun exports { fetch: app.fetch }, and Deno uses Deno.serve(app.fetch).
How do I add X-Robots-Tag headers in Hono?
Add a middleware that calls await next() first, then sets the header: app.use("*", async (c, next) => { await next(); c.res.headers.set("X-Robots-Tag", "noai, noimageai"); }). Register this middleware before your route handlers. X-Robots-Tag tells AI training crawlers to ignore the page even if they reached it.
Can I use Hono JSX to add noai meta tags?
Yes. Set /** @jsxImportSource hono/jsx */ at the top of your file, then return c.html(<html><head><meta name="robots" content="noai, noimageai" /></head>...</html>) from your route handler. The noai directive tells AI training crawlers to skip the page content even if they bypass robots.txt.
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.