Hono · TypeScript · Multi-runtime · 9 min read

How to Block AI Bots on Hono: Complete 2026 Guide

Hono is a lightweight, runtime-agnostic web framework — the same application code runs on Cloudflare Workers, Bun, Deno, and Node.js. This guide covers every approach: serving robots.txt with serveStatic, hard-blocking bots with app.use() middleware, adding X-Robots-Tag response headers, embedding noai meta tags via Hono JSX, and runtime-specific deployment notes.

Hono v4

All examples target Hono v4 with TypeScript. The middleware API is stable across v3 and v4. Runtime-specific adapters (@hono/node-server, hono/cloudflare-workers) are shown where the import path differs.

Methods at a glance

| Method | What it does | Blocks JS-less bots? |
| --- | --- | --- |
| serveStatic → robots.txt | Signals crawlers to stay out | Signal only |
| GET /robots.txt handler | Dynamic robots.txt via string constant | Signal only |
| noai meta in JSX template | Opt out of AI training per page | ✓ (server-rendered) |
| X-Robots-Tag middleware | noai on all HTTP responses | ✓ (header) |
| app.use("*") UA middleware | Hard 403 globally — before any route | ✓ |
| nginx map block | Hard 403 at reverse proxy layer | ✓ |

1. robots.txt

Hono does not auto-serve a public/ directory. Use serveStatic from the runtime adapter, or handle the route explicitly. Register robots.txt before bot-blocking middleware so crawlers can always read your disallow rules.

public/robots.txt

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /

Node.js — serveStatic

import { Hono } from 'hono';
import { serveStatic } from '@hono/node-server/serve-static';

const app = new Hono();

// Serve robots.txt BEFORE bot-blocking middleware
app.use('/robots.txt', serveStatic({ path: './public/robots.txt' }));

// ... bot-blocking middleware and routes below

Cloudflare Workers — string constant

Workers isolates have no file system. Embed robots.txt as a string constant and handle the route explicitly.

import { Hono } from 'hono';

const ROBOTS = `User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: *
Allow: /`;

const app = new Hono();

// Serve robots.txt BEFORE bot-blocking middleware
app.get('/robots.txt', (c) => c.text(ROBOTS));

export default app;

2. Hard blocking with app.use()

app.use('*', handler) registers global middleware that runs before any route handler. Compile the User-Agent regex at module scope — not inside the handler — so the regex is built once at startup rather than on every request.

import { Hono } from 'hono';

// Compile once at module scope
const AI_BOT_UA = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended|DuckAssistBot|cohere-ai|Meta-ExternalAgent|Diffbot|YouBot|Amazonbot|AI2Bot|Timpibot|PetalBot|Kangaroo Bot/i;

const app = new Hono();

// 1. robots.txt — always accessible (before blocking middleware)
app.get('/robots.txt', (c) => c.text(ROBOTS)); // ROBOTS string constant from section 1

// 2. Bot-blocking middleware — applies to all routes
app.use('*', async (c, next) => {
  const ua = c.req.header('user-agent') ?? '';
  if (AI_BOT_UA.test(ua)) {
    return c.text('Forbidden', 403);
  }
  return next();
});

// Routes defined after middleware are protected
app.get('/', (c) => c.text('Hello, human!'));
app.get('/api/data', (c) => c.json({ ok: true }));

export default app;
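Before deploying, it is worth sanity-checking the pattern against realistic User-Agent strings. The sketch below uses a trimmed version of the regex; the sample UA strings are illustrative, not an authoritative list:

```typescript
// Trimmed version of the module-scope regex, for a quick sanity check.
const AI_BOT_UA = /GPTBot|ChatGPT-User|ClaudeBot|CCBot|PerplexityBot|Bytespider/i;

// Illustrative User-Agent strings: two crawlers, one browser, one empty header.
const samples: Array<[string, boolean]> = [
  ['Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2; +https://openai.com/gptbot)', true],
  ['Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)', true],
  ['Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0.0.0 Safari/537.36', false],
  ['', false], // a missing User-Agent header falls through to '' and is allowed
];

for (const [ua, expected] of samples) {
  console.log(`${AI_BOT_UA.test(ua) === expected ? 'ok' : 'MISMATCH'}  ${ua || '(empty)'}`);
}
```

The `i` flag matters: crawlers do not always match the canonical casing, and `claudebot` should block just like `ClaudeBot`.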

Registration order matters

Hono processes middleware in the order it is registered. app.use('*') registered before any app.get() intercepts every request. If you register the middleware after a route, that route is not protected.

3. X-Robots-Tag response header

The X-Robots-Tag: noai, noimageai header asks AI training crawlers to skip the page even if they bypassed robots.txt; like the noai meta tag, it is an advisory directive that compliant crawlers honour, not an enforced block. Add it as a separate middleware that calls await next() first, then sets the header on the outgoing response.

app.use('*', async (c, next) => {
  await next();
  // c.res is the Response — set headers after the handler runs
  c.res.headers.set('X-Robots-Tag', 'noai, noimageai');
});

Register this middleware after the bot-blocking middleware so blocked requests never reach it.
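The "await next(), then mutate the response" pattern can be modelled outside Hono with the standard Fetch Response type. In this sketch, Next and withRobotsTag are made-up names for illustration, not Hono's API:

```typescript
// Toy model of the "await next(), then set headers" middleware pattern.
// `Next` and `withRobotsTag` are illustrative names, not part of Hono.
type Next = () => Promise<Response>;

async function withRobotsTag(next: Next): Promise<Response> {
  const res = await next();                           // downstream handler runs first
  res.headers.set('X-Robots-Tag', 'noai, noimageai'); // then decorate the response
  return res;
}

// Usage: wrap a trivial handler and inspect the result.
const res = await withRobotsTag(async () => new Response('Hello, human!'));
console.log(res.headers.get('X-Robots-Tag')); // → noai, noimageai
```

Setting the header before await next() would not work: the handler builds a fresh Response, so anything set earlier on the context would not survive unless Hono copied it over.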

4. noai meta tag with Hono JSX

Hono has a built-in JSX renderer. Add <meta name="robots" content="noai, noimageai" /> in your HTML <head> to opt pages out of AI training. The /** @jsxImportSource hono/jsx */ comment sets the JSX factory without a tsconfig change.

/** @jsxImportSource hono/jsx */
import { Hono } from 'hono';

const app = new Hono();

app.get('/', (c) => {
  return c.html(
    <html lang="en">
      <head>
        <meta charset="UTF-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        {/* Opt out of AI training crawlers */}
        <meta name="robots" content="noai, noimageai" />
        <title>My Hono App</title>
      </head>
      <body>
        <h1>Hello, human!</h1>
      </body>
    </html>
  );
});

export default app;

The noai directive is recognised by CCBot, Common Crawl, and a growing list of AI training crawlers. It is separate from noindex — it does not affect search engine indexing.

5. Full example

Combining all layers: robots.txt route first, bot-blocking middleware second, X-Robots-Tag middleware third, then routes.

import { Hono } from 'hono';

// ─── constants ───────────────────────────────────────────────────────────────

const ROBOTS = `User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: *
Allow: /`;

// Compiled once at module scope — not per-request
const AI_BOT_UA = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended|DuckAssistBot|cohere-ai|Meta-ExternalAgent|Diffbot|YouBot|Amazonbot|AI2Bot/i;

// ─── app ─────────────────────────────────────────────────────────────────────

const app = new Hono();

// 1. robots.txt — before blocking so bots can read it
app.get('/robots.txt', (c) => c.text(ROBOTS));

// 2. Hard block known AI bots
app.use('*', async (c, next) => {
  const ua = c.req.header('user-agent') ?? '';
  if (AI_BOT_UA.test(ua)) {
    return c.text('Forbidden', 403);
  }
  return next();
});

// 3. X-Robots-Tag on every response
app.use('*', async (c, next) => {
  await next();
  c.res.headers.set('X-Robots-Tag', 'noai, noimageai');
});

// 4. Routes
app.get('/', (c) => c.text('Hello, human!'));

export default app;
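As a mental model, the layered stack reduces to a small decision function: robots.txt is always served, known AI UAs get a hard 403 that never passes through the header middleware, and everything else is served with the X-Robots-Tag header. The pure function below is an illustration of that routing logic, not Hono's actual dispatcher:

```typescript
// Pure model of the stack's decision logic (illustrative, not Hono's dispatcher).
const AI_BOT_UA = /GPTBot|ClaudeBot|CCBot|PerplexityBot|Bytespider/i; // trimmed list

interface Decision {
  status: number;
  robotsTag: boolean; // does the response carry X-Robots-Tag?
}

function decide(path: string, ua: string): Decision {
  // 1. The robots.txt route is registered first and ends the chain: no block, no header
  if (path === '/robots.txt') return { status: 200, robotsTag: false };
  // 2. The blocking middleware short-circuits before the header middleware runs
  if (AI_BOT_UA.test(ua)) return { status: 403, robotsTag: false };
  // 3. Everything else reaches a route and passes back through the header middleware
  return { status: 200, robotsTag: true };
}

console.log(decide('/robots.txt', 'GPTBot/1.2')); // bots can always read the rules
console.log(decide('/', 'GPTBot/1.2'));           // hard block, no header
console.log(decide('/', 'Mozilla/5.0'));          // humans get the page plus the header
```

Note that the 403 responses deliberately carry no X-Robots-Tag: the blocking middleware returns without calling next(), so the header middleware registered after it never runs for blocked requests.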

6. Runtime entrypoints

The middleware above is identical across all runtimes. Only the entrypoint changes.

Cloudflare Workers

// index.ts — workers entrypoint
export default app;

// wrangler.toml
// name = "my-worker"
// main = "src/index.ts"
// compatibility_date = "2024-01-01"

Node.js

import { serve } from '@hono/node-server';
import { serveStatic } from '@hono/node-server/serve-static';

// On Node you can serve robots.txt from the file system instead of the
// string constant — register this in place of the app.get('/robots.txt') route
app.use('/robots.txt', serveStatic({ path: './public/robots.txt' }));

serve({ fetch: app.fetch, port: 3000 });

// npm install @hono/node-server

Bun

// index.ts
export default {
  port: 3000,
  fetch: app.fetch,
};

// Run: bun run index.ts

Deno

// index.ts
Deno.serve({ port: 3000 }, app.fetch);

// Run: deno run --allow-net --allow-read index.ts

7. nginx — block at the proxy layer

For Node.js / Bun deployments behind nginx, block AI bots at the proxy before requests reach Hono. This is the most efficient approach — blocked bots never consume application resources.

# /etc/nginx/conf.d/hono-app.conf

map $http_user_agent $blocked_bot {
    default         0;
    "~*GPTBot"      1;
    "~*ChatGPT-User" 1;
    "~*OAI-SearchBot" 1;
    "~*ClaudeBot"   1;
    "~*anthropic-ai" 1;
    "~*Google-Extended" 1;
    "~*Bytespider"  1;
    "~*CCBot"       1;
    "~*PerplexityBot" 1;
    "~*Applebot-Extended" 1;
}

server {
    listen 80;
    server_name example.com;

    # Always allow robots.txt — bots can read crawl directives
    location = /robots.txt {
        proxy_pass http://127.0.0.1:3000;
    }

    location / {
        if ($blocked_bot) {
            return 403;
        }
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

FAQ

How do I serve robots.txt in Hono?

Use serveStatic from the appropriate runtime adapter. For Node.js: import { serveStatic } from "@hono/node-server/serve-static". Cloudflare Workers have no file system, so handle GET /robots.txt explicitly and return a string constant instead. Register the robots.txt route before bot-blocking middleware so crawlers can always read your directives.

Where do I register bot-blocking middleware in Hono?

Use app.use("*", ...) registered before your route definitions. Hono executes middleware in registration order — any app.use() before the first app.get()/app.post() intercepts all routes. Compile the User-Agent regex at module scope (not inside the handler) so it is built once and reused across requests.

Does Hono middleware work the same way on all runtimes?

Yes. Hono's middleware API is runtime-agnostic. The same bot-blocking middleware runs unchanged on Cloudflare Workers, Bun, Deno, Node.js, and AWS Lambda. Only the entrypoint differs: Workers use export default app, Node.js wraps the app with serve({ fetch: app.fetch }), Bun exports { fetch: app.fetch }, and Deno uses Deno.serve(app.fetch).

How do I add X-Robots-Tag headers in Hono?

Add a middleware that calls await next() first, then sets the header: app.use("*", async (c, next) => { await next(); c.res.headers.set("X-Robots-Tag", "noai, noimageai"); }). Register this middleware before your route handlers. X-Robots-Tag tells AI training crawlers to ignore the page even if they reached it.

Can I use Hono JSX to add noai meta tags?

Yes. Set /** @jsxImportSource hono/jsx */ at the top of your file, then return c.html(<html><head><meta name="robots" content="noai, noimageai" /></head>...</html>) from your route handler. The noai directive tells AI training crawlers to skip the page content even if they bypass robots.txt.
