
How to Block AI Bots on Bun: Complete 2026 Guide

Bun is a runtime, bundler, test runner, and package manager in one binary. It's Node.js-compatible and ships Bun.serve() as its native HTTP server. Unlike Node's fs.readFile, Bun.file() returns a BunFile (a Blob) you can stream directly into a Response, so serving robots.txt is a one-liner. This guide covers every approach, from raw Bun.serve() through Hono, Elysia, and bun build --compile single-binary deployment.

Bun 1.x

This guide targets Bun 1.1+ (stable). Bun.serve(), Bun.file(), and bun build --compile are all stable in Bun 1.x. Hono 4.x and Elysia 1.x are used. Node.js-compatible frameworks (Express, Fastify) also run without modification.

Methods at a glance

| Method | What it does | Blocks JS-less bots? |
| --- | --- | --- |
| Bun.serve() /robots.txt + Bun.file() | Signals crawlers to stay out | Signal only |
| String constant (--compile binary) | Embed robots.txt into compiled binary | Signal only |
| noai meta tag in HTML template | Opt out of AI training per page | ✓ (server-rendered) |
| X-Robots-Tag header | noai via HTTP header on all responses | ✓ (header) |
| Bun.serve() handler block | Hard 403 in raw Bun.serve() — no framework | ✓ |
| Hono app.use() middleware | Hard 403 before routes — Hono apps | ✓ |
| Elysia .onBeforeHandle() | Hard 403 in Elysia lifecycle | ✓ |
| nginx map block | Hard 403 at reverse proxy layer | ✓ |

1. robots.txt — Bun.file() one-liner

Bun.file(path) returns a BunFile — a lazy reference to a file that implements the Blob interface. Pass it directly to new Response(). Bun streams the file without loading it fully into memory, and sets the correct Content-Type from the extension automatically.

static/robots.txt

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /

server.ts — raw Bun.serve()

// Compile regex once at module load — not per request
const BLOCKED_UAS =
  /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended/i;

Bun.serve({
  port: 3000,
  fetch(req: Request): Response {
    const { pathname } = new URL(req.url);

    // Always serve robots.txt — exempt from blocking
    // Bun.file() streams directly; Content-Type set from extension
    if (pathname === '/robots.txt') {
      return new Response(Bun.file('./static/robots.txt'));
    }

    // Block AI crawlers on all other paths
    const ua = req.headers.get('user-agent') ?? '';
    if (BLOCKED_UAS.test(ua)) {
      return new Response('Forbidden', { status: 403 });
    }

    if (pathname === '/') {
      return new Response('Hello Bun', {
        headers: { 'Content-Type': 'text/plain' },
      });
    }

    return new Response('Not Found', { status: 404 });
  },
});

Run with: bun run server.ts. Bun executes TypeScript natively — no tsc step needed.
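Before deploying, it's worth sanity-checking the block regex against realistic User-Agent strings — the sample UAs below are illustrative, not an exhaustive or authoritative list:

```typescript
// Sanity-check the block regex against sample User-Agent strings.
// The sample strings are illustrative; real crawler UAs vary in version and URL.
const BLOCKED_UAS =
  /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended/i;

const samples: Array<[string, boolean]> = [
  ['Mozilla/5.0; compatible; GPTBot/1.2; +https://openai.com/gptbot', true],
  ['Mozilla/5.0 (compatible; ClaudeBot/1.0)', true],
  ['Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36', false],
  ['', false], // a missing UA header is normalised to '' by the handler
];

for (const [ua, expected] of samples) {
  if (BLOCKED_UAS.test(ua) !== expected) {
    throw new Error(`unexpected match result for UA: "${ua}"`);
  }
}
console.log('block list OK');
```

The `i` flag matters: some crawlers vary casing between releases, and a case-sensitive regex would silently let them through.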

2. bun build --compile — single binary deployment

bun build --compile bundles your source, dependencies, and static assets into a single self-contained executable — no Bun runtime required on the target machine. Bun.file() reads from disk at runtime, so a compiled binary shipped without the static/ directory cannot serve the file; embed robots.txt as a string constant instead.

server.ts — compiled-binary-safe

// Embedded constant — works in compiled binary with no static/ directory
const ROBOTS_TXT = `User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /`;

const BLOCKED_UAS =
  /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended/i;

Bun.serve({
  port: 3000,
  fetch(req: Request): Response {
    const { pathname } = new URL(req.url);

    if (pathname === '/robots.txt') {
      return new Response(ROBOTS_TXT, {
        headers: { 'Content-Type': 'text/plain; charset=utf-8' },
      });
    }

    const ua = req.headers.get('user-agent') ?? '';
    if (BLOCKED_UAS.test(ua)) {
      return new Response('Forbidden', { status: 403 });
    }

    return new Response('Hello', { status: 200 });
  },
});

Build command

# Compile to a single binary for the current platform
bun build --compile --minify server.ts --outfile server

# Cross-compile for Linux (e.g. build on Mac, deploy to Linux VPS)
bun build --compile --target bun-linux-x64 server.ts --outfile server-linux

# Run the binary — no Bun installation needed on the target machine
./server

Embed vs file: pick based on deployment

Use the embedded string constant for compiled binaries and containerised deployments where you want a single artefact. Use Bun.file() for development and VPS deployments where static files are co-located with the process.
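If you want one code path that covers both deployments, a small fallback helper works: serve from disk when the file exists, otherwise fall back to the embedded constant. The helper name, diskPath default, and fallback logic below are illustrative, not a Bun API:

```typescript
import { existsSync, readFileSync } from 'node:fs';

// Illustrative helper: prefer robots.txt on disk (dev / VPS deployments),
// fall back to the embedded constant when the file is absent (compiled binary).
function robotsResponse(
  embedded: string,
  diskPath = './static/robots.txt'
): Response {
  const body = existsSync(diskPath) ? readFileSync(diskPath, 'utf8') : embedded;
  return new Response(body, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}
```

On the disk branch you could return new Response(Bun.file(diskPath)) instead; readFileSync is used here so the sketch also runs unchanged on Node.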

3. Hono middleware — hard 403 before routes

Hono is an ultra-fast web framework that runs on Bun, Deno, Cloudflare Workers, and Node.js. Use app.use('*', ...) to register a global middleware that runs before all routes. Hono's c.req.header('user-agent') reads the UA header.

package.json

{
  "dependencies": {
    "hono": "^4"
  }
}

server.ts — Hono on Bun

import { Hono } from 'hono';

const BLOCKED_UAS =
  /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended/i;

const ROBOTS_TXT = `User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: *
Allow: /`;

const app = new Hono();

// 1. Serve robots.txt first — before bot blocking
app.get('/robots.txt', (c) => {
  return c.text(ROBOTS_TXT);
});

// 2. Bot-blocking middleware — runs before all other routes
app.use('*', async (c, next) => {
  const ua = c.req.header('user-agent') ?? '';
  if (BLOCKED_UAS.test(ua)) {
    return c.text('Forbidden', 403);
  }
  await next();
});

// 3. Application routes — only reached by non-blocked requests
app.get('/', (c) => c.text('Hello Hono on Bun'));

// Export for Bun.serve()
export default {
  port: 3000,
  fetch: app.fetch,
};

Route before middleware for /robots.txt

In Hono, middleware and routes run in registration order, so a route registered with app.get() before a wildcard app.use('*') middleware handles its path before the middleware ever runs. Register the robots.txt route first so crawlers can always fetch it, even when their UA is on the block list.
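If you would rather not depend on registration order, the exemption can live inside the middleware itself. A sketch of that check as a plain function (EXEMPT_PATHS and the abbreviated block list are illustrative):

```typescript
// Path-based exemption as an alternative to relying on registration order.
const EXEMPT_PATHS = new Set(['/robots.txt']);
const BLOCKED_UAS = /GPTBot|ClaudeBot|CCBot/i; // abbreviated for the sketch

function shouldBlock(pathname: string, ua: string): boolean {
  if (EXEMPT_PATHS.has(pathname)) return false; // always serve exempt paths
  return BLOCKED_UAS.test(ua);
}
```

Inside the Hono middleware this becomes: if (shouldBlock(c.req.path, c.req.header('user-agent') ?? '')) return c.text('Forbidden', 403);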

4. Elysia — lifecycle hooks

Elysia is a Bun-first web framework with a strong TypeScript type system and lifecycle hooks. Use .onBeforeHandle() to intercept requests before any route handler runs — return a Response to short-circuit.

server.ts — Elysia on Bun

import { Elysia } from 'elysia';

const BLOCKED_UAS =
  /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended/i;

const ROBOTS_TXT = `User-agent: GPTBot
Disallow: /
User-agent: *
Allow: /`;

const app = new Elysia()
  // Serve robots.txt — registered before the blocking hook
  .get('/robots.txt', () => new Response(ROBOTS_TXT, {
    headers: { 'Content-Type': 'text/plain' },
  }))
  // Global before-handle hook — runs before all route handlers
  .onBeforeHandle(({ request, path, set }) => {
    // Exempt robots.txt
    if (path === '/robots.txt') return;

    const ua = request.headers.get('user-agent') ?? '';
    if (BLOCKED_UAS.test(ua)) {
      set.status = 403;
      return 'Forbidden';
    }
  })
  .get('/', () => 'Hello Elysia on Bun')
  .listen(3000);

console.log(`Server running on http://localhost:${app.server?.port}`);

onBeforeHandle in Elysia runs in the order hooks are registered. Returning a value (not undefined) from the hook short-circuits further processing — the route handler never runs.
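The short-circuit behaviour can be modelled outside the framework with a toy dispatcher, which is useful for reasoning about hook order. This is a simplification for illustration, not Elysia's actual internals:

```typescript
// Toy model of a before-handle chain: the first hook that returns a value
// (anything other than undefined) short-circuits and becomes the response.
type Ctx = { path: string; ua: string };
type Hook = (ctx: Ctx) => string | undefined;

function dispatch(hooks: Hook[], handler: (ctx: Ctx) => string, ctx: Ctx): string {
  for (const hook of hooks) {
    const result = hook(ctx);
    if (result !== undefined) return result; // route handler never runs
  }
  return handler(ctx);
}

const blockBots: Hook = ({ path, ua }) => {
  if (path === '/robots.txt') return undefined; // exempt — fall through
  return /GPTBot|ClaudeBot/i.test(ua) ? 'Forbidden' : undefined;
};
```

A blocked UA never reaches the handler, while an exempt path falls through even for a blocked UA — the same two rules the Elysia hook above encodes.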

Install Elysia

bun add elysia

5. noai meta tag in HTML responses

Bun has no built-in templating engine. Compose HTML with template literals or plain strings. Pass a flag to your render function to opt specific pages out of AI training.

function html(
  content: string,
  opts: { noAiTraining?: boolean } = {}
): Response {
  const robotsMeta = opts.noAiTraining
    ? '<meta name="robots" content="noai, noimageai">'
    : '';

  return new Response(
    `<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  ${robotsMeta}
  <title>My App</title>
</head>
<body>
  ${content}
</body>
</html>`,
    { headers: { 'Content-Type': 'text/html; charset=utf-8' } }
  );
}

// Block AI training on specific pages
app.get('/articles/:id', (c) => {
  return html('<h1>Article</h1>', { noAiTraining: true });
});

// Allow AI training on public marketing pages
app.get('/', (c) => {
  return html('<h1>Home</h1>');
});

6. X-Robots-Tag response header

X-Robots-Tag applies to all content types — HTML, JSON, images, PDFs. Use it in Hono via c.header() in an after-handler middleware, or set it on the raw Response headers in Bun.serve().

Hono — X-Robots-Tag middleware

// Add after the bot-blocking middleware, before route definitions
app.use('*', async (c, next) => {
  await next();
  // Set on the way out — after the route handler has run
  c.header('X-Robots-Tag', 'noai, noimageai');
});

Bun.serve() — header on every response

function withRobotsHeader(res: Response): Response {
  const headers = new Headers(res.headers);
  headers.set('X-Robots-Tag', 'noai, noimageai');
  return new Response(res.body, { status: res.status, headers });
}

Bun.serve({
  port: 3000,
  fetch(req: Request): Response {
    const { pathname } = new URL(req.url);

    // Never add X-Robots-Tag to robots.txt itself
    if (pathname === '/robots.txt') {
      return new Response(Bun.file('./static/robots.txt'));
    }

    const ua = req.headers.get('user-agent') ?? '';
    if (BLOCKED_UAS.test(ua)) {
      return new Response('Forbidden', { status: 403 });
    }

    return withRobotsHeader(
      new Response('<h1>Hello</h1>', {
        headers: { 'Content-Type': 'text/html' },
      })
    );
  },
});

7. Express on Bun — Node.js compatibility

Bun implements most of the Node.js API. If your codebase uses Express, it typically runs on Bun without modification — just replace node server.js with bun server.js. The Express middleware pattern is identical.

// This is standard Express — runs on both Node.js and Bun unchanged
import express from 'express';

const BLOCKED_UAS =
  /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|Applebot-Extended/i;

const app = express();

// 1. Serve static files first (robots.txt in public/)
app.use(express.static('public'));

// 2. Block AI bots — after static, before routes
app.use((req, res, next) => {
  const ua = req.headers['user-agent'] ?? '';
  if (BLOCKED_UAS.test(ua)) {
    return res.status(403).type('text').send('Forbidden');
  }
  next();
});

app.get('/', (req, res) => res.send('Hello Express on Bun'));

app.listen(3000);

Run with: bun server.js. Bun often serves Express workloads faster than Node.js, with no code changes required.

8. nginx — reverse proxy hard block

nginx in front of Bun blocks AI bots before any request reaches your process. Combine with the application-level middleware for defence in depth.

map $http_user_agent $block_ai_bot {
    default             0;
    "~*GPTBot"          1;
    "~*ChatGPT-User"    1;
    "~*OAI-SearchBot"   1;
    "~*ClaudeBot"       1;
    "~*Claude-Web"      1;
    "~*anthropic-ai"    1;
    "~*Google-Extended" 1;
    "~*Bytespider"      1;
    "~*CCBot"           1;
    "~*PerplexityBot"   1;
    "~*Applebot-Extended" 1;
}

server {
    listen 80;
    server_name yourdomain.com;

    # Serve robots.txt directly from disk — exempt from bot check
    location = /robots.txt {
        alias /var/www/static/robots.txt;
        default_type text/plain;
    }

    location / {
        if ($block_ai_bot) {
            return 403;
        }
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

9. Docker — oven/bun image

The official Bun Docker image is oven/bun. For production, build a compiled binary (section 2) for a minimal image with no runtime dependency.

Option A: oven/bun runtime image

FROM oven/bun:1 AS base
WORKDIR /app

# Cache dependencies
COPY package.json bun.lockb ./
RUN bun install --frozen-lockfile

# Copy source
COPY . .

EXPOSE 3000
CMD ["bun", "run", "server.ts"]

Option B: compiled binary, scratch image

# Build stage: compile to a Linux binary
FROM oven/bun:1 AS builder
WORKDIR /app
COPY package.json bun.lockb ./
RUN bun install --frozen-lockfile
COPY . .
# Cross-compile for linux/amd64
RUN bun build --compile --target bun-linux-x64 server.ts --outfile server

# Run stage: copy binary only — no Bun, no Node needed
FROM gcr.io/distroless/cc-debian12
WORKDIR /app
COPY --from=builder /app/server ./server
EXPOSE 3000
ENTRYPOINT ["./server"]

The distroless image is ~20 MB. No shell, no package manager, no runtime. If the embedded robots.txt constant (section 2) is used, no static files need to be copied.

Framework comparison

| Framework | Block hook | robots.txt | Node compat |
| --- | --- | --- | --- |
| Bun.serve() (raw) | fetch() handler — pathname first, UA second | Bun.file() one-liner | N/A |
| Hono | app.use("*") before routes | app.get("/robots.txt") before use() | Node adapter available |
| Elysia | .onBeforeHandle() global hook | .get("/robots.txt") first | Bun-first, limited Node |
| Express on Bun | app.use() after static, before routes | express.static("public") | ✓ identical to Node |

FAQ

What is Bun.file() and why use it for robots.txt?

Bun.file(path) returns a BunFile — a lazy, streaming reference to a file on disk that implements the Web Blob interface. Passing it to new Response() tells Bun to stream the file directly to the client without loading it into JavaScript memory. Content-Type is inferred from the file extension. It is the idiomatic Bun approach for serving static files without a framework.

Does bun build --compile include static files?

Not automatically. bun build --compile bundles TypeScript source and imported modules — it does not bundle arbitrary files referenced via Bun.file() at runtime. Embed robots.txt as a string constant in your source code; the constant is compiled into the binary and needs no external file.

Can I use Hono on both Bun and Cloudflare Workers with the same bot-blocking code?

Yes. Hono is runtime-agnostic — the app.use() middleware and route handlers are identical across Bun, Deno, Cloudflare Workers, and Node.js. Only the export differs: export default { port: 3000, fetch: app.fetch } for Bun vs export default app for Workers. The bot-blocking middleware code is unchanged.

Is regex matching on User-Agent reliable for blocking AI bots?

For well-known bots (GPTBot, ClaudeBot, Bytespider, CCBot) that announce themselves via User-Agent, yes — these bots use consistent strings. UA matching does not block bots that spoof a browser UA. For those, combine UA blocking with rate limiting, Cloudflare WAF rules, or IP-based blocking at the network layer.
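To illustrate the "combine with rate limiting" point, here is a minimal in-memory sliding-window limiter. The WINDOW_MS and LIMIT values are arbitrary, and a production setup would use a shared store such as Redis rather than a per-process Map:

```typescript
// Sliding-window rate limiter sketch: at most LIMIT requests per WINDOW_MS per IP.
const WINDOW_MS = 60_000;
const LIMIT = 120;
const hits = new Map<string, number[]>();

function allowRequest(ip: string, now = Date.now()): boolean {
  // Keep only timestamps inside the current window, then record this request
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  recent.push(now);
  hits.set(ip, recent);
  return recent.length <= LIMIT;
}
```

In a Bun.serve() handler, the client address is available via server.requestIP(req); return a 429 when allowRequest() is false. This catches scrapers that spoof a browser UA but still hammer the server.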

Should I block at nginx or in Bun code?

Both if nginx is in your stack. nginx blocks matched bots before any request reaches Bun — saving compute. The Bun middleware blocks bots that connect directly (bypassing nginx) and is a useful defence-in-depth layer for container environments where nginx is not always present.
