
How to Block AI Bots on Elysia: Complete 2026 Guide

Elysia is a Bun-native TypeScript web framework with a plugin-first architecture, lifecycle hooks, and consistently top results in TypeScript server benchmarks. Bot blocking uses Elysia's onBeforeHandle and onAfterHandle lifecycle hooks, plus the plugin pattern for reusable, deduplicated middleware.

Four protection layers

1. robots.txt: static Bun.file() route, or an inline string constant for compiled binaries
2. noai meta tag: @elysiajs/html JSX plugin with <meta name="robots" content="noai, noimageai" />
3. X-Robots-Tag header: onAfterHandle lifecycle hook setting set.headers['X-Robots-Tag'] = 'noai, noimageai'
4. Hard 403 block: onBeforeHandle global hook with an AI bot regex plus an EXEMPT_PATHS list

Layer 1: robots.txt

Add a .get('/robots.txt') route before your bot-blocking hook. In Elysia, a global hook only applies to routes registered after it, so a route defined first stays exempt from the check. Crawlers must always be able to read your disallow rules.

Option A: Bun.file() (file on disk)

Bun.file() returns a BunFile that implements the Blob interface — pass it directly to the Response constructor. No reading, no buffering.

// src/index.ts
import Elysia from 'elysia'

const app = new Elysia()
  .get('/robots.txt', () =>
    new Response(Bun.file('./static/robots.txt'), {
      headers: { 'Content-Type': 'text/plain; charset=utf-8' },
    })
  )
  // ... rest of your routes
  .listen(3000)

Option B: Inline constant (bun build --compile)

When you compile to a single binary with bun build --compile, no files exist on disk at runtime. Embed the content as a string constant — it's baked into the binary.

const ROBOTS_TXT = `User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
User-agent: Applebot-Extended
User-agent: PerplexityBot
User-agent: Diffbot
User-agent: cohere-ai
User-agent: FacebookBot
User-agent: omgili
User-agent: omgilibot
Disallow: /`

const app = new Elysia()
  .get('/robots.txt', () =>
    new Response(ROBOTS_TXT, {
      headers: { 'Content-Type': 'text/plain; charset=utf-8' },
    })
  )

This also works for regular deployments — slightly faster than a disk read since there's no I/O.
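If the blocked-bot list grows, the constant can be generated from a plain array so the names live in one place. A minimal sketch; buildRobotsTxt and BLOCKED_BOTS are hypothetical names, not part of Elysia or Bun:

```typescript
// Hypothetical helper: allow everyone by default, then list the AI
// crawlers that share a single "Disallow: /" group.
const BLOCKED_BOTS = ['GPTBot', 'ClaudeBot', 'anthropic-ai', 'Google-Extended', 'CCBot']

function buildRobotsTxt(bots: string[]): string {
  const group = bots.map((bot) => `User-agent: ${bot}`).join('\n')
  return `User-agent: *\nAllow: /\n\n${group}\nDisallow: /`
}

const ROBOTS_TXT = buildRobotsTxt(BLOCKED_BOTS)
```

Serving it is identical to the hand-written constant, and the array doubles as a single source of truth for the regex in Layer 4.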

Layer 2: noai meta tag

Elysia doesn't render HTML by default — use the @elysiajs/html plugin for JSX-based HTML responses, or return a plain string with the meta tag embedded.

Option A: @elysiajs/html JSX plugin

// Install: bun add @elysiajs/html
import Elysia from 'elysia'
import { html } from '@elysiajs/html'

const app = new Elysia()
  .use(html())
  .get('/', () => (
    <html lang="en">
      <head>
        <meta charset="utf-8" />
        <title>My App</title>
        {/* Tells compliant AI crawlers not to use this page for training */}
        <meta name="robots" content="noai, noimageai" />
      </head>
      <body>
        <h1>Hello World</h1>
      </body>
    </html>
  ))
  .listen(3000)

Option B: Plain string response

No plugin needed — return a string with the correct Content-Type header.

app.get('/', ({ set }) => {
  set.headers['Content-Type'] = 'text/html; charset=utf-8'
  return `<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>My App</title>
  <meta name="robots" content="noai, noimageai">
</head>
<body><h1>Hello World</h1></body>
</html>`
})

Layer 3: X-Robots-Tag header

onAfterHandle runs after your handler returns a value but before Elysia sends the response — the right place to add headers globally.

Global (all HTML responses)

const app = new Elysia()
  .onAfterHandle(({ set }) => {
    set.headers['X-Robots-Tag'] = 'noai, noimageai'
  })
  .get('/', () => 'Hello World')
  .listen(3000)

Selective (HTML responses only, skip JSON API routes)

const app = new Elysia()
  .onAfterHandle(({ set, response }) => {
    // Only tag HTML responses, not JSON API routes. Plain-string responses
    // and an explicit text/html Content-Type both count as HTML here.
    const contentType = String(set.headers['Content-Type'] ?? '')
    if (typeof response === 'string' || contentType.includes('text/html')) {
      set.headers['X-Robots-Tag'] = 'noai, noimageai'
    }
  })

Layer 4: Hard 403 block

onBeforeHandle intercepts every request before your route handler runs. Return a value to short-circuit: the handler never executes and your return value becomes the response. Return nothing (undefined) to pass through normally.

Inline approach

import Elysia from 'elysia'

const AI_BOTS = /gptbot|claudebot|anthropic-ai|claude-web|google-extended|ccbot|bytespider|applebot-extended|perplexitybot|diffbot|cohere-ai|facebookbot|omgili|omgilibot|iaskspider|youbot/i

// Paths that must remain accessible to all crawlers
const EXEMPT_PATHS = new Set(['/robots.txt', '/sitemap.xml', '/favicon.ico'])

const app = new Elysia()
  .onBeforeHandle(({ request, set }) => {
    const { pathname } = new URL(request.url)

    // Always allow crawlers to read robots.txt
    if (EXEMPT_PATHS.has(pathname)) return

    const ua = (request.headers.get('user-agent') ?? '').toLowerCase()
    if (AI_BOTS.test(ua)) {
      set.status = 403
      return 'Forbidden'
    }
  })
  .get('/robots.txt', () =>
    new Response(ROBOTS_TXT, {
      headers: { 'Content-Type': 'text/plain; charset=utf-8' },
    })
  )
  .get('/', () => 'Hello World')
  .listen(3000)

console.log(`Elysia running at ${app.server?.hostname}:${app.server?.port}`)

Note: the regex patterns are lowercase because the hook lowercases the user agent before matching (GPTBot becomes gptbot). The /i flag is a safety net in case a mixed-case string slips through.
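It is worth sanity-checking the pattern against sample user-agent strings in a plain script before wiring it into the hook. The strings below are abbreviated stand-ins, not exact crawler UAs:

```typescript
const AI_BOTS = /gptbot|claudebot|anthropic-ai|ccbot|bytespider|perplexitybot/i

// [user-agent, expected to be blocked?]; strings abbreviated for clarity
const samples: Array<[string, boolean]> = [
  ['Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)', true],
  ['Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)', true],
  ['Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', false],
  ['', false], // a missing User-Agent header falls back to the empty string
]

for (const [ua, blocked] of samples) {
  console.assert(AI_BOTS.test(ua.toLowerCase()) === blocked, `unexpected result for: ${ua}`)
}
```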

Elysia plugin pattern (recommended for larger apps)

The name property enables Elysia's singleton deduplication — if multiple parts of your app call .use(aiBotBlock), the plugin only registers once. Without a name, the hook would fire twice.

// src/plugins/aiBotBlock.ts
import Elysia from 'elysia'

const AI_BOTS = /gptbot|claudebot|anthropic-ai|claude-web|google-extended|ccbot|bytespider|applebot-extended|perplexitybot|diffbot|cohere-ai|facebookbot|omgili|omgilibot|iaskspider|youbot/i

const EXEMPT_PATHS = new Set(['/robots.txt', '/sitemap.xml', '/favicon.ico'])

export const aiBotBlock = new Elysia({ name: 'ai-bot-block' })
  .onBeforeHandle(({ request, set }) => {
    const { pathname } = new URL(request.url)
    if (EXEMPT_PATHS.has(pathname)) return

    const ua = (request.headers.get('user-agent') ?? '').toLowerCase()
    if (AI_BOTS.test(ua)) {
      set.status = 403
      return 'Forbidden'
    }
  })

// src/index.ts
import Elysia from 'elysia'
import { aiBotBlock } from './plugins/aiBotBlock'

const app = new Elysia()
  .use(aiBotBlock)            // ← singleton: registers once regardless of how many .use() calls
  .get('/robots.txt', () => new Response(ROBOTS_TXT, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' }
  }))
  .get('/', () => 'Hello World')
  .listen(3000)

Route-scoped blocking with .guard()

Use .guard() when you only want to block bots on specific route groups — for example, protecting API endpoints or premium content while leaving your public pages accessible.

import Elysia from 'elysia'

const AI_BOTS = /gptbot|claudebot|anthropic-ai|ccbot|bytespider|perplexitybot/i

function aiBotCheck({ request, set }: { request: Request; set: { status: number } }) {
  const ua = (request.headers.get('user-agent') ?? '').toLowerCase()
  if (AI_BOTS.test(ua)) {
    set.status = 403
    return 'Forbidden'
  }
}

const app = new Elysia()
  // Public routes — no bot check
  .get('/', () => 'Welcome')
  .get('/robots.txt', () => new Response(ROBOTS_TXT, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' }
  }))

  // Protected routes — bot check applied
  .guard(
    { beforeHandle: [aiBotCheck] },
    (app) =>
      app
        .get('/api/data', () => ({ secret: true }))
        .post('/api/submit', () => ({ ok: true }))
  )
  .listen(3000)

All four layers combined

import Elysia from 'elysia'
import { html } from '@elysiajs/html'

const AI_BOTS = /gptbot|claudebot|anthropic-ai|claude-web|google-extended|ccbot|bytespider|applebot-extended|perplexitybot|diffbot|cohere-ai|facebookbot|omgili|omgilibot/i
const EXEMPT_PATHS = new Set(['/robots.txt', '/sitemap.xml', '/favicon.ico'])

const ROBOTS_TXT = `User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
Disallow: /`

const app = new Elysia()
  .use(html())

  // Layer 4: Hard 403 — runs before every handler
  .onBeforeHandle(({ request, set }) => {
    const { pathname } = new URL(request.url)
    if (EXEMPT_PATHS.has(pathname)) return

    const ua = (request.headers.get('user-agent') ?? '').toLowerCase()
    if (AI_BOTS.test(ua)) {
      set.status = 403
      return 'Forbidden'
    }
  })

  // Layer 3: X-Robots-Tag — runs after every handler
  .onAfterHandle(({ set }) => {
    set.headers['X-Robots-Tag'] = 'noai, noimageai'
  })

  // Layer 1: robots.txt
  .get('/robots.txt', () =>
    new Response(ROBOTS_TXT, {
      headers: { 'Content-Type': 'text/plain; charset=utf-8' },
    })
  )

  // Layer 2: noai meta tag
  .get('/', () => (
    <html lang="en">
      <head>
        <meta charset="utf-8" />
        <title>My App</title>
        <meta name="robots" content="noai, noimageai" />
      </head>
      <body>
        <h1>Hello World</h1>
      </body>
    </html>
  ))

  .listen(3000)

console.log(`Running at http://${app.server?.hostname}:${app.server?.port}`)

Deployment

Elysia requires the Bun runtime. Use oven/bun as your Docker base image.

# Dockerfile
FROM oven/bun:1 AS base
WORKDIR /app

FROM base AS deps
# bun.lock* matches both the text lockfile (Bun >= 1.2) and the older binary bun.lockb
COPY package.json bun.lock* ./
RUN bun install --frozen-lockfile

FROM base AS runner
COPY --from=deps /app/node_modules ./node_modules
COPY . .

EXPOSE 3000
CMD ["bun", "run", "src/index.ts"]

For a single self-contained binary (no Bun required at runtime):

# Build a self-contained binary
bun build ./src/index.ts --compile --outfile myapp

# Run anywhere — no Bun, no Node required
./myapp

Platforms: Fly.io (Dockerfile deploy), Railway (auto-detects Bun via its lockfile), Render (set runtime to Bun), Coolify, any VPS with Docker.

FAQ

Should I use onBeforeHandle or onRequest for bot blocking?

Use onBeforeHandle. It runs after routing but before your handler executes, which is the right interception point for bot blocking. onRequest fires even earlier, before routing, with only a minimal context and no knowledge of which route will handle the request. onBeforeHandle gives you set.status for the response code and request.headers for reading the user agent.

What does the name property on an Elysia plugin do?

The name property enables plugin deduplication (singleton pattern). If two parts of your app both call .use(aiBotBlock) and the plugin has a name, Elysia only registers it once — the second call is ignored. Without a name, the hook would fire twice per request. Always give shared plugins a name.
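Conceptually, the framework keeps a registry of seen plugin names. A simplified model of that behaviour follows; this is an illustrative sketch, not Elysia's actual implementation:

```typescript
// Simplified model of named-plugin dedup: a registry keyed by plugin name.
type Hook = () => void

const registered = new Map<string, Hook>()

function usePlugin(name: string | undefined, hook: Hook): void {
  // Named plugins are singletons: a repeated name is silently skipped.
  if (name !== undefined && registered.has(name)) return
  registered.set(name ?? `anonymous-${registered.size}`, hook)
}

usePlugin('ai-bot-block', () => {})
usePlugin('ai-bot-block', () => {}) // ignored, same name
usePlugin(undefined, () => {})      // unnamed: always registers

console.log(registered.size) // 2 hooks: one named, one anonymous
```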

Does onBeforeHandle run for the /robots.txt route?

Only for routes registered after the hook: Elysia lifecycle hooks apply to routes defined below them, so an onBeforeHandle placed at the top of your app does run for a .get('/robots.txt') route added later, as in the examples above. In that layout you must check EXEMPT_PATHS before the bot check; if you don't exempt /robots.txt, all crawlers (including legitimate search engines reading your disallow rules) receive a 403. Parse the pathname from new URL(request.url).pathname and compare before running the UA check.
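The pathname comparison itself is easy to verify standalone; new URL strips the query string, so exemptions still match when a crawler appends parameters:

```typescript
const EXEMPT_PATHS = new Set(['/robots.txt', '/sitemap.xml', '/favicon.ico'])

// request.url in Elysia is a full URL; new URL() yields a clean pathname
const { pathname } = new URL('http://localhost:3000/robots.txt?utm_source=crawler')

console.log(pathname)                   // '/robots.txt', query string excluded
console.log(EXEMPT_PATHS.has(pathname)) // true
```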

How do I serve robots.txt in a bun build --compile binary?

Define the robots.txt content as a string constant in your source file. When compiled to a single binary, no files exist on disk at runtime — Bun.file() would fail because the path doesn't exist. Embedding the content as a string constant bakes it into the binary.

Can I use .guard() to block bots only on API routes?

Yes. .guard({ beforeHandle: [aiBotCheck] }, (app) => app.get('/api/data', handler)) applies the bot check only to routes defined inside the guard callback. Public routes like your homepage and /robots.txt are unaffected. This is the cleanest pattern when you want to protect API or premium routes without blocking your entire site.

Is your site protected from AI bots?

Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.