How to Block AI Bots on Hapi.js: Complete 2026 Guide
Hapi.js (created at Walmart Labs) is configuration-driven rather than middleware-driven. There is no app.use() — instead, Hapi exposes a set of server lifecycle extensions that fire at specific points in the request pipeline. Bot blocking uses two: onPreAuth fires before authentication and any route handler (for the hard 403 block), and onPreResponse fires after the handler returns (for the X-Robots-Tag header).
Hapi request lifecycle (simplified)
Request
  → onPreAuth        ← Layer 4: bot check + hard 403 here
  → onCredentials    (auth token validation)
  → onPreHandler     (last point before route handler)
  → Route handler
  → onPostHandler
  → onPreResponse    ← Layer 3: X-Robots-Tag injection here
  → Response
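Before the detailed layers, here is the bare shape of both hooks (a minimal sketch; the actual check and header logic are filled in under Layers 4 and 3 below):
// Sketch: where the two extensions attach; bodies are filled in later
server.ext('onPreAuth', (request, h) => {
  // Layer 4: user-agent check goes here; bots get a 403 with .takeover()
  return h.continue;
});
server.ext('onPreResponse', (request, h) => {
  // Layer 3: X-Robots-Tag header injection goes here
  return h.continue;
});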
Four protection layers
Layer 1: robots.txt
Register @hapi/inert to enable the file route handler, then define a GET /robots.txt route. The bot-blocking extension exempts this path via EXEMPT_PATHS (below), so even hard-blocked crawlers can still read your robots.txt rules.
# public/robots.txt
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
User-agent: Applebot-Extended
User-agent: PerplexityBot
User-agent: Diffbot
User-agent: cohere-ai
User-agent: FacebookBot
User-agent: omgili
User-agent: omgilibot
User-agent: Amazonbot
User-agent: DeepSeekBot
User-agent: MistralBot
User-agent: xAI-Bot
User-agent: AI2Bot
Disallow: /
// server.js — register @hapi/inert and add robots.txt route
import Hapi from '@hapi/hapi';
import Inert from '@hapi/inert';
const server = Hapi.server({ port: 3000, host: '0.0.0.0' });
await server.register(Inert);
// robots.txt — registered before the bot-blocking extension
server.route({
  method: 'GET',
  path: '/robots.txt',
  handler: { file: 'public/robots.txt' },
});
Layer 2: noai meta tag
Register @hapi/vision to enable server-side templates, then pass a robots variable from your route handler via h.view().
Base Handlebars layout
{{! views/layouts/base.hbs }}
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>{{title}}</title>
  {{! AI bot training opt-out. Per-page override: pass robots from route handler. }}
  <meta name="robots" content="{{#if robots}}{{robots}}{{else}}noai, noimageai{{/if}}">
</head>
<body>
  {{{content}}}
</body>
</html>
Route handler — default (no override)
// No robots key → layout defaults to "noai, noimageai"
server.route({
  method: 'GET',
  path: '/',
  handler: (request, h) => h.view('home', { title: 'Home' }),
});
Route handler — per-page override
// Public pages that should be indexed normally:
server.route({
  method: 'GET',
  path: '/about',
  handler: (request, h) => h.view('about', {
    title: 'About',
    robots: 'index, follow',
  }),
});
Layers 3 & 4: server lifecycle extensions
Two server.ext() calls replace the middleware pattern from Express or Koa. Both are registered globally and fire for every request.
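For readers coming from Express, the onPreAuth hook below plays the role of a single app.use() middleware. A rough Express equivalent, shown only to map the mental model (it assumes the same AI_BOT_PATTERNS list defined under Layer 4):
// Express equivalent of the Layer 4 hook (comparison only, not part of the Hapi setup)
app.use((req, res, next) => {
  const ua = (req.headers['user-agent'] || '').toLowerCase();
  if (AI_BOT_PATTERNS.some((p) => ua.includes(p))) {
    return res.status(403).send('Forbidden');
  }
  next();
});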
Layer 4 — hard 403 block (onPreAuth)
// extensions/aiBotBlock.js
// Exported so the plugin-scoped variant (below) can reuse the same list
export const AI_BOT_PATTERNS = [
  'gptbot', 'chatgpt-user', 'oai-searchbot',
  'claudebot', 'anthropic-ai', 'claude-web',
  'google-extended', 'ccbot', 'bytespider',
  'applebot-extended', 'perplexitybot', 'diffbot',
  'cohere-ai', 'facebookbot', 'meta-externalagent',
  'omgili', 'omgilibot', 'amazonbot',
  'deepseekbot', 'mistralbot', 'xai-bot', 'ai2bot',
];

const EXEMPT_PATHS = new Set(['/robots.txt', '/sitemap.xml', '/favicon.ico']);

export function registerBotBlock(server) {
  // Layer 4: block before auth — earliest practical intercept
  server.ext('onPreAuth', (request, h) => {
    if (EXEMPT_PATHS.has(request.path)) {
      return h.continue; // pass through — let the route handle it
    }
    const ua = (request.headers['user-agent'] || '').toLowerCase();
    if (AI_BOT_PATTERNS.some(pattern => ua.includes(pattern))) {
      // .takeover() is REQUIRED — tells Hapi to use this response
      // and skip all remaining lifecycle points (handler, etc.)
      return h.response('Forbidden').code(403).takeover();
    }
    return h.continue; // pass to next lifecycle point
  });
}
Layer 3 — X-Robots-Tag header (onPreResponse)
export function registerXRobotsTag(server) {
  server.ext('onPreResponse', (request, h) => {
    const response = request.response;
    // Skip error responses (Boom objects don't have .header())
    if (response.isBoom) {
      return h.continue;
    }
    // Layer 3: inject X-Robots-Tag on all legitimate responses
    response.header('X-Robots-Tag', 'noai, noimageai');
    return h.continue;
  });
}
Key points
- .takeover() is mandatory when replacing the response in a lifecycle extension. Without it, Hapi continues the request cycle after your extension returns — the route handler still executes.
- h.continue (a Symbol) tells Hapi to proceed to the next lifecycle point. It is the correct return value when you are not modifying the response.
- response.isBoom — Hapi wraps all errors in Boom objects. Boom objects do not have a .header() method, so check isBoom before calling it in onPreResponse.
- onPreAuth runs before authentication — bots are blocked before any session or database lookup. This is the correct placement for simple UA matching.
- EXEMPT_PATHS uses h.continue (not .takeover()) — the request proceeds normally to the /robots.txt file handler.
Full server wiring
// server.js
import Hapi from '@hapi/hapi';
import Inert from '@hapi/inert';
import Vision from '@hapi/vision';
import Handlebars from 'handlebars';
import { registerBotBlock, registerXRobotsTag } from './extensions/aiBotBlock.js';
const server = Hapi.server({ port: 3000, host: '0.0.0.0' });
// 1. Plugins
await server.register([Inert, Vision]);
// 2. Template engine
server.views({
  engines: { hbs: Handlebars },
  path: 'views',
  layout: true,
  layoutPath: 'views/layouts',
  partialsPath: 'views/partials',
});

// 3. Static robots.txt route (before extensions)
server.route({
  method: 'GET',
  path: '/robots.txt',
  handler: { file: 'public/robots.txt' },
});

// 4. Bot-blocking extensions (Layer 4 + Layer 3)
registerBotBlock(server);
registerXRobotsTag(server);

// 5. Application routes
server.route([
  {
    method: 'GET',
    path: '/',
    handler: (request, h) => h.view('home', { title: 'Home' }),
  },
  {
    method: 'GET',
    path: '/about',
    handler: (request, h) => h.view('about', { title: 'About', robots: 'index, follow' }),
  },
]);

await server.start();
console.log('Server running on', server.info.uri);
package.json dependencies
{
  "type": "module",
  "dependencies": {
    "@hapi/hapi": "^21.3.0",
    "@hapi/inert": "^7.1.0",
    "@hapi/vision": "^7.0.3",
    "handlebars": "^4.7.8"
  }
}
Plugin-scoped blocking
To block bots only on a subset of routes, register the extension inside a plugin and pass { sandbox: 'plugin' } to server.ext(). By default an extension registered inside a plugin's register function still fires for every route on the server; the sandbox option restricts it to routes registered by that plugin.
// plugins/api.js — bot blocking scoped to /api/* routes only
import { AI_BOT_PATTERNS } from '../extensions/aiBotBlock.js';

export const apiPlugin = {
  name: 'api',
  register: async (server) => {
    // sandbox: 'plugin' limits this extension to routes registered by THIS
    // plugin; without it, the extension would fire server-wide
    server.ext('onPreAuth', (request, h) => {
      const ua = (request.headers['user-agent'] || '').toLowerCase();
      if (AI_BOT_PATTERNS.some(p => ua.includes(p))) {
        return h.response('Forbidden').code(403).takeover();
      }
      return h.continue;
    }, { sandbox: 'plugin' });

    server.route({
      method: 'GET',
      path: '/api/data',
      handler: (request, h) => ({ data: 'Protected API response' }),
    });
  },
};
// In server.js — public routes are not affected
await server.register(apiPlugin);
Verification
# Layer 1 — robots.txt via @hapi/inert
curl https://yoursite.com/robots.txt

# Layer 3 — X-Robots-Tag on a real page
curl -I https://yoursite.com/
# Expected: X-Robots-Tag: noai, noimageai

# Layer 4 — hard 403 on bot user-agent
curl -A "Mozilla/5.0 (compatible; GPTBot/1.0)" -I https://yoursite.com/
# Expected: HTTP/1.1 403 Forbidden

# robots.txt must be exempt from hard block
curl -A "GPTBot" -I https://yoursite.com/robots.txt
# Expected: HTTP/1.1 200 OK
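The same checks can run in-process in CI with Hapi's server.inject(), which dispatches a simulated request without opening a socket. A sketch, assuming server.js exports the configured server object before calling start():
// test/botblock.test.js — in-process verification via server.inject (sketch)
// Assumes: server.js does `export { server }` before start() (not shown above)
import { server } from '../server.js';

const bot = await server.inject({
  method: 'GET',
  url: '/',
  headers: { 'user-agent': 'Mozilla/5.0 (compatible; GPTBot/1.0)' },
});
console.assert(bot.statusCode === 403, 'bot UA should receive 403');

const robots = await server.inject({
  method: 'GET',
  url: '/robots.txt',
  headers: { 'user-agent': 'GPTBot' },
});
console.assert(robots.statusCode === 200, 'robots.txt must stay reachable');

const page = await server.inject({ method: 'GET', url: '/' });
console.assert(page.headers['x-robots-tag'] === 'noai, noimageai', 'X-Robots-Tag missing');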
FAQ
What is h.response().takeover() and why is it required?
In Hapi lifecycle extensions, setting a new response does not automatically stop the request lifecycle. You must call .takeover() to signal that the extension is taking over — Hapi will use this response and skip all remaining lifecycle points (authentication, route handler, etc.). Without .takeover(), Hapi sees your response but continues execution anyway, potentially overwriting it with the route handler's response. h.response('Forbidden').code(403).takeover() is the correct pattern for hard blocks in Hapi.
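The failure mode is easy to reproduce; a sketch of the broken version:
// Anti-pattern: missing .takeover(), so the route handler still runs
server.ext('onPreAuth', (request, h) => {
  const ua = (request.headers['user-agent'] || '').toLowerCase();
  if (ua.includes('gptbot')) {
    // BUG: without .takeover(), Hapi continues the lifecycle and the
    // route handler's response replaces this 403
    return h.response('Forbidden').code(403);
  }
  return h.continue;
});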
Which lifecycle point is best for bot blocking — onPreAuth or onPreHandler?
Use onPreAuth. It fires before authentication, so blocked bots consume minimal resources — no session lookup, no database query, no route handler. onPreHandler fires after authentication; use it only if you need the authenticated user context to make the blocking decision. For simple user-agent matching, onPreAuth is always the correct choice.
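If the decision does depend on who is authenticated, onPreHandler can read request.auth.credentials. A sketch using a hypothetical plan field on the credentials (not part of the setup above):
// onPreHandler sketch: block bots only for a hypothetical 'free' plan
server.ext('onPreHandler', (request, h) => {
  const creds = request.auth.credentials; // set by your auth strategy
  const ua = (request.headers['user-agent'] || '').toLowerCase();
  if (creds && creds.plan === 'free' && ua.includes('gptbot')) {
    return h.response('Forbidden').code(403).takeover();
  }
  return h.continue;
});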
How does @hapi/inert serve robots.txt?
Register @hapi/inert as a plugin, then define: server.route({ method: 'GET', path: '/robots.txt', handler: { file: 'public/robots.txt' } }). The file handler object is provided by @hapi/inert — it sets Content-Type: text/plain automatically. Alternatively, use the function form: handler: (req, h) => h.file('public/robots.txt'). Either way, the bot-blocking extension must exempt /robots.txt via EXEMPT_PATHS so that bot requests still reach the file handler instead of the 403.
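If you would rather not add @hapi/inert for a single file, a plain handler works too; a sketch that reads the file once at startup:
// Alternative without @hapi/inert: serve the robots.txt body from memory
import { readFile } from 'node:fs/promises';

const robotsTxt = await readFile('public/robots.txt', 'utf8');

server.route({
  method: 'GET',
  path: '/robots.txt',
  handler: (request, h) => h.response(robotsTxt).type('text/plain'),
});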
Why check response.isBoom in onPreResponse?
Hapi wraps all error responses (4xx, 5xx) in Boom error objects. Boom objects don't have a .header() method — calling it throws a TypeError at runtime. Check response.isBoom before calling response.header() and return h.continue for error responses unchanged. This ensures X-Robots-Tag is only set on successful responses (HTML pages, JSON data) and does not crash the server when 404 or 500 errors occur.
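If you do want the header on error pages as well, Boom responses expose a writable output.headers object instead of .header(); a sketch of that variant:
// Variant: tag Boom error responses too, via output.headers
server.ext('onPreResponse', (request, h) => {
  const response = request.response;
  if (response.isBoom) {
    // Boom responses carry headers on output.headers, not via .header()
    response.output.headers['X-Robots-Tag'] = 'noai, noimageai';
  } else {
    response.header('X-Robots-Tag', 'noai, noimageai');
  }
  return h.continue;
});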
Can I limit bot blocking to specific routes or plugins?
Yes. Register the lifecycle extension inside a Hapi plugin's register function and pass { sandbox: 'plugin' } as the extension options. Sandboxed extensions only fire for routes registered by that same plugin; routes registered at the server level or by other plugins are not affected. This is the Hapi-idiomatic approach to selective middleware, analogous to Express router-level middleware.
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.