How to Block AI Bots on Rocket (Rust): Complete 2026 Guide
Rocket is a Rust web framework known for compile-time route validation and type-safe request handling. Its bot-blocking architecture differs from most other frameworks: fairings (lifecycle callbacks) cannot abort requests, so Request Guards (the FromRequest trait) are the idiomatic way to block a request with a 403.
Fairings are NOT middleware
In most frameworks, "middleware" can short-circuit a request and return a response. Rocket fairings cannot: on_request returns () — there is no mechanism to abort. on_response can modify the response, but by then the route handler has already run. This is by design: Rocket separates side effects (fairings) from access control (guards).
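A minimal sketch makes this concrete (assuming Rocket 0.5; the RequestLogger name is illustrative). An on_request fairing can inspect or mutate the request, but its return type is (), so there is nothing it could return to stop the handler:
// Sketch: an on_request fairing can observe the request but never abort it.
use rocket::fairing::{Fairing, Info, Kind};
use rocket::{Data, Request};

pub struct RequestLogger;

#[rocket::async_trait]
impl Fairing for RequestLogger {
    fn info(&self) -> Info {
        Info { name: "Request Logger", kind: Kind::Request }
    }

    // Returns () — logging is all this can do. There is no way to
    // produce a Response here or prevent the route handler from running.
    async fn on_request(&self, req: &mut Request<'_>, _data: &mut Data<'_>) {
        let ua = req.headers().get_one("User-Agent").unwrap_or("-");
        println!("{} {} (UA: {})", req.method(), req.uri(), ua);
    }
}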
Protection layers
Four layers stack together: a Request Guard that returns 403 to AI bots (Step 2), an X-Robots-Tag header on every response (Step 4), robots.txt directives for compliant crawlers (Step 6), and a noai meta tag in served HTML (Step 7).
Step 1 — Shared bot list (src/bots.rs)
A &[&str] slice is zero-cost at runtime — the strings are embedded in the binary. The is_ai_bot() function lowercases the incoming User-Agent and checks it for substring matches against the list.
// src/bots.rs — shared AI bot list
pub const AI_BOTS: &[&str] = &[
// OpenAI
"gptbot", "chatgpt-user", "oai-searchbot",
// Anthropic
"claudebot", "claude-web",
// Common Crawl
"ccbot",
// Bytedance
"bytespider",
// Meta
"meta-externalagent",
// Perplexity
"perplexitybot",
// Google AI
"google-extended", "googleother",
// Cohere
"cohere-ai",
// Amazon
"amazonbot",
// Diffbot
"diffbot",
// AI2
"ai2bot",
// DeepSeek
"deepseekbot",
// Mistral
"mistralai-user",
// xAI
"xai-bot",
// You.com
"youbot",
// DuckDuckGo AI
"duckassistbot",
];
pub fn is_ai_bot(user_agent: &str) -> bool {
let ua = user_agent.to_lowercase();
AI_BOTS.iter().any(|bot| ua.contains(bot))
}

Step 2 — Request Guard (recommended — per-route blocking)
Implement FromRequest for a guard struct. Return Outcome::Error((Status::Forbidden, ())) for AI bots — the route handler never executes. Zero wasted computation. Add the guard as a function parameter to any route.
// src/guards.rs — Request Guard for AI bot blocking
use rocket::request::{self, FromRequest, Request};
use rocket::http::Status;
use crate::bots::is_ai_bot;
/// Request Guard that blocks AI bots with a 403.
/// Add this as a parameter to any route to protect it.
pub struct AiBotGuard;
#[rocket::async_trait]
impl<'r> FromRequest<'r> for AiBotGuard {
type Error = ();
async fn from_request(req: &'r Request<'_>) -> request::Outcome<Self, Self::Error> {
let ua = req.headers().get_one("User-Agent").unwrap_or("");
if is_ai_bot(ua) {
// Short-circuit — route handler never runs
request::Outcome::Error((Status::Forbidden, ()))
} else {
request::Outcome::Success(AiBotGuard)
}
}
}

Step 3 — Routes, mounting, and catcher
Add _guard: AiBotGuard to route parameters. The underscore prefix tells Rust the value is intentionally unused — the guard's effect is its existence (it ran successfully). Routes without the guard remain accessible to AI bots.
// src/main.rs — routes with AiBotGuard
// Cargo.toml: rocket = { version = "0.5", features = ["json"] }
#[macro_use]
extern crate rocket;

use rocket::fs::FileServer;

mod bots;
mod guards;
mod fairings;

use guards::AiBotGuard;
// Protected route — guard runs before handler.
// If UA is an AI bot, handler never executes → 403.
#[get("/")]
fn index(_guard: AiBotGuard) -> &'static str {
"Welcome to my site"
}
// Protected API route — same guard, same behavior.
#[get("/api/data")]
fn api_data(_guard: AiBotGuard) -> rocket::serde::json::Json<&'static str> {
rocket::serde::json::Json("sensitive data")
}
// Unprotected route — no guard, AI bots can access.
#[get("/health")]
fn health() -> &'static str {
"ok"
}
#[launch]
fn rocket() -> _ {
rocket::build()
// Attach the X-Robots-Tag fairing (global, all responses)
.attach(fairings::XRobotsTagFairing)
// Mount routes
.mount("/", routes![index, api_data, health])
// Serve static files (including robots.txt)
.mount("/", FileServer::from("./static"))
// Register 403 catcher for a clean error page
.register("/", catchers![forbidden])
}
#[catch(403)]
fn forbidden() -> &'static str {
"Forbidden"
}

Step 4 — X-Robots-Tag via Fairing (global header)
This is what fairings are designed for: adding headers to all responses. on_response fires after every route handler — set X-Robots-Tag on every response, including 403 error pages.
// src/fairings.rs — X-Robots-Tag on all responses
use rocket::fairing::{Fairing, Info, Kind};
use rocket::{Request, Response};
/// Fairing that adds X-Robots-Tag to every response.
/// Fairings CAN modify responses — they just cannot abort requests.
pub struct XRobotsTagFairing;
#[rocket::async_trait]
impl Fairing for XRobotsTagFairing {
fn info(&self) -> Info {
Info {
name: "X-Robots-Tag Header",
kind: Kind::Response,
}
}
// on_response: fires AFTER the route handler returns.
// Can modify headers, status, body — anything on the Response.
async fn on_response<'r>(
&self,
_req: &'r Request<'_>,
res: &mut Response<'r>,
) {
res.set_raw_header("X-Robots-Tag", "noai, noimageai");
}
}

Step 5 — Global blocking via Fairing override (alternative)
Trade-off: The route handler still runs — database queries, computation, and side-effects all execute before the response is overwritten. Use this only when you cannot add Request Guards to every route. For zero-overhead global blocking, put Rocket behind nginx or Caddy.
// src/fairings.rs — global blocking via on_response override
// (if this shares src/fairings.rs with XRobotsTagFairing above,
// keep a single copy of the duplicated `use` lines)
use rocket::fairing::{Fairing, Info, Kind};
use rocket::http::Status;
use rocket::{Request, Response};
use std::io::Cursor;

use crate::bots::is_ai_bot;
/// Global bot blocker via Fairing.
///
/// ⚠️ TRADE-OFF: The route handler STILL RUNS — the response is
/// overwritten afterward. This wastes computation (DB queries, etc.)
/// but provides truly global coverage without adding guards to every route.
///
/// For zero-overhead global blocking, use nginx/Caddy in front of Rocket.
pub struct GlobalBotBlockerFairing;
#[rocket::async_trait]
impl Fairing for GlobalBotBlockerFairing {
fn info(&self) -> Info {
Info {
name: "Global AI Bot Blocker",
kind: Kind::Response,
}
}
async fn on_response<'r>(
&self,
req: &'r Request<'_>,
res: &mut Response<'r>,
) {
let ua = req.headers().get_one("User-Agent").unwrap_or("");
if is_ai_bot(ua) {
// Override the entire response to 403
res.set_status(Status::Forbidden);
res.set_sized_body(
Some("Forbidden".len()),
Cursor::new("Forbidden"),
);
            // Replace the handler's Content-Type with plain text.
            // (set_raw_header replaces any existing value, so a separate
            // remove_header call is unnecessary.)
            res.set_raw_header("Content-Type", "text/plain; charset=utf-8");
}
// X-Robots-Tag on ALL responses (blocked or not)
res.set_raw_header("X-Robots-Tag", "noai, noimageai");
}
}

Step 6 — robots.txt
Three options: static file via FileServer, a #[get] route handler, or compile-time embedding with include_str!(). The include_str! approach bakes the file into the binary at compile time — zero filesystem reads at runtime.
// Option A: Static file — place in ./static/robots.txt
// FileServer::from("./static") serves it automatically at /robots.txt.
// No code needed — just the file.
// Option B: Route handler — dynamic or compile-time embedded
#[get("/robots.txt")]
fn robots() -> (rocket::http::ContentType, &'static str) {
(rocket::http::ContentType::Plain, ROBOTS_CONTENT)
}
// Option C: compile-time embedding via include_str!()
// The file is baked into the binary — no filesystem read at runtime.
// Mount either Option B or Option C, not both: two routes at
// GET /robots.txt with the same rank collide and Rocket aborts at launch.
#[get("/robots.txt")]
fn robots_embedded() -> (rocket::http::ContentType, &'static str) {
(
rocket::http::ContentType::Plain,
include_str!("../static/robots.txt"),
)
}
const ROBOTS_CONTENT: &str = "User-agent: *
Allow: /
# AI training bots — blocked
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Meta-ExternalAgent
Disallow: /
User-agent: YouBot
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: Diffbot
Disallow: /";Step 7 — noai meta tag via custom Responder
Rocket's Responder trait lets you create types that control every aspect of the HTTP response. Combine the noai meta tag in HTML with the X-Robots-Tag header in a single responder type.
// In a Tera/Handlebars/Askama template:
// <meta name="robots" content="noai, noimageai">
// With Rocket's responder system, inject via a custom responder:
use rocket::Request;
use rocket::response::{self, Responder, Response};
use rocket::http::ContentType;
use std::io::Cursor;
pub struct HtmlWithNoAi(pub String);
impl<'r> Responder<'r, 'static> for HtmlWithNoAi {
fn respond_to(self, _req: &'r Request<'_>) -> response::Result<'static> {
Response::build()
.header(ContentType::HTML)
.raw_header("X-Robots-Tag", "noai, noimageai")
.sized_body(Some(self.0.len()), Cursor::new(self.0))
.ok()
}
}
// Usage in a route (this replaces the plain-text index from Step 3;
// mounting both would collide on GET /):
#[get("/")]
fn index(_guard: AiBotGuard) -> HtmlWithNoAi {
    HtmlWithNoAi(
        "<html><head><meta name=\"robots\" content=\"noai, noimageai\"></head>\
         <body><h1>Protected</h1></body></html>"
            .to_string(),
    )
}

Rocket vs Actix-web vs Axum vs Warp
| Feature | Rocket | Actix-web | Axum | Warp |
|---|---|---|---|---|
| Middleware model | Request Guards (per-route) + Fairings (lifecycle) | wrap() / wrap_fn() — wraps handler chain | Tower layers — Service<Request> → Service<Request> | Filter combinators — composable extractors |
| Can abort in middleware? | Guards: yes (Outcome::Error). Fairings: no. | Yes — return HttpResponse early | Yes — return Response from layer | Yes — return Rejection |
| Global blocking | Fairing on_response override (handler still runs) | app.wrap(middleware) — runs on all routes | Router::layer(middleware) — runs on all routes | .and(filter) at route composition level |
| Per-route blocking | fn index(_g: Guard) — function parameter | .wrap() on resource/scope | .route_layer() on specific routes | .and(filter) per-route |
| UA header access | req.headers().get_one("User-Agent") | req.headers().get("user-agent") | req.headers().get("user-agent") | warp::header::optional("user-agent") |
| Hard 403 | Outcome::Error((Status::Forbidden, ())) | HttpResponse::Forbidden().finish() | (StatusCode::FORBIDDEN, "Forbidden").into_response() | warp::reject::custom(Forbidden) |
| robots.txt | FileServer::from("./static") or #[get] route | actix_files::Files or route | tower_http::services::ServeDir or route | warp::fs::dir("./static") or route |
| Compile-time checks | Yes — route types checked at compile time | Partial — extractors checked at runtime | Partial — extractors checked at runtime | Yes — filter types checked at compile time |
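Testing the guard
Rocket's built-in local client makes it easy to verify the guard end-to-end. A minimal sketch, assuming Rocket 0.5 and the rocket() function from Step 3 (the test names and the sample GPTBot UA string are illustrative):
// src/main.rs — verify the guard with Rocket's local blocking client
#[cfg(test)]
mod tests {
    use rocket::http::{Header, Status};
    use rocket::local::blocking::Client;

    #[test]
    fn ai_bot_gets_403() {
        let client = Client::tracked(super::rocket()).expect("valid rocket");
        let res = client
            .get("/")
            .header(Header::new("User-Agent", "Mozilla/5.0 (compatible; GPTBot/1.0)"))
            .dispatch();
        assert_eq!(res.status(), Status::Forbidden);
    }

    #[test]
    fn regular_browser_passes() {
        let client = Client::tracked(super::rocket()).expect("valid rocket");
        let res = client
            .get("/")
            .header(Header::new("User-Agent", "Mozilla/5.0 (X11; Linux x86_64)"))
            .dispatch();
        assert_eq!(res.status(), Status::Ok);
    }
}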
Quick reference
- req.headers().get_one("User-Agent")
- Outcome::Error((Status::Forbidden, ()))
- Outcome::Success(AiBotGuard)
- res.set_raw_header("X-Robots-Tag", "noai, noimageai")
- impl Fairing — on_request (no abort), on_response (can modify)
- impl FromRequest — Outcome::Error to block
- FileServer::from("./static") or a #[get("/robots.txt")] route
- include_str!("../static/robots.txt")
- #[catch(403)] fn forbidden() -> &str

FAQ
Can Rocket fairings block or abort incoming requests?
No. on_request returns () — there is no mechanism to return a response or abort the request. This is Rocket's deliberate design: fairings are side-effects (logging, metrics, header injection), not access control. For blocking AI bots, use a Request Guard that returns Outcome::Error((Status::Forbidden, ())).
How do I block AI bots globally without adding a guard to every route?
Three options: (1) Fairing on_response override — check the request UA and rewrite the response to 403. The route handler still runs (wasted CPU/DB) but the response is replaced. (2) Reverse proxy — put nginx or Caddy in front of Rocket and block at the proxy level. Zero wasted computation. This is the recommended production approach. (3) Guard on every route — explicit and Rocket-idiomatic, but verbose for large applications.
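For option (1), attaching the fairing from Step 5 is a one-line change. A sketch reusing the route and catcher names from Step 3:
// src/main.rs — swap in the global blocker from Step 5
#[launch]
fn rocket() -> _ {
    rocket::build()
        // Handlers still run for bot requests; the fairing rewrites
        // their responses to 403 afterward.
        .attach(fairings::GlobalBotBlockerFairing)
        .mount("/", routes![index, api_data, health])
        .register("/", catchers![forbidden])
}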
What is the difference between Request Guards and Fairings?
Request Guards (FromRequest): per-route, type-checked at compile time, CAN abort with Outcome::Error. Only run on routes that declare them as parameters. Fairings (Fairing trait): global lifecycle callbacks, CANNOT abort requests. on_request modifies the request, on_response modifies the response. Run on all requests regardless of route.
How do I serve robots.txt in Rocket?
(1) FileServer::from("./static") — place robots.txt in ./static/. Simplest approach. (2) Dedicated #[get("/robots.txt")] route — for dynamic content. (3) include_str!("../static/robots.txt") — embeds the file at compile time, zero filesystem reads at runtime. Use FileServer for simple cases, the route for dynamic content, include_str! for maximum performance.
How is this different from the general Rust guide on Open Shadow?
The general Rust guide covers Actix-web wrap_fn and Axum from_fn Tower middleware — both use the standard middleware pattern where you CAN short-circuit. Rocket's architecture is fundamentally different: fairings cannot abort, guards are the blocking mechanism. Different traits, different patterns, different trade-offs. If you're using Rocket, this guide applies. If you're using Actix-web or Axum, see the general Rust guide.
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.