How to Block AI Bots on Vert.x (Java): Complete 2026 Guide

Vert.x is Eclipse's reactive toolkit for building high-performance, non-blocking web services on the JVM. Vert.x Web provides a Router with ordered Handler<RoutingContext> middleware — fully event-driven, no thread-per-request. Bot blocking is a handler registered with negative order so it fires before any route logic.

Vert.x vs Spring Boot / Servlet

Spring Boot uses FilterRegistrationBean with blocking doFilter() chains. Vert.x uses Handler<RoutingContext> with non-blocking ctx.next(). No annotations, no bean lifecycle — just ordered handlers on a Router. Short-circuit with ctx.response().setStatusCode(403).end().

Protection layers

1. robots.txt: StaticHandler on the /robots.txt route; fires before the bot blocker
2. noai meta tag: ctx.put("robots", "noai") in the handler; template engines read it from the RoutingContext data
3. X-Robots-Tag header: ctx.addHeadersEndHandler(); fires just before the response is flushed, on all routes
4. Hard 403 block: ctx.response().setStatusCode(403).end(); immediate, non-blocking termination

Layer 1: robots.txt

Place robots.txt in src/main/resources/webroot/ and serve it via Vert.x Web's StaticHandler:

# src/main/resources/webroot/robots.txt

User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: cohere-ai
User-agent: Bytespider
User-agent: Amazonbot
User-agent: PerplexityBot
User-agent: YouBot
User-agent: Diffbot
User-agent: DeepSeekBot
User-agent: MistralBot
User-agent: xAI-Bot
User-agent: AI2Bot
Disallow: /

// MainVerticle.java: serve robots.txt BEFORE the bot blocker
import io.vertx.ext.web.Router;
import io.vertx.ext.web.handler.StaticHandler;

Router router = Router.router(vertx);

// ① robots.txt — order -2 so it fires before bot blocker (-1)
router.route("/robots.txt")
    .order(-2)
    .handler(StaticHandler.create("webroot").setIndexPage(null));
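
The robots.txt above lists 15 tokens, while the AI_BOTS list in the bot blocker handler has 19 entries, so the two can drift apart over time. One way to reduce that risk is to generate the file from a single list at build time. The following is a minimal sketch under that assumption; RobotsTxtBuilder and AI_TOKENS are hypothetical names, not part of Vert.x, and the code uses only the JDK.

```java
import java.util.List;

// Hypothetical helper (not a Vert.x class): renders robots.txt from one token list.
final class RobotsTxtBuilder {

    // Tokens copied from the robots.txt shown above; extend as needed.
    static final List<String> AI_TOKENS = List.of(
        "GPTBot", "ClaudeBot", "anthropic-ai", "Google-Extended", "CCBot",
        "cohere-ai", "Bytespider", "Amazonbot", "PerplexityBot", "YouBot",
        "Diffbot", "DeepSeekBot", "MistralBot", "xAI-Bot", "AI2Bot"
    );

    // Emits a permissive group for all agents, then one blocking group for the given tokens.
    static String build(List<String> tokens) {
        StringBuilder sb = new StringBuilder("User-agent: *\nAllow: /\n");
        if (!tokens.isEmpty()) {
            sb.append('\n');
            for (String token : tokens) {
                sb.append("User-agent: ").append(token).append('\n');
            }
            sb.append("Disallow: /\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print the generated file; redirect to src/main/resources/webroot/robots.txt if desired.
        System.out.print(build(AI_TOKENS));
    }
}
```

The output could be written to src/main/resources/webroot/robots.txt during the build, so the served file and the handler's blocklist start from the same source.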

Layers 2, 3 & 4: Bot blocker handler

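Write the bot blocker as a Handler<RoutingContext>. Register it with order(-1) so it fires before all route handlers: routes without an explicit order are numbered 0, 1, 2, ... in the order they are added, so any negative order runs first.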

// AiBotBlocker.java
package com.example.middleware;

import io.vertx.core.Handler;
import io.vertx.ext.web.RoutingContext;
import java.util.List;
import java.util.Locale;

public class AiBotBlocker implements Handler<RoutingContext> {

    private static final List<String> AI_BOTS = List.of(
        "gptbot", "chatgpt-user", "claudebot", "anthropic-ai",
        "ccbot", "cohere-ai", "bytespider", "amazonbot",
        "applebot-extended", "perplexitybot", "youbot", "diffbot",
        "google-extended", "deepseekbot", "mistralbot", "xai-bot",
        "ai2bot", "oai-searchbot", "duckassistbot"
    );

    @Override
    public void handle(RoutingContext ctx) {
        String ua = ctx.request().getHeader("User-Agent");
        // Locale.ROOT keeps lowercasing independent of the JVM's default locale
        String uaLower = (ua != null) ? ua.toLowerCase(Locale.ROOT) : "";

        // Layer 2: set noai directive for template engines
        ctx.put("robots", "noai, noimageai");

        // Layer 3: X-Robots-Tag on every response (fires before flush)
        ctx.addHeadersEndHandler(v ->
            ctx.response().putHeader("X-Robots-Tag", "noai, noimageai")
        );

        // Layer 4: hard 403 for AI bots
        if (AI_BOTS.stream().anyMatch(uaLower::contains)) {
            ctx.response()
                .setStatusCode(403)
                .putHeader("Content-Type", "text/plain")
                .end("Forbidden: AI crawlers are not permitted.");
            return; // Do NOT call ctx.next() — short-circuit
        }

        ctx.next(); // Pass to next handler
    }
}

Register on the Router:

// MainVerticle.java
import com.example.middleware.AiBotBlocker;

Router router = Router.router(vertx);

// ① robots.txt (order -2)
router.route("/robots.txt")
    .order(-2)
    .handler(StaticHandler.create("webroot").setIndexPage(null));

// ② Bot blocker on ALL routes (order -1)
router.route()
    .order(-1)
    .handler(new AiBotBlocker());

// ③ Your application routes (default orders start at 0, in registration order)
router.get("/api/articles").handler(this::getArticles);
router.get("/api/posts").handler(this::getPosts);

// Start HTTP server
vertx.createHttpServer()
    .requestHandler(router)
    .listen(8080);

addHeadersEndHandler — set headers after all handlers run

ctx.addHeadersEndHandler() registers a callback that fires just before response headers are flushed to the wire. This means X-Robots-Tag is set regardless of which route handler produced the response — even if a downstream handler calls ctx.response().end() without setting it. This is Vert.x's equivalent of Koa's post-await next() pattern.

Route ordering

Vert.x matches routes in ascending order() value. A route without an explicit order takes the next number in registration sequence (0, 1, 2, ...), so use negative orders for middleware that must fire before application routes:

// Execution order:
// 1. robots.txt handler    (order -2)   : serves robots.txt, no ctx.next()
// 2. AiBotBlocker          (order -1)   : blocks bots or calls ctx.next()
// 3. Application routes    (order >= 0) : your actual handlers, in registration order

// Routes that share an explicit order fire in registration order.
// Use explicit ordering to guarantee middleware fires first.

router.route().order(-1).handler(new AiBotBlocker());  // Always fires before:
router.get("/api/articles").handler(this::getArticles); // this route (default order, 0 or higher)

ctx.next() vs ctx.response().end()

ctx.next(): pass the request to the next matching handler (continue the chain).
ctx.response().end(): send the response immediately. No further handlers fire as long as the handler returns without calling ctx.next().
The bot blocker uses .end() for bots (short-circuit) and ctx.next() for legitimate traffic (pass through).
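
To make the interplay of ordering and short-circuiting concrete, here is a toy model written against the JDK only. It is not Vert.x's Router implementation: OrderedChain is a hypothetical class in which each handler returns true to continue (standing in for ctx.next()) or false to stop (standing in for ending the response).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

// Toy model only: mimics order-based dispatch, not Vert.x's actual Router.
final class OrderedChain {

    private static final class Entry {
        final int order;
        final String name;
        final Predicate<String> handler; // true = continue, false = response ended

        Entry(int order, String name, Predicate<String> handler) {
            this.order = order;
            this.name = name;
            this.handler = handler;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    OrderedChain add(int order, String name, Predicate<String> handler) {
        entries.add(new Entry(order, name, handler));
        return this;
    }

    // Runs handlers in ascending order and stops at the first one that returns false.
    List<String> dispatch(String userAgent) {
        List<Entry> sorted = new ArrayList<>(entries);
        // List.sort is stable, so entries with equal order keep registration order
        sorted.sort(Comparator.comparingInt(e -> e.order));
        List<String> fired = new ArrayList<>();
        for (Entry e : sorted) {
            fired.add(e.name);
            if (!e.handler.test(userAgent)) {
                break; // short-circuit: nothing after this handler runs
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        OrderedChain chain = new OrderedChain()
            .add(0, "articles", ua -> false) // application route ends the response
            .add(-1, "botBlocker",
                ua -> !ua.toLowerCase(Locale.ROOT).contains("gptbot"));
        System.out.println(chain.dispatch("GPTBot/1.0"));
        System.out.println(chain.dispatch("Mozilla/5.0"));
    }
}
```

Even though the blocker is added second, its lower order puts it first; for a GPTBot user agent only the blocker fires, while other traffic reaches the articles handler.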

Sub-router isolation (selective blocking)

To protect only specific path prefixes, mount the bot blocker on a sub-router instead of the main router:

// Only block AI bots on /api/content/* — leave /api/public/* open
Router mainRouter = Router.router(vertx);
Router protectedRouter = Router.router(vertx);

// Protected sub-router gets the bot blocker
protectedRouter.route().handler(new AiBotBlocker());
protectedRouter.get("/articles").handler(this::getArticles);
protectedRouter.get("/posts").handler(this::getPosts);

// Mount under /api/content
mainRouter.route("/api/content/*").subRouter(protectedRouter);

// Public routes — no bot blocker
mainRouter.get("/api/public/health").handler(this::healthCheck);
mainRouter.get("/api/public/status").handler(this::getStatus);

// Alternative: path-specific handler (no sub-router)
mainRouter.route("/api/content/*")
    .order(-1)
    .handler(new AiBotBlocker());

Regex route blocking

For fine-grained path matching, use routeWithRegex():

// Block AI bots only on article and post endpoints
router.routeWithRegex("/api/(articles|posts|content)/.*")
    .order(-1)
    .handler(new AiBotBlocker());

// Block on all GET requests only (not POST/PUT/DELETE); needs import io.vertx.core.http.HttpMethod
router.route()
    .method(HttpMethod.GET)
    .order(-1)
    .handler(new AiBotBlocker());
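
Before wiring a regex into the router, it can help to check which paths it actually covers. The sketch below does that with java.util.regex alone, assuming the route pattern is matched against the whole request path; RoutePatternCheck is a hypothetical helper, not a Vert.x class.

```java
import java.util.regex.Pattern;

// Stand-alone check of the route pattern, independent of Vert.x.
final class RoutePatternCheck {

    static final Pattern PROTECTED =
        Pattern.compile("/api/(articles|posts|content)/.*");

    // Whole-path match (assumption: this mirrors how the regex route is evaluated).
    static boolean isProtected(String path) {
        return PROTECTED.matcher(path).matches();
    }

    public static void main(String[] args) {
        String[] paths = {"/api/articles/42", "/api/articles", "/api/public/health"};
        for (String p : paths) {
            System.out.println(p + " -> " + isProtected(p));
        }
    }
}
```

Note that the bare path /api/articles does not match, because the pattern requires a slash after the resource name. If endpoints such as /api/articles and /api/posts should be covered as well, a pattern like /api/(articles|posts|content)(/.*)? is one option.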

EventBus service blocking (clustered Vert.x)

In clustered deployments, HTTP requests may be forwarded to EventBus consumers (service proxy pattern). Pass the User-Agent in DeliveryOptions headers:

// HTTP handler → forwards to EventBus consumer
router.get("/api/articles").handler(ctx -> {
    String ua = ctx.request().getHeader("User-Agent");
    DeliveryOptions options = new DeliveryOptions()
        .addHeader("user-agent", ua != null ? ua : "");

    vertx.eventBus()
        .request("articles.list", null, options)
        .onSuccess(reply -> {
            ctx.response()
                .putHeader("Content-Type", "application/json")
                .end(reply.body().toString());
        })
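        .onFailure(err -> {
            // message.fail(403, ...) arrives as a ReplyException; check its failure code
            // rather than the message text (needs import io.vertx.core.eventbus.ReplyException)
            if (err instanceof ReplyException
                    && ((ReplyException) err).failureCode() == 403) {
                ctx.response().setStatusCode(403).end("Forbidden");
            } else {
                ctx.fail(500);
            }
        });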
});

// EventBus consumer — checks UA from DeliveryOptions headers
vertx.eventBus().consumer("articles.list", message -> {
    String ua = message.headers().get("user-agent");
    String uaLower = (ua != null) ? ua.toLowerCase() : "";

    if (AI_BOTS.stream().anyMatch(uaLower::contains)) {
        message.fail(403, "Forbidden: AI crawlers are not permitted.");
        return;
    }

    // Normal processing...
    message.reply(getArticlesJson());
});

When to use EventBus blocking?

Only if your architecture uses the service proxy pattern (HTTP → EventBus → consumer). For standard Vert.x Web apps, the Router handler is sufficient. EventBus blocking adds defense-in-depth for clustered microservice deployments where a compromised node might bypass the HTTP layer.

Kotlin variant (Vert.x Kotlin DSL)

Vert.x has first-class Kotlin support with coroutine extensions:

// AiBotBlocker.kt
package com.example.middleware

import io.vertx.ext.web.RoutingContext

private val AI_BOTS = listOf(
    "gptbot", "chatgpt-user", "claudebot", "anthropic-ai",
    "ccbot", "cohere-ai", "bytespider", "amazonbot",
    "applebot-extended", "perplexitybot", "youbot", "diffbot",
    "google-extended", "deepseekbot", "mistralbot", "xai-bot",
    "ai2bot", "oai-searchbot", "duckassistbot"
)

fun aiBotBlocker(ctx: RoutingContext) {
    val ua = ctx.request().getHeader("User-Agent")?.lowercase() ?: ""

    // X-Robots-Tag on every response
    ctx.addHeadersEndHandler {
        ctx.response().putHeader("X-Robots-Tag", "noai, noimageai")
    }

    // Hard 403 for AI bots
    if (AI_BOTS.any { ua.contains(it) }) {
        ctx.response()
            .setStatusCode(403)
            .putHeader("Content-Type", "text/plain")
            .end("Forbidden: AI crawlers are not permitted.")
        return
    }

    ctx.next()
}

// Registration:
// router.route().order(-1).handler(::aiBotBlocker)

Vert.x vs Spring Boot vs Micronaut vs Quarkus — comparison

Vert.x — Handler<RoutingContext>

// Non-blocking handler with explicit ordering
router.route().order(-1).handler(ctx -> {
    String ua = ctx.request().getHeader("User-Agent");
    if (ua != null && isAiBot(ua.toLowerCase())) {
        ctx.response().setStatusCode(403).end("Forbidden");
        return;
    }
    ctx.next();
});

Spring Boot — FilterRegistrationBean

// Blocking Servlet filter with bean lifecycle
@Bean
FilterRegistrationBean<AiBotFilter> aiBotFilter() {
    var reg = new FilterRegistrationBean<>(new AiBotFilter());
    reg.setOrder(Ordered.HIGHEST_PRECEDENCE);
    reg.addUrlPatterns("/*");
    return reg;
}

Micronaut — HttpServerFilter

// Annotation-driven reactive filter
@Filter("/**")
public class AiBotFilter implements HttpServerFilter {
    @Override
    public Publisher<MutableHttpResponse<?>> doFilter(
        HttpRequest<?> request, ServerFilterChain chain) {
        if (isAiBot(request.getHeaders().get("User-Agent"))) {
            return Mono.just(HttpResponse.status(HttpStatus.FORBIDDEN));
        }
        return chain.proceed(request);
    }
}

Quarkus — @ServerRequestFilter

// CDI-managed request filter
@ServerRequestFilter
public void blockAiBots(ContainerRequestContext ctx) {
    String ua = ctx.getHeaderString("User-Agent");
    if (ua != null && isAiBot(ua.toLowerCase())) {
        ctx.abortWith(Response.status(403)
            .entity("Forbidden").build());
    }
}

Vert.x and Micronaut both express the filter without blocking: Vert.x through ctx.next(), Micronaut by returning a Publisher. Spring Boot's Filter is Servlet-based (thread-per-request model), and Quarkus uses the JAX-RS ContainerRequestContext. What sets Vert.x apart is plain handler ordering: no annotation magic, no DI framework required.

noai meta tag (template engines)

If you use Vert.x Web's template engines (Thymeleaf, Pebble, Handlebars), the ctx.put("robots", ...) value is available in templates:

<!-- Thymeleaf template -->
<head>
  <meta name="robots" th:content="${robots}" />
</head>

<!-- Pebble template -->
<head>
  <meta name="robots" content="{{ robots }}" />
</head>

<!-- Handlebars template -->
<head>
  <meta name="robots" content="{{robots}}" />
</head>

The bot blocker sets ctx.put("robots", "noai, noimageai") on every request — template engines automatically pick it up from the RoutingContext data map.

Testing

Use Vert.x's WebClient with @ExtendWith(VertxExtension.class) for non-blocking test assertions:

// AiBotBlockerTest.java
import io.vertx.core.Vertx;
import io.vertx.ext.web.client.WebClient;
import io.vertx.junit5.VertxExtension;
import io.vertx.junit5.VertxTestContext;
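import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;

import static org.junit.jupiter.api.Assertions.*;

@ExtendWith(VertxExtension.class)
class AiBotBlockerTest {

    // Assumption: MainVerticle (same package) starts the HTTP server on port 8080
    // and completes its start only once the server is listening.
    @BeforeEach
    void deployServer(Vertx vertx, VertxTestContext ctx) {
        vertx.deployVerticle(new MainVerticle())
            .onComplete(ctx.succeedingThenComplete());
    }
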
    @Test
    void blocksGPTBot(Vertx vertx, VertxTestContext ctx) {
        WebClient client = WebClient.create(vertx);
        client.get(8080, "localhost", "/api/articles")
            .putHeader("User-Agent", "GPTBot/1.0")
            .send()
            .onComplete(ctx.succeeding(res -> ctx.verify(() -> {
                assertEquals(403, res.statusCode());
                ctx.completeNow();
            })));
    }

    @Test
    void blocksClaudeBot(Vertx vertx, VertxTestContext ctx) {
        WebClient client = WebClient.create(vertx);
        client.get(8080, "localhost", "/api/articles")
            .putHeader("User-Agent", "ClaudeBot/2.0")
            .send()
            .onComplete(ctx.succeeding(res -> ctx.verify(() -> {
                assertEquals(403, res.statusCode());
                ctx.completeNow();
            })));
    }

    @Test
    void allowsBrowserWithHeaders(Vertx vertx, VertxTestContext ctx) {
        WebClient client = WebClient.create(vertx);
        client.get(8080, "localhost", "/api/articles")
            .putHeader("User-Agent", "Mozilla/5.0 (compatible browser)")
            .send()
            .onComplete(ctx.succeeding(res -> ctx.verify(() -> {
                assertEquals(200, res.statusCode());
                assertEquals("noai, noimageai",
                    res.getHeader("X-Robots-Tag"));
                ctx.completeNow();
            })));
    }

    @Test
    void servesRobotsTxt(Vertx vertx, VertxTestContext ctx) {
        WebClient client = WebClient.create(vertx);
        client.get(8080, "localhost", "/robots.txt")
            .putHeader("User-Agent", "GPTBot/1.0")
            .send()
            .onComplete(ctx.succeeding(res -> ctx.verify(() -> {
                // robots.txt served even to blocked bots (order -2 < -1)
                assertEquals(200, res.statusCode());
                assertTrue(res.bodyAsString().contains("Disallow: /"));
                ctx.completeNow();
            })));
    }
}

AI bot User-Agent strings (2026)

GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, CCBot, cohere-ai, Bytespider, Amazonbot, Applebot-Extended, PerplexityBot, YouBot, Diffbot, Google-Extended, FacebookBot, omgili, omgilibot, DeepSeekBot, MistralBot, xAI-Bot, AI2Bot

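Vert.x exposes headers via ctx.request().getHeader("User-Agent"). The header-name lookup is case-insensitive, but the returned value keeps whatever casing the client sent, so lowercase it (ideally with Locale.ROOT) before matching. In EventBus consumers, read it with message.headers().get("user-agent"). Note that Google-Extended and Applebot-Extended are robots.txt control tokens rather than User-Agent strings sent by a separate crawler, so they only take effect through Layer 1.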
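
The matching step can be pulled out into a pure function, which makes it easy to unit test without starting a server. The sketch below is one possible shape, using only the JDK; AiBotMatcher is a hypothetical name, and the list is copied from the AiBotBlocker handler. It lowercases with Locale.ROOT because the no-argument toLowerCase() follows the JVM's default locale: under a Turkish locale, for example, an uppercase "I" becomes a dotless "ı", so "AI2Bot" would no longer match "ai2bot".

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: the substring check from AiBotBlocker as a pure function.
final class AiBotMatcher {

    private static final List<String> AI_BOTS = List.of(
        "gptbot", "chatgpt-user", "claudebot", "anthropic-ai",
        "ccbot", "cohere-ai", "bytespider", "amazonbot",
        "applebot-extended", "perplexitybot", "youbot", "diffbot",
        "google-extended", "deepseekbot", "mistralbot", "xai-bot",
        "ai2bot", "oai-searchbot", "duckassistbot"
    );

    static boolean isAiBot(String userAgent) {
        if (userAgent == null || userAgent.isEmpty()) {
            return false; // a missing header is treated as "not a known bot" here
        }
        // Locale.ROOT keeps the result independent of the JVM's default locale
        String ua = userAgent.toLowerCase(Locale.ROOT);
        return AI_BOTS.stream().anyMatch(ua::contains);
    }

    public static void main(String[] args) {
        System.out.println(isAiBot("Mozilla/5.0 (compatible; GPTBot/1.0)"));
        System.out.println(isAiBot("Mozilla/5.0 (X11; Linux x86_64) Firefox/124.0"));
    }
}
```

A handler could then call AiBotMatcher.isAiBot(ctx.request().getHeader("User-Agent")) in place of the inline stream expression.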
