
How to Block AI Bots on Spring Boot: Complete 2026 Guide

Spring Boot's static resource handler serves robots.txt with zero configuration — just drop it in src/main/resources/static/. For hard blocking, you have two clean options: a HandlerInterceptor in the MVC layer or a OncePerRequestFilter at the servlet layer — both intercept before any controller code runs.

8 min read·Updated April 2026·Spring Boot 3.x / Java 17+

Methods overview

| Method | Best when |
| --- | --- |
| robots.txt in src/main/resources/static/ | Always — zero config needed |
| HandlerInterceptor (MVC layer) | No Spring Security dependency |
| OncePerRequestFilter (servlet layer) | Spring Security already in project |
| noai meta tag in Thymeleaf layout | Thymeleaf template engine |
| nginx reverse proxy block | nginx in front of embedded Tomcat |

1. robots.txt in src/main/resources/static/

Spring Boot's auto-configured ResourceHttpRequestHandler serves everything in src/main/resources/static/ at the root URL. No @Controller, no @RequestMapping — just drop the file and it's available at /robots.txt.

src/main/resources/static/robots.txt (served at /robots.txt automatically)
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /

User-agent: *
Allow: /
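A quick sanity check (for example as a CI step) that the bots you care about are actually listed. This sketch greps an inline copy of the file; in a real build you would point it at src/main/resources/static/robots.txt instead:

```shell
# Sanity-check that key AI crawlers have a User-agent group in robots.txt.
# Inline copy used here; replace with: ROBOTS=$(cat src/main/resources/static/robots.txt)
ROBOTS='User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /'

for bot in GPTBot ClaudeBot; do
  if printf '%s\n' "$ROBOTS" | grep -q "^User-agent: $bot$"; then
    echo "$bot listed"
  else
    echo "$bot MISSING"
  fi
done
```

The anchored pattern (`^...$`) avoids false positives from a bot name appearing inside a comment or URL.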

Dynamic robots.txt via controller (optional)

If you need environment-specific rules (e.g., block all crawlers in staging), expose a controller endpoint instead of a static file:

RobotsController.java
package com.example.app.web;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RobotsController {

    @Value("${spring.profiles.active:default}")
    private String activeProfile;

    private static final String AI_BOTS_DISALLOW = """
            User-agent: GPTBot
            Disallow: /

            User-agent: ChatGPT-User
            Disallow: /

            User-agent: ClaudeBot
            Disallow: /

            User-agent: anthropic-ai
            Disallow: /

            User-agent: Google-Extended
            Disallow: /

            User-agent: Bytespider
            Disallow: /

            User-agent: CCBot
            Disallow: /

            User-agent: PerplexityBot
            Disallow: /

            User-agent: Diffbot
            Disallow: /

            User-agent: *
            Allow: /
            """;

    private static final String BLOCK_ALL = """
            User-agent: *
            Disallow: /
            """;

    @GetMapping(value = "/robots.txt", produces = MediaType.TEXT_PLAIN_VALUE)
    public ResponseEntity<String> robots() {
        String body = "production".equals(activeProfile)
                ? AI_BOTS_DISALLOW
                : BLOCK_ALL;
        return ResponseEntity.ok(body);
    }
}
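The profile switch is easy to get backwards, so here is a stand-alone sketch of the same ternary (hypothetical class RobotsBodySwitch, with abbreviated bodies) showing that any non-production profile falls through to the block-all response:

```java
public class RobotsBodySwitch {

    // Abbreviated versions of the controller's text blocks
    static final String AI_BOTS_DISALLOW = """
            User-agent: GPTBot
            Disallow: /

            User-agent: *
            Allow: /
            """;

    static final String BLOCK_ALL = """
            User-agent: *
            Disallow: /
            """;

    // Same ternary the controller uses: only "production" serves the allow-list
    static String bodyFor(String activeProfile) {
        return "production".equals(activeProfile) ? AI_BOTS_DISALLOW : BLOCK_ALL;
    }

    public static void main(String[] args) {
        System.out.println(bodyFor("staging").startsWith("User-agent: *"));  // block-all: true
        System.out.println(bodyFor("production").contains("GPTBot"));        // allow-list: true
    }
}
```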

Static vs controller: if both src/main/resources/static/robots.txt and a @GetMapping("/robots.txt") controller exist, the controller mapping takes precedence, because the static resource handler is registered as the lowest-priority fallback. Use one or the other.

2. HandlerInterceptor (MVC layer)

A HandlerInterceptor fires after the DispatcherServlet has mapped the request to a handler, but before any @Controller method runs. Return false from preHandle() to stop the chain.

Step 1 — write the interceptor

AiBotBlockingInterceptor.java
package com.example.app.config;

import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;

import java.util.regex.Pattern;

@Component
public class AiBotBlockingInterceptor implements HandlerInterceptor {

    private static final Pattern BLOCKED_UAS = Pattern.compile(
        "GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|" +
        "Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|" +
        "Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|" +
        "cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|" +
        "webzio-extended|gemini-deep-research",
        Pattern.CASE_INSENSITIVE
    );

    @Override
    public boolean preHandle(HttpServletRequest request,
                             HttpServletResponse response,
                             Object handler) throws Exception {
        // Always allow robots.txt through
        if ("/robots.txt".equals(request.getRequestURI())) {
            return true;
        }

        String userAgent = request.getHeader("User-Agent");
        if (userAgent != null && BLOCKED_UAS.matcher(userAgent).find()) {
            response.sendError(HttpServletResponse.SC_FORBIDDEN, "Forbidden");
            return false;
        }

        return true;
    }
}
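Real crawler User-Agent strings typically embed the bot token mid-string (GPTBot's UA, for instance, ends with something like `compatible; GPTBot/1.2; +https://openai.com/gptbot`), which is why the interceptor uses find() rather than matches(). A stand-alone check of that behavior (hypothetical class, abbreviated pattern):

```java
import java.util.regex.Pattern;

public class BlockedUaCheck {

    // Abbreviated version of the interceptor's pattern
    static final Pattern BLOCKED_UAS = Pattern.compile(
        "GPTBot|ChatGPT-User|ClaudeBot|CCBot|Bytespider|PerplexityBot",
        Pattern.CASE_INSENSITIVE);

    // Mirrors the null-guard and substring match in preHandle()
    static boolean isBlocked(String userAgent) {
        return userAgent != null && BLOCKED_UAS.matcher(userAgent).find();
    }

    public static void main(String[] args) {
        // Token buried mid-string: find() catches it, matches() would not
        System.out.println(isBlocked(
            "Mozilla/5.0 AppleWebKit/537.36; compatible; GPTBot/1.2; +https://openai.com/gptbot"));
        // Ordinary browser UA: allowed through
        System.out.println(isBlocked(
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/125.0"));
        // Missing header: allowed through (the null-guard matters)
        System.out.println(isBlocked(null));
    }
}
```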

Step 2 — register via WebMvcConfigurer

WebConfig.java
package com.example.app.config;

import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebConfig implements WebMvcConfigurer {

    private final AiBotBlockingInterceptor aiBotBlockingInterceptor;

    public WebConfig(AiBotBlockingInterceptor aiBotBlockingInterceptor) {
        this.aiBotBlockingInterceptor = aiBotBlockingInterceptor;
    }

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(aiBotBlockingInterceptor)
                .addPathPatterns("/**")
                .excludePathPatterns("/robots.txt", "/favicon.ico");
    }
}

3. OncePerRequestFilter (servlet layer)

OncePerRequestFilter runs at the servlet filter level — earlier in the stack than HandlerInterceptor, before DispatcherServlet. Preferred if you already have Spring Security, since it fits naturally into the SecurityFilterChain.

The filter class

AiBotBlockingFilter.java
package com.example.app.security;

import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;
import java.util.regex.Pattern;

public class AiBotBlockingFilter extends OncePerRequestFilter {

    private static final Pattern BLOCKED_UAS = Pattern.compile(
        "GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|" +
        "Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|" +
        "Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|" +
        "cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|" +
        "webzio-extended|gemini-deep-research",
        Pattern.CASE_INSENSITIVE
    );

    @Override
    protected boolean shouldNotFilter(HttpServletRequest request) {
        // Skip filter for robots.txt — always let it through
        return "/robots.txt".equals(request.getRequestURI());
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain filterChain)
            throws ServletException, IOException {
        String userAgent = request.getHeader("User-Agent");
        if (userAgent != null && BLOCKED_UAS.matcher(userAgent).find()) {
            response.sendError(HttpServletResponse.SC_FORBIDDEN, "Forbidden");
            return;
        }
        filterChain.doFilter(request, response);
    }
}

Register in SecurityFilterChain (Spring Security 6)

SecurityConfig.java
package com.example.app.security;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.web.SecurityFilterChain;
import org.springframework.security.web.authentication.UsernamePasswordAuthenticationFilter;

@Configuration
@EnableWebSecurity
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            .addFilterBefore(
                new AiBotBlockingFilter(),
                UsernamePasswordAuthenticationFilter.class
            )
            .authorizeHttpRequests(auth -> auth
                .requestMatchers("/robots.txt", "/public/**").permitAll()
                .anyRequest().authenticated()
            );
        return http.build();
    }
}

Register as a bean without Spring Security

FilterConfig.java — no Spring Security required
package com.example.app.config;

import com.example.app.security.AiBotBlockingFilter;
import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FilterConfig {

    @Bean
    public FilterRegistrationBean<AiBotBlockingFilter> aiBotBlockingFilter() {
        FilterRegistrationBean<AiBotBlockingFilter> registration =
            new FilterRegistrationBean<>();
        registration.setFilter(new AiBotBlockingFilter());
        registration.addUrlPatterns("/*");
        registration.setOrder(1); // runs first
        return registration;
    }
}
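Whichever registration route you pick, the core check can be exercised without booting Spring at all. A JDK-only sketch (hypothetical class UaBlockDemo, abbreviated bot list) that stands in for the filter, using the built-in com.sun.net.httpserver and java.net.http:

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Pattern;

public class UaBlockDemo {

    static final Pattern BLOCKED_UAS =
        Pattern.compile("GPTBot|ClaudeBot|CCBot|Bytespider", Pattern.CASE_INSENSITIVE);

    // Start a throwaway server that applies the same check the filter does,
    // probe it once with the given User-Agent, and return the status code.
    static int statusFor(String userAgent) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            boolean blocked = ua != null && BLOCKED_UAS.matcher(ua).find();
            exchange.sendResponseHeaders(blocked ? 403 : 200, -1); // -1 = no body
            exchange.close();
        });
        server.start();
        try {
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/"))
                .header("User-Agent", userAgent)
                .build();
            return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding())
                .statusCode();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(statusFor("Mozilla/5.0; compatible; GPTBot/1.2"));         // 403
        System.out.println(statusFor("Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0")); // 200
    }
}
```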

4. noai meta tag in Thymeleaf layout

Add the noai meta tag to your base Thymeleaf layout template. All pages that extend the layout inherit it automatically. Use a named fragment for per-page override.

Base layout with noai — Thymeleaf Layout Dialect

templates/layout/base.html
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org"
      xmlns:layout="http://www.ultraq.net.nz/thymeleaf/layout">
<head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />

    <!-- Block AI training bots on all pages -->
    <meta name="robots" content="noai, noimageai" />

    <!-- Per-page head content (title, description, etc.) -->
    <th:block layout:fragment="head-extra"></th:block>
</head>
<body>
    <div layout:fragment="content"></div>
</body>
</html>

Per-page override (allow AI on specific pages)

templates/pages/public-article.html
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org"
      xmlns:layout="http://www.ultraq.net.nz/thymeleaf/layout"
      layout:decorate="~{layout/base}">
<head>
    <!-- Override: allow AI on this specific public article -->
    <th:block layout:fragment="head-extra">
        <meta name="robots" content="index, follow" />
    </th:block>
</head>
<body>
    <div layout:fragment="content">
        <h1 th:text="${article.title}">Article Title</h1>
        <!-- content -->
    </div>
</body>
</html>

Without Layout Dialect (th:replace fragment)

templates/fragments/head.html
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head th:fragment="head(title)">
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta name="robots" content="noai, noimageai" />
    <title th:text="${title}">My App</title>
</head>
</html>

<!-- In page templates: -->
<!-- <head th:replace="~{fragments/head :: head('Page Title')}"></head> -->

5. nginx reverse proxy block

In production, Spring Boot typically runs behind nginx. Block AI bots at the nginx level before any request reaches the JVM — cheaper compute and stops bots that ignore robots.txt.

/etc/nginx/sites-available/spring-app
# In http { } block of /etc/nginx/nginx.conf:
map $http_user_agent $blocked_ai_bot {
    default                  0;
    "~*GPTBot"               1;
    "~*ChatGPT-User"         1;
    "~*OAI-SearchBot"        1;
    "~*ClaudeBot"            1;
    "~*anthropic-ai"         1;
    "~*Google-Extended"      1;
    "~*Bytespider"           1;
    "~*CCBot"                1;
    "~*PerplexityBot"        1;
    "~*meta-externalagent"   1;
    "~*Amazonbot"            1;
    "~*Applebot-Extended"    1;
    "~*xAI-Bot"              1;
    "~*DeepSeekBot"          1;
    "~*MistralBot"           1;
    "~*Diffbot"              1;
    "~*cohere-ai"            1;
    "~*AI2Bot"               1;
    "~*Ai2Bot-Dolma"         1;
    "~*omgili"               1;
    "~*omgilibot"            1;
    "~*webzio-extended"      1;
    "~*gemini-deep-research" 1;
}

server {
    listen 80;
    server_name myapp.com www.myapp.com;

    # Block AI bots before hitting Spring Boot
    if ($blocked_ai_bot) {
        return 403;
    }

    # noai header on all responses
    add_header X-Robots-Tag "noai, noimageai" always;

    # Proxy to embedded Tomcat
    location / {
        proxy_pass         http://127.0.0.1:8080;
        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
    }
}
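nginx's ~* operator is a case-insensitive regex match, so grep -qiE approximates it closely enough to sanity-check the pattern from a shell before touching nginx (hypothetical check_ua helper, abbreviated pattern):

```shell
# Mirror nginx's case-insensitive "~*" match with grep -qiE so the blocklist
# pattern can be checked without reloading nginx.
PATTERN='GPTBot|ClaudeBot|CCBot|Bytespider|PerplexityBot'

check_ua() {
  if printf '%s' "$1" | grep -qiE "$PATTERN"; then
    echo blocked
  else
    echo allowed
  fi
}

# Lowercase token on purpose: confirms the match is case-insensitive
check_ua 'Mozilla/5.0; compatible; gptbot/1.2; +https://openai.com/gptbot'
check_ua 'Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0'
```

After editing the real config, validate and reload with `nginx -t` before applying.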

Kubernetes nginx ingress annotation

ingress.yaml — nginx ingress controller
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: spring-app-ingress
  annotations:
    nginx.ingress.kubernetes.io/server-snippet: |
      if ($http_user_agent ~* "GPTBot|ClaudeBot|CCBot|Bytespider|Diffbot|Google-Extended|anthropic-ai|meta-externalagent|cohere-ai|AI2Bot|DeepSeekBot|MistralBot") {
        return 403;
      }
      add_header X-Robots-Tag "noai, noimageai" always;
spec:
  rules:
    - host: myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: spring-app-service
                port:
                  number: 8080

Blocking layer comparison

| Layer | Runs where |
| --- | --- |
| HandlerInterceptor | After DispatcherServlet |
| OncePerRequestFilter | Before DispatcherServlet |
| nginx map + if | Before JVM |
| K8s ingress snippet | Before pods |

AI bots to block on Spring Boot applications

| Bot | Operator |
| --- | --- |
| GPTBot | OpenAI |
| ChatGPT-User | OpenAI |
| OAI-SearchBot | OpenAI |
| ClaudeBot | Anthropic |
| anthropic-ai | Anthropic |
| Google-Extended | Google |
| Bytespider | ByteDance |
| CCBot | Common Crawl |
| PerplexityBot | Perplexity |
| meta-externalagent | Meta |
| Amazonbot | Amazon |
| Applebot-Extended | Apple |
| xAI-Bot | xAI |
| DeepSeekBot | DeepSeek |
| MistralBot | Mistral |
| Diffbot | Diffbot |
| cohere-ai | Cohere |
| AI2Bot | Allen Institute |
| Ai2Bot-Dolma | Allen Institute |
| YouBot | You.com |
| DuckAssistBot | DuckDuckGo |
| omgili | Webz.io |
| omgilibot | Webz.io |
| webzio-extended | Webz.io |
| gemini-deep-research | Google |

FAQ

Where do I put robots.txt in a Spring Boot project?

In src/main/resources/static/robots.txt. Spring Boot's auto-configured ResourceHttpRequestHandler serves everything in static/ at the root URL — no controller needed.

HandlerInterceptor vs OncePerRequestFilter — which should I use?

If you have Spring Security: use OncePerRequestFilter and register it in your SecurityFilterChain. If you don't have Spring Security: use HandlerInterceptor via WebMvcConfigurer.addInterceptors() — fewer dependencies.

Does Spring Boot have a built-in way to block bots by user agent?

No. There is no application.properties setting for user-agent blocking. You need to write a filter or interceptor, or block at the nginx/ingress layer.

How do I add noai meta tags to all pages in a Spring Boot Thymeleaf app?

Add <meta name="robots" content="noai, noimageai"> to your base layout template (Thymeleaf Layout Dialect) or a shared th:fragment. Pages that extend the layout inherit it. Override per-page with a layout:fragment block.

How do I block AI bots in a Spring Boot app deployed to Kubernetes?

Use the nginx ingress controller's nginx.ingress.kubernetes.io/server-snippet annotation to inject user-agent blocking rules at the ingress layer. This blocks before traffic reaches your pods. Alternatively, implement blocking in-application via OncePerRequestFilter — works regardless of infrastructure.

Does my Spring Boot filter also block requests in development?

It will unless you add a profile check. Use @Profile("!dev") on your FilterRegistrationBean or SecurityConfig bean to disable it in the dev profile. Or inject @Value("${spring.profiles.active}") and short-circuit the filter logic in non-production.
