Spring Boot's static resource handler serves robots.txt with zero configuration — just drop it in src/main/resources/static/. For hard blocking, you have two clean options: a HandlerInterceptor in the MVC layer or a OncePerRequestFilter at the servlet layer — both intercept before any controller code runs.
| Method | Use when |
|---|---|
| robots.txt in src/main/resources/static/ | Always — zero config needed |
| HandlerInterceptor (MVC layer) | No Spring Security dependency |
| OncePerRequestFilter (servlet layer) | Spring Security already in project |
| noai meta tag in Thymeleaf layout | Thymeleaf template engine |
| nginx reverse proxy block | nginx in front of embedded Tomcat |
Spring Boot's auto-configured ResourceHttpRequestHandler serves everything in src/main/resources/static/ at the root URL. No @Controller, no @RequestMapping — just drop the file and it's available at /robots.txt.
```txt
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /

User-agent: *
Allow: /
```
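If you would rather maintain the bot list in one place and generate the file content from it (so the same list can later feed a blocking regex or an nginx map), a minimal sketch looks like this. `RobotsTxtBuilder` and its method are illustrative names, and the list is an abbreviated subset of the full file above:

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative helper: builds robots.txt content from a single bot list.
public class RobotsTxtBuilder {

    static final List<String> AI_BOTS = List.of(
            "GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
            "anthropic-ai", "Google-Extended", "Bytespider", "CCBot",
            "PerplexityBot", "Diffbot");

    // One "User-agent: X / Disallow: /" group per bot, then allow everyone else.
    static String build() {
        String groups = AI_BOTS.stream()
                .map(bot -> "User-agent: " + bot + "\nDisallow: /\n")
                .collect(Collectors.joining("\n"));
        return groups + "\nUser-agent: *\nAllow: /\n";
    }

    public static void main(String[] args) {
        System.out.print(build());
    }
}
```

Keeping one canonical list avoids the usual drift where robots.txt, the servlet filter, and the proxy config each block a slightly different set of bots.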
If you need environment-specific rules (e.g., block all crawlers in staging), expose a controller endpoint instead of a static file:
```java
package com.example.app.web;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RobotsController {

    @Value("${spring.profiles.active:default}")
    private String activeProfile;

    private static final String AI_BOTS_DISALLOW = """
            User-agent: GPTBot
            Disallow: /
            User-agent: ChatGPT-User
            Disallow: /
            User-agent: ClaudeBot
            Disallow: /
            User-agent: anthropic-ai
            Disallow: /
            User-agent: Google-Extended
            Disallow: /
            User-agent: Bytespider
            Disallow: /
            User-agent: CCBot
            Disallow: /
            User-agent: PerplexityBot
            Disallow: /
            User-agent: Diffbot
            Disallow: /
            User-agent: *
            Allow: /
            """;

    private static final String BLOCK_ALL = """
            User-agent: *
            Disallow: /
            """;

    @GetMapping(value = "/robots.txt", produces = MediaType.TEXT_PLAIN_VALUE)
    public ResponseEntity<String> robots() {
        String body = "production".equals(activeProfile)
                ? AI_BOTS_DISALLOW
                : BLOCK_ALL;
        return ResponseEntity.ok(body);
    }
}
```

Static vs controller: if both src/main/resources/static/robots.txt and a @GetMapping("/robots.txt") controller exist, the controller takes precedence. Use one or the other.
A HandlerInterceptor fires after the DispatcherServlet receives the request but before any @Controller method runs. Return false from preHandle() to stop the chain.
```java
package com.example.app.config;

import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;

import java.util.regex.Pattern;

@Component
public class AiBotBlockingInterceptor implements HandlerInterceptor {

    private static final Pattern BLOCKED_UAS = Pattern.compile(
            "GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|" +
            "Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|" +
            "Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|" +
            "cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|" +
            "webzio-extended|gemini-deep-research",
            Pattern.CASE_INSENSITIVE
    );

    @Override
    public boolean preHandle(HttpServletRequest request,
                             HttpServletResponse response,
                             Object handler) throws Exception {
        // Always allow robots.txt through
        if ("/robots.txt".equals(request.getRequestURI())) {
            return true;
        }
        String userAgent = request.getHeader("User-Agent");
        if (userAgent != null && BLOCKED_UAS.matcher(userAgent).find()) {
            response.sendError(HttpServletResponse.SC_FORBIDDEN, "Forbidden");
            return false;
        }
        return true;
    }
}
```

Register it through a WebMvcConfigurer:

```java
package com.example.app.config;

import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebConfig implements WebMvcConfigurer {

    private final AiBotBlockingInterceptor aiBotBlockingInterceptor;

    public WebConfig(AiBotBlockingInterceptor aiBotBlockingInterceptor) {
        this.aiBotBlockingInterceptor = aiBotBlockingInterceptor;
    }

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(aiBotBlockingInterceptor)
                .addPathPatterns("/**")
                .excludePathPatterns("/robots.txt", "/favicon.ico");
    }
}
```

A OncePerRequestFilter runs at the servlet filter level — earlier in the stack than a HandlerInterceptor, before the DispatcherServlet. Prefer it if you already have Spring Security, since it fits naturally into the SecurityFilterChain.
```java
package com.example.app.security;

import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.web.filter.OncePerRequestFilter;

import java.io.IOException;
import java.util.regex.Pattern;

public class AiBotBlockingFilter extends OncePerRequestFilter {

    private static final Pattern BLOCKED_UAS = Pattern.compile(
            "GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|" +
            "Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|" +
            "Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|" +
            "cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|" +
            "webzio-extended|gemini-deep-research",
            Pattern.CASE_INSENSITIVE
    );

    @Override
    protected boolean shouldNotFilter(HttpServletRequest request) {
        // Skip filter for robots.txt — always let it through
        return "/robots.txt".equals(request.getRequestURI());
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain filterChain)
            throws ServletException, IOException {
        String userAgent = request.getHeader("User-Agent");
        if (userAgent != null && BLOCKED_UAS.matcher(userAgent).find()) {
            response.sendError(HttpServletResponse.SC_FORBIDDEN, "Forbidden");
            return;
        }
        filterChain.doFilter(request, response);
    }
}
```

With Spring Security, add the filter to the SecurityFilterChain:

```java
package com.example.app.security;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.web.SecurityFilterChain;
import org.springframework.security.web.authentication.UsernamePasswordAuthenticationFilter;

@Configuration
@EnableWebSecurity
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            .addFilterBefore(
                new AiBotBlockingFilter(),
                UsernamePasswordAuthenticationFilter.class
            )
            .authorizeHttpRequests(auth -> auth
                .requestMatchers("/robots.txt", "/public/**").permitAll()
                .anyRequest().authenticated()
            );
        return http.build();
    }
}
```

Without Spring Security, register it through a FilterRegistrationBean instead:

```java
package com.example.app.config;

import com.example.app.security.AiBotBlockingFilter;
import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FilterConfig {

    @Bean
    public FilterRegistrationBean<AiBotBlockingFilter> aiBotBlockingFilter() {
        FilterRegistrationBean<AiBotBlockingFilter> registration =
                new FilterRegistrationBean<>();
        registration.setFilter(new AiBotBlockingFilter());
        registration.addUrlPatterns("/*");
        registration.setOrder(1); // runs first
        return registration;
    }
}
```

Add the noai meta tag to your base Thymeleaf layout template. All pages that extend the layout inherit it automatically. Use a named fragment for per-page overrides.
The base layout (referenced below as ~{layout/base}):

```html
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org"
      xmlns:layout="http://www.ultraq.net.nz/thymeleaf/layout">
<head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <!-- Block AI training bots on all pages -->
    <meta name="robots" content="noai, noimageai" />
    <!-- Per-page head content (title, description, etc.) -->
    <th:block layout:fragment="head-extra"></th:block>
</head>
<body>
<div layout:fragment="content"></div>
</body>
</html>
```

A page that extends the layout and overrides the directive:

```html
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org"
      xmlns:layout="http://www.ultraq.net.nz/thymeleaf/layout"
      layout:decorate="~{layout/base}">
<head>
    <!-- Override: allow AI on this specific public article -->
    <th:block layout:fragment="head-extra">
        <meta name="robots" content="index, follow" />
    </th:block>
</head>
<body>
<div layout:fragment="content">
    <h1 th:text="${article.title}">Article Title</h1>
    <!-- content -->
</div>
</body>
</html>
```

If you don't use the Layout Dialect, a shared th:fragment works too:

```html
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head th:fragment="head(title)">
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta name="robots" content="noai, noimageai" />
    <title th:text="${title}">My App</title>
</head>
</html>
<!-- In page templates: -->
<!-- <head th:replace="~{fragments/head :: head('Page Title')}"></head> -->
```

In production, Spring Boot typically runs behind nginx. Blocking AI bots at the nginx level stops requests before they reach the JVM — cheaper compute, and it also catches bots that ignore robots.txt.
```nginx
# In the http { } block of /etc/nginx/nginx.conf:
map $http_user_agent $blocked_ai_bot {
    default 0;
    "~*GPTBot" 1;
    "~*ChatGPT-User" 1;
    "~*OAI-SearchBot" 1;
    "~*ClaudeBot" 1;
    "~*anthropic-ai" 1;
    "~*Google-Extended" 1;
    "~*Bytespider" 1;
    "~*CCBot" 1;
    "~*PerplexityBot" 1;
    "~*meta-externalagent" 1;
    "~*Amazonbot" 1;
    "~*Applebot-Extended" 1;
    "~*xAI-Bot" 1;
    "~*DeepSeekBot" 1;
    "~*MistralBot" 1;
    "~*Diffbot" 1;
    "~*cohere-ai" 1;
    "~*AI2Bot" 1;
    "~*Ai2Bot-Dolma" 1;
    "~*omgili" 1;
    "~*omgilibot" 1;
    "~*webzio-extended" 1;
    "~*gemini-deep-research" 1;
}

server {
    listen 80;
    server_name myapp.com www.myapp.com;

    # Block AI bots before they hit Spring Boot
    if ($blocked_ai_bot) {
        return 403;
    }

    # noai header on all responses
    add_header X-Robots-Tag "noai, noimageai" always;

    # Proxy to embedded Tomcat
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

On Kubernetes, the same rules can be injected through the nginx ingress controller's server-snippet annotation:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: spring-app-ingress
  annotations:
    nginx.ingress.kubernetes.io/server-snippet: |
      if ($http_user_agent ~* "GPTBot|ClaudeBot|CCBot|Bytespider|Diffbot|Google-Extended|anthropic-ai|meta-externalagent|cohere-ai|AI2Bot|DeepSeekBot|MistralBot") {
        return 403;
      }
      add_header X-Robots-Tag "noai, noimageai" always;
spec:
  rules:
    - host: myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: spring-app-service
                port:
                  number: 8080
```

| Layer | Runs where |
|---|---|
| HandlerInterceptor | After DispatcherServlet |
| OncePerRequestFilter | Before DispatcherServlet |
| nginx map + if | Before JVM |
| K8s ingress snippet | Before pods |
| Bot | Operator |
|---|---|
| GPTBot | OpenAI |
| ChatGPT-User | OpenAI |
| OAI-SearchBot | OpenAI |
| ClaudeBot | Anthropic |
| anthropic-ai | Anthropic |
| Google-Extended | Google |
| Bytespider | ByteDance |
| CCBot | Common Crawl |
| PerplexityBot | Perplexity |
| meta-externalagent | Meta |
| Amazonbot | Amazon |
| Applebot-Extended | Apple |
| xAI-Bot | xAI |
| DeepSeekBot | DeepSeek |
| MistralBot | Mistral |
| Diffbot | Diffbot |
| cohere-ai | Cohere |
| AI2Bot | Allen Institute |
| Ai2Bot-Dolma | Allen Institute |
| YouBot | You.com |
| DuckAssistBot | DuckDuckGo |
| omgili | Webz.io |
| omgilibot | Webz.io |
| webzio-extended | Webz.io |
| gemini-deep-research | Google |
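The interceptor and filter above block these bots with a shared regex, which is easy to sanity-check in isolation against realistic User-Agent strings. A standalone sketch (the class name, sample UA strings, and the abbreviated bot subset are illustrative):

```java
import java.util.regex.Pattern;

// Standalone check of the blocking regex used by the interceptor and filter.
public class BlockedUaCheck {

    // Abbreviated subset of the full bot list for illustration.
    static final Pattern BLOCKED_UAS = Pattern.compile(
            "GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|"
            + "Google-Extended|Bytespider|CCBot|PerplexityBot|Diffbot",
            Pattern.CASE_INSENSITIVE);

    static boolean isBlocked(String userAgent) {
        // find(), not matches(): real UA strings embed the bot token
        // inside a longer Mozilla/5.0-style value.
        return userAgent != null && BLOCKED_UAS.matcher(userAgent).find();
    }

    public static void main(String[] args) {
        System.out.println(isBlocked(
                "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot"));  // true
        System.out.println(isBlocked(
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0"));           // false
    }
}
```

Because the pattern is matched with find() and CASE_INSENSITIVE, a lowercase "claudebot" or a token buried mid-string is still caught, which matters since bots rarely send a bare product name.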
The robots.txt file goes in src/main/resources/static/robots.txt. Spring Boot's auto-configured ResourceHttpRequestHandler serves everything in static/ at the root URL — no controller needed.
If you have Spring Security: use OncePerRequestFilter and register it in your SecurityFilterChain. If you don't have Spring Security: use HandlerInterceptor via WebMvcConfigurer.addInterceptors() — fewer dependencies.
There is no application.properties setting for user-agent blocking. You need to write a filter or interceptor, or block at the nginx/ingress layer.
Add <meta name="robots" content="noai, noimageai"> to your base layout template (Thymeleaf Layout Dialect) or a shared th:fragment. Pages that extend the layout inherit it. Override per-page with a layout:fragment block.
Use the nginx ingress controller's nginx.ingress.kubernetes.io/server-snippet annotation to inject user-agent blocking rules at the ingress layer. This blocks before traffic reaches your pods. Alternatively, implement blocking in-application via OncePerRequestFilter — works regardless of infrastructure.
The blocking also applies in development unless you add a profile check. Use @Profile("!dev") on your FilterRegistrationBean or SecurityConfig bean to disable it in the dev profile, or inject @Value("${spring.profiles.active}") and short-circuit the filter logic outside production.
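As a sketch of the @Profile approach, here is the earlier FilterRegistrationBean with the guard added (a configuration fragment, assuming the AiBotBlockingFilter from the Spring Security section):

```java
package com.example.app.config;

import com.example.app.security.AiBotBlockingFilter;
import org.springframework.boot.web.servlet.FilterRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;

@Configuration
public class FilterConfig {

    @Bean
    @Profile("!dev") // bean is not created when the dev profile is active
    public FilterRegistrationBean<AiBotBlockingFilter> aiBotBlockingFilter() {
        FilterRegistrationBean<AiBotBlockingFilter> registration =
                new FilterRegistrationBean<>();
        registration.setFilter(new AiBotBlockingFilter());
        registration.addUrlPatterns("/*");
        return registration;
    }
}
```

With this in place, running with --spring.profiles.active=dev skips the filter entirely; every other profile gets the blocking behavior.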