
How to Block AI Bots on Traefik: Complete 2026 Guide

Traefik is a cloud-native reverse proxy popular in Docker Compose and Kubernetes deployments. Its bot-blocking model differs from nginx's and Apache's: Traefik has no built-in User-Agent blocking middleware, so the strategy is to combine Traefik's Headers middleware (for X-Robots-Tag), a robots.txt served by your upstream app, and either a Traefik plugin or application-layer middleware for hard UA blocking.

Traefik has no built-in User-Agent blocking

Unlike nginx (map $http_user_agent) and Apache (mod_rewrite), Traefik's middleware library does not include a UA-matching block. Your options: (1) Traefik plugin (e.g. traefik-plugin-bot-blocker from the plugin catalog), (2) application middleware in your upstream service (Next.js middleware.ts, Express middleware, etc.), (3) nginx sidecar in front of Traefik's upstream. X-Robots-Tag and robots.txt work cleanly from Traefik.

Methods at a glance

| Method | What it does | Where |
|---|---|---|
| robots.txt (via upstream app) | Signals bots which paths are off-limits | Your app / nginx sidecar |
| Headers middleware | Adds X-Robots-Tag to all responses | Traefik dynamic config |
| Plugin middleware | Hard UA block at the Traefik layer | Traefik plugin catalog |
| App-layer middleware | Hard UA block in your service | Next.js / Express / nginx |
| IPAllowList middleware | Allows only listed IP ranges, blocking the rest (no UA needed) | Traefik dynamic config |
| Cloudflare WAF | Rule-based UA blocking + rate limiting | Cloudflare dashboard |

1. robots.txt — serve from upstream

Traefik is a proxy — it doesn't serve static files. Serve robots.txt from your upstream application. For a dedicated static file, add a minimal nginx container to your Docker Compose stack and route /robots.txt requests to it:

# docker-compose.yml — robots.txt via nginx sidecar
services:
  traefik:
    image: traefik:v3
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.le.acme.tlschallenge=true
      - --certificatesresolvers.le.acme.email=admin@example.com
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./letsencrypt:/letsencrypt

  # Serves robots.txt at /robots.txt
  robots:
    image: nginx:alpine
    volumes:
      - ./robots.txt:/usr/share/nginx/html/robots.txt:ro
      - ./nginx-robots.conf:/etc/nginx/conf.d/default.conf:ro
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.robots.rule=Host(`example.com`) && Path(`/robots.txt`)"
      - "traefik.http.routers.robots.entrypoints=websecure"
      - "traefik.http.routers.robots.tls.certresolver=le"
      - "traefik.http.services.robots.loadbalancer.server.port=80"

  app:
    image: myapp:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app.rule=Host(`example.com`)"
      - "traefik.http.routers.app.entrypoints=websecure"
      - "traefik.http.routers.app.tls.certresolver=le"
      - "traefik.http.services.app.loadbalancer.server.port=3000"

# nginx-robots.conf — minimal config for the robots sidecar
server {
    listen 80;
    location = /robots.txt {
        root /usr/share/nginx/html;
        access_log off;
    }
}

If your upstream app already serves /robots.txt (e.g. Next.js public/robots.txt), skip the sidecar entirely.
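The sidecar above mounts ./robots.txt but the guide doesn't show its contents. A minimal sketch, using bot names from the User-Agent blocklist later in this guide (extend the list as needed — multiple User-agent lines may share one Disallow group):

```shell
# Write a minimal robots.txt disallowing common AI crawlers.
# Bot names mirror the blocklist used in the plugin/app-layer sections below.
cat > robots.txt <<'EOF'
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: PerplexityBot
User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /
EOF
```

Note that robots.txt is advisory only; crawlers that ignore it are caught by the hard-blocking methods below.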

2. X-Robots-Tag — Headers middleware

Traefik's Headers middleware adds, removes, or modifies HTTP headers on requests and responses. customResponseHeaders injects headers into every response from the upstream.

Docker Compose labels:

services:
  app:
    image: myapp:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app.rule=Host(`example.com`)"
      - "traefik.http.routers.app.entrypoints=websecure"
      - "traefik.http.routers.app.tls.certresolver=le"
      - "traefik.http.routers.app.middlewares=security-headers@docker"
      - "traefik.http.services.app.loadbalancer.server.port=3000"

      # Headers middleware — adds X-Robots-Tag to all responses
      - "traefik.http.middlewares.security-headers.headers.customresponseheaders.X-Robots-Tag=noai, noimageai"
      - "traefik.http.middlewares.security-headers.headers.customresponseheaders.X-Content-Type-Options=nosniff"

Dynamic file config (dynamic.yml with file provider):

# dynamic.yml — loaded by Traefik file provider
http:
  middlewares:
    security-headers:
      headers:
        customResponseHeaders:
          X-Robots-Tag: "noai, noimageai"
          X-Content-Type-Options: "nosniff"
          X-Frame-Options: "SAMEORIGIN"

  routers:
    app:
      rule: "Host(`example.com`)"
      entryPoints:
        - websecure
      middlewares:
        - security-headers
      service: app-service
      tls:
        certResolver: le

  services:
    app-service:
      loadBalancer:
        servers:
          - url: "http://app:3000"

Provider suffix in middleware references

When referencing a middleware from a router, include the provider suffix: security-headers@docker (defined via Docker labels), security-headers@file (defined via the YAML file provider), security-headers@kubernetescrd (Kubernetes CRD). Without a suffix, Traefik looks for the middleware in the router's own provider, which produces "middleware not found" errors whenever the middleware is defined elsewhere.
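Once the middleware is attached, you can confirm the header from the command line. The live check against your deployment is shown as a comment (example.com is a placeholder); the executed part runs the same grep against a canned HTTP response so it works offline:

```shell
# Live check (replace example.com with your host):
#   curl -sI https://example.com/ | grep -i '^x-robots-tag'
# Offline demo of the same check against a canned response:
printf 'HTTP/1.1 200 OK\r\nX-Robots-Tag: noai, noimageai\r\nContent-Type: text/html\r\n\r\n' \
  | grep -i '^x-robots-tag' | tr -d '\r'
# prints: X-Robots-Tag: noai, noimageai
```

If the header is missing, the usual culprit is a router that doesn't list the middleware, or a missing/incorrect provider suffix in the reference.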

3. User-Agent blocking — plugin middleware

Traefik's plugin system (Traefik Plugin Catalog) lets you install community middleware plugins. The traefik-plugin-bot-blocker plugin blocks requests by User-Agent before they reach your upstream.

# traefik.yml (static config) — enable the plugin
experimental:
  plugins:
    bot-blocker:
      moduleName: "github.com/bots-garden/traefik-plugin-bot-blocker"
      version: "v0.1.0"

entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"

providers:
  docker:
    exposedByDefault: false
  file:
    filename: /etc/traefik/dynamic.yml

# dynamic.yml — configure the plugin middleware
http:
  middlewares:
    block-ai-bots:
      plugin:
        bot-blocker:
          # List of User-Agent substrings to block
          botUserAgents:
            - "GPTBot"
            - "ChatGPT-User"
            - "ClaudeBot"
            - "Claude-Web"
            - "anthropic-ai"
            - "CCBot"
            - "Google-Extended"
            - "PerplexityBot"
            - "Amazonbot"
            - "Bytespider"
            - "YouBot"
            - "DuckAssistBot"
            - "meta-externalagent"
            - "MistralAI-Spider"
            - "oai-searchbot"
          blockedStatusCode: 403

  routers:
    app:
      rule: "Host(`example.com`)"
      entryPoints:
        - websecure
      middlewares:
        - block-ai-bots
        - security-headers
      service: app-service
      tls:
        certResolver: le

Plugin availability depends on the Traefik version and whether the plugin is in the official catalog. Check plugins.traefik.io for the current plugin list and versions.
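The matching such plugins perform is substring matching on the User-Agent header (an assumption about this particular plugin; check its README for exact semantics, including case sensitivity). A shell sketch of that logic, useful for reasoning about which UAs a given list catches:

```shell
# Sketch of substring-based UA blocking (assumption: a request is blocked
# when any configured string appears anywhere in its User-Agent header;
# this sketch is case-sensitive, the real plugin may not be).
is_blocked() {
  case "$1" in
    *GPTBot*|*ClaudeBot*|*CCBot*|*PerplexityBot*|*Bytespider*) echo 403 ;;
    *) echo 200 ;;
  esac
}
is_blocked 'Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2; +https://openai.com/gptbot)'  # prints 403
is_blocked 'Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0'               # prints 200
```

Against a live deployment, the equivalent smoke test is curl -s -o /dev/null -w '%{http_code}' -A 'GPTBot/1.0' https://example.com/ (placeholder domain), which should print 403 once the middleware is active.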

4. Application-layer blocking

The most reliable approach with Traefik is to handle User-Agent blocking in your upstream service. Traefik passes the original User-Agent header through unchanged, so your app sees the real value.

// Next.js middleware.ts — works with any Traefik setup
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

const BLOCKED_UA = /GPTBot|ChatGPT-User|ClaudeBot|Claude-Web|anthropic-ai|CCBot|Google-Extended|PerplexityBot|Amazonbot|Bytespider|YouBot|DuckAssistBot|meta-externalagent|MistralAI-Spider|oai-searchbot/i;

export function middleware(request: NextRequest) {
  const ua = request.headers.get('user-agent') ?? '';

  if (BLOCKED_UA.test(ua)) {
    return new NextResponse('Forbidden', { status: 403 });
  }

  return NextResponse.next();
}

export const config = {
  matcher: ['/((?!_next|robots.txt|favicon.ico).*)'],
};

// Express.js middleware (Node.js)
const express = require('express');
const app = express();

const BLOCKED_UA = /GPTBot|ChatGPT-User|ClaudeBot|anthropic-ai|CCBot|Google-Extended|PerplexityBot|Amazonbot|Bytespider/i;

app.use((req, res, next) => {
  if (req.path === '/robots.txt') return next(); // always allow robots.txt
  if (BLOCKED_UA.test(req.get('user-agent') || '')) {
    return res.status(403).send('Forbidden');
  }
  next();
});

5. Kubernetes — IngressRoute + Middleware

In Kubernetes, Traefik uses Custom Resource Definitions (CRDs). Define a Middleware resource for the headers, then reference it in your IngressRoute.

# middleware.yaml — X-Robots-Tag via Traefik Middleware CRD
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: security-headers
  namespace: default
spec:
  headers:
    customResponseHeaders:
      X-Robots-Tag: "noai, noimageai"
      X-Content-Type-Options: "nosniff"
---
# ingressroute.yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: app-ingress
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`example.com`)
      kind: Rule
      middlewares:
        - name: security-headers    # references Middleware CRD above
      services:
        - name: app-service
          port: 3000
  tls:
    certResolver: le

Middleware CRDs are namespaced. To reference a middleware in another namespace, set the namespace field next to name in the IngressRoute's middlewares list, and enable the kubernetescrd provider's allowCrossNamespace option (it is off by default). The namespace-name@kubernetescrd form is used when attaching middlewares via Ingress annotations rather than IngressRoute CRDs.
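A sketch of a cross-namespace reference, assuming the middleware lives in a hypothetical traefik-system namespace (this also requires the kubernetescrd provider's allowCrossNamespace option to be enabled):

```yaml
# ingressroute-cross-ns.yaml — IngressRoute referencing a Middleware
# defined in a different namespace (traefik-system is a placeholder)
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: app-ingress
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`example.com`)
      kind: Rule
      middlewares:
        - name: security-headers
          namespace: traefik-system   # namespace of the Middleware CRD
      services:
        - name: app-service
          port: 3000
  tls:
    certResolver: le
```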

6. Complete traefik.yml (static config)

Static config defines entrypoints, providers, and certificate resolvers. It does not define routers, services, or middlewares — those live in dynamic config (Docker labels or file provider).

# traefik.yml (static config — requires restart to change)

api:
  dashboard: false   # Disable in production

entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entrypoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"
    http3: {}        # Enable HTTP/3

certificatesResolvers:
  le:
    acme:
      email: admin@example.com
      storage: /letsencrypt/acme.json
      tlsChallenge: {}

providers:
  docker:
    exposedByDefault: false
    network: proxy           # Only route services on this Docker network
  file:
    filename: /etc/traefik/dynamic.yml
    watch: true              # Hot-reload on file changes

log:
  level: INFO

accessLog:
  filePath: /var/log/traefik/access.log
  bufferingSize: 100

7. Full Docker Compose example

Complete Docker Compose stack: Traefik v3 with automatic HTTPS, security headers middleware, and an app service.

# docker-compose.yml
networks:
  proxy:
    external: true   # Create with: docker network create proxy

services:
  traefik:
    image: traefik:v3
    restart: unless-stopped
    networks:
      - proxy
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"   # HTTP/3
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./traefik.yml:/etc/traefik/traefik.yml:ro
      - ./dynamic.yml:/etc/traefik/dynamic.yml:ro
      - ./letsencrypt:/letsencrypt
    labels:
      - "traefik.enable=true"

  app:
    image: myapp:latest
    restart: unless-stopped
    networks:
      - proxy
    # No ports exposed — Traefik handles all ingress
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.app.rule=Host(`example.com`)"
      - "traefik.http.routers.app.entrypoints=websecure"
      - "traefik.http.routers.app.tls.certresolver=le"
      - "traefik.http.routers.app.middlewares=security-headers@file"
      - "traefik.http.services.app.loadbalancer.server.port=3000"

# dynamic.yml (watched by Traefik — no restart needed on changes)
http:
  middlewares:
    security-headers:
      headers:
        customResponseHeaders:
          X-Robots-Tag: "noai, noimageai"
          X-Frame-Options: "SAMEORIGIN"
          X-Content-Type-Options: "nosniff"
        # Force HTTPS in browser for 1 year:
        stsSeconds: 31536000
        stsIncludeSubdomains: true

Frequently asked questions

How do I block bots by User-Agent in Traefik?

Traefik has no built-in UA-blocking middleware. Options: (1) a Traefik plugin from the Plugin Catalog, (2) application-layer middleware in your upstream (Next.js middleware.ts, Express middleware, etc.), (3) an nginx sidecar in front of your service with a map $http_user_agent $bad_bot block. For most setups, app-layer blocking (option 2) is the simplest.

How do I add response headers in Traefik?

Use the Headers middleware with customResponseHeaders. Via Docker labels: traefik.http.middlewares.hdr.headers.customresponseheaders.X-Robots-Tag=noai, noimageai. Then attach to a router: traefik.http.routers.app.middlewares=hdr@docker. The @docker suffix is required.

What is static vs dynamic config in Traefik?

Static config (traefik.yml) defines entrypoints, providers, and certificate resolvers — requires a restart to change. Dynamic config (Docker labels, file provider YAML, Kubernetes CRDs) defines routers, services, and middlewares — updated live with no restart. Always put bot-blocking middleware in dynamic config.

How do I serve robots.txt with Traefik?

Traefik doesn't serve static files. Options: (1) serve from your upstream app (Next.js public/robots.txt, nginx root), (2) add a minimal nginx sidecar container with a router rule Host(`example.com`) && Path(`/robots.txt`) pointing to it. Most apps already serve robots.txt natively, so a sidecar is usually unnecessary.

Does Traefik work with Kubernetes for bot blocking?

Yes. Use Traefik's IngressRoute CRD with a Middleware CRD for customResponseHeaders. For hard UA blocking, use an application-layer middleware (e.g. Next.js middleware.ts) or a Kubernetes NetworkPolicy + WAF solution.

What is the @file vs @docker suffix in Traefik middleware references?

The suffix indicates which provider defined the middleware. @docker — defined via Docker labels; @file — defined in a YAML/TOML file provider; @kubernetescrd — Kubernetes Middleware CRD. Always include the suffix in router middleware references — omitting it causes middleware not found errors.
