How to Block AI Bots on Traefik: Complete 2026 Guide
Traefik is a cloud-native reverse proxy popular in Docker Compose and Kubernetes deployments. Its bot-blocking model is different from nginx and Apache: Traefik has no built-in User-Agent blocking middleware, so the strategy is to combine Traefik's Headers middleware (for X-Robots-Tag), robots.txt served from your upstream app, and either a Traefik plugin or your application-layer middleware for hard UA blocking.
Traefik has no built-in User-Agent blocking
Unlike nginx (map $http_user_agent) and Apache (mod_rewrite), Traefik's middleware library does not include a UA-matching block. Your options: (1) a Traefik plugin from the Plugin Catalog (e.g. traefik-plugin-bot-blocker), (2) application middleware in your upstream service (Next.js middleware.ts, Express middleware, etc.), (3) an nginx sidecar between Traefik and your service. X-Robots-Tag can be set directly in Traefik, and robots.txt can be routed through Traefik to whichever container serves the file.
Methods at a glance
| Method | What it does | Where |
|---|---|---|
| robots.txt (via upstream app) | Signals bots which paths are off-limits | Your app / nginx sidecar |
| Headers middleware | Adds X-Robots-Tag to all responses | Traefik dynamic config |
| Plugin middleware | Hard UA block at Traefik layer | Traefik plugin catalog |
| App-layer middleware | Hard UA block in your service | Next.js / Express / nginx |
| IPAllowList middleware | Allows only listed IP ranges and rejects everything else (no UA needed) | Traefik dynamic config |
| Cloudflare WAF | Rule-based UA + rate limiting | Cloudflare dashboard |
1. robots.txt — serve from upstream
Traefik is a proxy — it doesn't serve static files. Serve robots.txt from your upstream application. For a dedicated static file, add a minimal nginx container to your Docker Compose stack and route /robots.txt requests to it:
# docker-compose.yml: robots.txt via nginx sidecar
services:
  traefik:
    image: traefik:v3
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.le.acme.tlschallenge=true
      - --certificatesresolvers.le.acme.email=admin@example.com
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./letsencrypt:/letsencrypt

  # Serves robots.txt at /robots.txt
  robots:
    image: nginx:alpine
    volumes:
      - ./robots.txt:/usr/share/nginx/html/robots.txt:ro
      - ./nginx-robots.conf:/etc/nginx/conf.d/default.conf:ro
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.robots.rule=Host(`example.com`) && Path(`/robots.txt`)"
      - "traefik.http.routers.robots.entrypoints=websecure"
      - "traefik.http.routers.robots.tls.certresolver=le"
      - "traefik.http.services.robots.loadbalancer.server.port=80"

  app:
    image: myapp:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app.rule=Host(`example.com`)"
      - "traefik.http.routers.app.entrypoints=websecure"
      - "traefik.http.routers.app.tls.certresolver=le"
      - "traefik.http.services.app.loadbalancer.server.port=3000"

# nginx-robots.conf: minimal config for the robots sidecar
server {
    listen 80;

    location = /robots.txt {
        root /usr/share/nginx/html;
        access_log off;
    }
}

If your upstream app already serves /robots.txt (e.g. Next.js public/robots.txt), skip the sidecar entirely.
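Both routers in the Compose file match a request for /robots.txt, because the app router's Host rule covers every path. Traefik resolves the overlap with router priority, which by default equals the length of the rule string, so the longer Host && Path rule wins. The sketch below models only that default; the Router type and the pickRouter function are hypothetical names for illustration, not Traefik APIs.

```typescript
// Model of Traefik's default router priority: with no explicit priority,
// the longest matching rule string wins. Names here are illustrative only.
interface Router {
  name: string;
  rule: string;
  matches: (host: string, path: string) => boolean;
}

const routers: Router[] = [
  {
    name: "app",
    rule: "Host(`example.com`)",
    matches: (host) => host === "example.com",
  },
  {
    name: "robots",
    rule: "Host(`example.com`) && Path(`/robots.txt`)",
    matches: (host, path) => host === "example.com" && path === "/robots.txt",
  },
];

// Returns the name of the router that would handle the request, if any.
function pickRouter(host: string, path: string): string | undefined {
  const candidates = routers
    .filter((r) => r.matches(host, path))
    .sort((a, b) => b.rule.length - a.rule.length); // longest rule first
  return candidates[0]?.name;
}
```

If the ordering ever needs to be explicit rather than implied by rule length, a router also accepts a priority option, for example traefik.http.routers.robots.priority=100 as a Docker label.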
2. X-Robots-Tag — Headers middleware
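Traefik's Headers middleware adds, removes, or modifies HTTP headers on requests and responses. Its customResponseHeaders option injects headers into every response from the upstream. The noai and noimageai values used below are non-standard, advisory directives: a crawler has to choose to honor them, so treat the header as a signal rather than enforcement.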
Docker Compose labels:
services:
  app:
    image: myapp:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app.rule=Host(`example.com`)"
      - "traefik.http.routers.app.entrypoints=websecure"
      - "traefik.http.routers.app.tls.certresolver=le"
      - "traefik.http.routers.app.middlewares=security-headers@docker"
      - "traefik.http.services.app.loadbalancer.server.port=3000"
      # Headers middleware: adds X-Robots-Tag to all responses
      - "traefik.http.middlewares.security-headers.headers.customresponseheaders.X-Robots-Tag=noai, noimageai"
      - "traefik.http.middlewares.security-headers.headers.customresponseheaders.X-Content-Type-Options=nosniff"

Dynamic file config (dynamic.yml with file provider):
# dynamic.yml: loaded by Traefik file provider
http:
  middlewares:
    security-headers:
      headers:
        customResponseHeaders:
          X-Robots-Tag: "noai, noimageai"
          X-Content-Type-Options: "nosniff"
          X-Frame-Options: "SAMEORIGIN"

  routers:
    app:
      rule: "Host(`example.com`)"
      entryPoints:
        - websecure
      middlewares:
        - security-headers
      service: app-service
      tls:
        certResolver: le

  services:
    app-service:
      loadBalancer:
        servers:
          - url: "http://app:3000"

Provider suffix in middleware references
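A middleware reference can carry a provider suffix that says where the middleware was defined: security-headers@docker (Docker labels), security-headers@file (YAML or TOML file provider), default-security-headers@kubernetescrd (a Kubernetes Middleware CRD, prefixed with its namespace). The suffix is required whenever a router references a middleware from a different provider, for example a Docker-labelled router using a middleware defined in dynamic.yml. Within the same provider it is optional, because Traefik resolves an unqualified name against the router's own provider. A missing suffix on a cross-provider reference causes middleware not found errors at runtime.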
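One way to avoid guessing is to generate router labels with every reference fully qualified. The following is a minimal sketch under that assumption; MiddlewareDef, middlewareRef, and routerMiddlewaresLabel are hypothetical helper names, and only the Docker and file providers are modelled.

```typescript
// Hypothetical helpers that always emit "name@provider" references,
// so a router label never relies on implicit provider resolution.
type Provider = "docker" | "file";

interface MiddlewareDef {
  name: string;
  definedIn: Provider; // provider where the middleware is declared
}

// Builds the qualified reference, e.g. "security-headers@file".
function middlewareRef(m: MiddlewareDef): string {
  return `${m.name}@${m.definedIn}`;
}

// Builds the Docker label that attaches a middleware chain to a router.
// Multiple middlewares are comma-separated and run in the order listed.
function routerMiddlewaresLabel(router: string, chain: MiddlewareDef[]): string {
  const refs = chain.map(middlewareRef).join(",");
  return `traefik.http.routers.${router}.middlewares=${refs}`;
}
```

For example, a chain of block-ai-bots (from the file provider) followed by security-headers (from Docker labels) on the app router yields traefik.http.routers.app.middlewares=block-ai-bots@file,security-headers@docker.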
3. User-Agent blocking — plugin middleware
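Traefik's plugin system (Traefik Plugin Catalog) lets you install community middleware plugins, which can block requests by User-Agent before they reach your upstream. The example below uses a plugin named traefik-plugin-bot-blocker. Treat the module path, version, and option names as placeholders and confirm them against the catalog entry of the plugin you actually install.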
# traefik.yml (static config): enable the plugin
experimental:
  plugins:
    bot-blocker:
      moduleName: "github.com/bots-garden/traefik-plugin-bot-blocker"
      version: "v0.1.0"

entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"

providers:
  docker:
    exposedByDefault: false
  file:
    filename: /etc/traefik/dynamic.yml

# dynamic.yml: configure the plugin middleware
http:
  middlewares:
    block-ai-bots:
      plugin:
        bot-blocker:
          # List of User-Agent substrings to block
          botUserAgents:
            - "GPTBot"
            - "ChatGPT-User"
            - "ClaudeBot"
            - "Claude-Web"
            - "anthropic-ai"
            - "CCBot"
            # Google-Extended is a robots.txt control token, not a request
            # User-Agent, so this entry never matches; list it in robots.txt.
            - "Google-Extended"
            - "PerplexityBot"
            - "Amazonbot"
            - "Bytespider"
            - "YouBot"
            - "DuckAssistBot"
            - "meta-externalagent"
            - "MistralAI-Spider"
            - "oai-searchbot"
          blockedStatusCode: 403

  routers:
    app:
      rule: "Host(`example.com`)"
      entryPoints:
        - websecure
      middlewares:
        - block-ai-bots
        - security-headers
      service: app-service
      tls:
        certResolver: le

Plugin availability depends on the Traefik version and whether the plugin is in the official catalog. Check plugins.traefik.io for the current plugin list and versions.
4. Application-layer blocking
The most reliable approach with Traefik is to handle User-Agent blocking in your upstream service. Traefik passes the original User-Agent header through unchanged, so your app sees the real value.
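The matching step itself does not depend on any framework. As a minimal sketch (isBlockedUserAgent and decide are hypothetical names, and the substring list is an abbreviated subset of the blocklists in this guide), it reduces to a case-insensitive substring test plus an exemption for /robots.txt:

```typescript
// Abbreviated subset of the blocklists used in this guide.
const BLOCKED_SUBSTRINGS: readonly string[] = [
  "GPTBot",
  "ClaudeBot",
  "CCBot",
  "PerplexityBot",
  "Bytespider",
];

// True when the User-Agent contains any blocked substring, ignoring case.
function isBlockedUserAgent(userAgent: string | null | undefined): boolean {
  const ua = (userAgent ?? "").toLowerCase();
  return BLOCKED_SUBSTRINGS.some((s) => ua.includes(s.toLowerCase()));
}

// Hypothetical decision helper: /robots.txt stays reachable so crawlers
// can still read the rules; every other path is checked against the list.
function decide(path: string, userAgent: string | null | undefined): "block" | "pass" {
  if (path === "/robots.txt") return "pass";
  return isBlockedUserAgent(userAgent) ? "block" : "pass";
}
```

Keeping the check in a pure function makes it easy to unit-test separately from the server. The Next.js and Express versions in this section wire the same check into each framework's request pipeline.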
// Next.js middleware.ts: works with any Traefik setup
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

// Google-Extended is a robots.txt control token, not a request User-Agent,
// so it never matches here; it is kept only to mirror the robots.txt list.
const BLOCKED_UA = /GPTBot|ChatGPT-User|ClaudeBot|Claude-Web|anthropic-ai|CCBot|Google-Extended|PerplexityBot|Amazonbot|Bytespider|YouBot|DuckAssistBot|meta-externalagent|MistralAI-Spider|oai-searchbot/i;

export function middleware(request: NextRequest) {
  const ua = request.headers.get('user-agent') ?? '';
  if (BLOCKED_UA.test(ua)) {
    return new NextResponse('Forbidden', { status: 403 });
  }
  return NextResponse.next();
}

// Skip Next.js internals, robots.txt, and the favicon
export const config = {
  matcher: ['/((?!_next|robots.txt|favicon.ico).*)'],
};

// Express.js middleware (Node.js)
const express = require('express');
const app = express();

const BLOCKED_UA = /GPTBot|ChatGPT-User|ClaudeBot|Claude-Web|anthropic-ai|CCBot|Google-Extended|PerplexityBot|Amazonbot|Bytespider|YouBot|DuckAssistBot|meta-externalagent|MistralAI-Spider|oai-searchbot/i;

app.use((req, res, next) => {
  if (req.path === '/robots.txt') return next(); // always allow
  if (BLOCKED_UA.test(req.get('user-agent') || '')) {
    return res.status(403).send('Forbidden');
  }
  next();
});

5. Kubernetes — IngressRoute + Middleware
In Kubernetes, Traefik uses Custom Resource Definitions (CRDs). Define a Middleware resource for the headers, then reference it in your IngressRoute.
# middleware.yaml: X-Robots-Tag via Traefik Middleware CRD
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: security-headers
  namespace: default
spec:
  headers:
    customResponseHeaders:
      X-Robots-Tag: "noai, noimageai"
      X-Content-Type-Options: "nosniff"
---
# ingressroute.yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: app-ingress
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`example.com`)
      kind: Rule
      middlewares:
        - name: security-headers # references Middleware CRD above
      services:
        - name: app-service
          port: 3000
  tls:
    certResolver: le

Middleware CRDs are namespaced. To use a middleware from a different namespace, add a namespace field next to name in the IngressRoute's middlewares entry and enable allowCrossNamespace on the Kubernetes CRD provider, which is off by default. The <namespace>-<name>@kubernetescrd format (e.g. default-security-headers@kubernetescrd) is for references made from other providers, such as an Ingress annotation.
6. Complete traefik.yml (static config)
Static config defines entrypoints, providers, and certificate resolvers. It does not define routers, services, or middlewares — those live in dynamic config (Docker labels or file provider).
# traefik.yml (static config, requires restart to change)
api:
  dashboard: false # Disable in production

entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"
    http3: {} # Enable HTTP/3

certificatesResolvers:
  le:
    acme:
      email: admin@example.com
      storage: /letsencrypt/acme.json
      tlsChallenge: {}

providers:
  docker:
    exposedByDefault: false
    network: proxy # Default Docker network Traefik uses to reach containers
  file:
    filename: /etc/traefik/dynamic.yml
    watch: true # Hot-reload on file changes

log:
  level: INFO

accessLog:
  filePath: /var/log/traefik/access.log
  bufferingSize: 100

7. Full Docker Compose example
Complete Docker Compose stack: Traefik v3 with automatic HTTPS, security headers middleware, and an app service.
# docker-compose.yml
networks:
  proxy:
    external: true # Create with: docker network create proxy

services:
  traefik:
    image: traefik:v3
    restart: unless-stopped
    networks:
      - proxy
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp" # HTTP/3
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./traefik.yml:/etc/traefik/traefik.yml:ro
      - ./dynamic.yml:/etc/traefik/dynamic.yml:ro
      - ./letsencrypt:/letsencrypt
    labels:
      - "traefik.enable=true"

  app:
    image: myapp:latest
    restart: unless-stopped
    networks:
      - proxy
    # No ports exposed: Traefik handles all ingress
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=proxy"
      - "traefik.http.routers.app.rule=Host(`example.com`)"
      - "traefik.http.routers.app.entrypoints=websecure"
      - "traefik.http.routers.app.tls.certresolver=le"
      - "traefik.http.routers.app.middlewares=security-headers@file"
      - "traefik.http.services.app.loadbalancer.server.port=3000"

# dynamic.yml (watched by Traefik, no restart needed on changes)
http:
  middlewares:
    security-headers:
      headers:
        customResponseHeaders:
          X-Robots-Tag: "noai, noimageai"
          X-Frame-Options: "SAMEORIGIN"
          X-Content-Type-Options: "nosniff"
        # Force HTTPS in browser for 1 year:
        stsSeconds: 31536000
        stsIncludeSubdomains: true

Frequently asked questions
How do I block bots by User-Agent in Traefik?
Traefik has no built-in UA-blocking middleware. Options: (1) a Traefik plugin from the Plugin Catalog, (2) application-layer middleware in your upstream (Next.js middleware.ts, Express middleware, etc.), (3) an nginx sidecar in front of your service with a map $http_user_agent $bad_bot block. For most setups, app-layer blocking (option 2) is the simplest.
How do I add response headers in Traefik?
Use the Headers middleware with customResponseHeaders. Via Docker labels: traefik.http.middlewares.hdr.headers.customresponseheaders.X-Robots-Tag=noai, noimageai. Then attach it to a router: traefik.http.routers.app.middlewares=hdr@docker. The @docker suffix is optional when the router is also defined via Docker labels and required when the router is defined by another provider.
What is static vs dynamic config in Traefik?
Static config (traefik.yml) defines entrypoints, providers, and certificate resolvers, and requires a restart to change. Dynamic config (Docker labels, file provider YAML, Kubernetes CRDs) defines routers, services, and middlewares, and is updated live with no restart. Bot-blocking middleware therefore always lives in dynamic config; only the plugin declaration, if you use one, goes in static config.
How do I serve robots.txt with Traefik?
Traefik doesn't serve static files. Options: (1) serve from your upstream app (Next.js public/robots.txt, nginx root), (2) add a minimal nginx sidecar container with a router rule Host(`example.com`) && Path(`/robots.txt`) pointing to it. Most apps already serve robots.txt natively, so a sidecar is usually unnecessary.
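Whichever option serves the file, its content is a list of User-agent groups. The sketch below generates one from a token list; buildRobotsTxt is a hypothetical helper name, and disallowing the whole site for each listed token is an assumed policy. Tokens that exist only for robots.txt control, such as Google-Extended, belong here rather than in a User-Agent blocklist.

```typescript
// Hypothetical generator for a robots.txt that disallows the whole site
// for each listed crawler token and leaves all other crawlers unrestricted.
function buildRobotsTxt(blockedAgents: readonly string[]): string {
  const groups = blockedAgents.map(
    (agent) => `User-agent: ${agent}\nDisallow: /`,
  );
  // An empty Disallow value means "nothing is disallowed" for everyone else.
  groups.push("User-agent: *\nDisallow:");
  return groups.join("\n\n") + "\n";
}
```

The returned string can be written to the robots.txt file mounted into the sidecar or placed in the app's public directory.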
Does Traefik work with Kubernetes for bot blocking?
Yes. Use Traefik's IngressRoute CRD with a Middleware CRD for customResponseHeaders. For hard UA blocking, use application-layer middleware (e.g. Next.js middleware.ts) or a WAF in front of the cluster. A Kubernetes NetworkPolicy alone cannot do this, because it filters by IP and port rather than by HTTP headers.
What is the @file vs @docker suffix in Traefik middleware references?
The suffix indicates which provider defined the middleware: @docker for Docker labels, @file for a YAML/TOML file provider, @kubernetescrd for a Kubernetes Middleware CRD. Include the suffix whenever a router references a middleware from a different provider; omitting it there causes middleware not found errors. When router and middleware come from the same provider, the suffix is optional.