How to Block AI Bots on Sanic (Python): Complete 2026 Guide

Sanic is Python's async-native web framework: it does not run on WSGI, it is built around async handlers, and it runs fast on its own event loop. Unlike Falcon (exception-based blocking via raise HTTPForbidden()) and Starlette/FastAPI (ASGI middleware class), Sanic uses a decorator-based, return-to-block pattern: return an HTTPResponse from an @app.on_request handler to stop the chain.

Return-based blocking — not exception-based

Sanic request middleware blocks by returning an HTTPResponse. Return None (implicitly or explicitly) to continue to the route handler. This is the same pattern as Flask's before_request(), but async. Unlike Falcon, you do not raise an exception to block.
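The return-to-block contract can be modeled without Sanic at all. The sketch below is a simplified stand-in for the request phase, not Sanic's actual dispatcher: run_request_chain, block_bots, and handler are hypothetical names, a plain dict stands in for the request, and a (status, body) tuple stands in for HTTPResponse.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Stand-in for HTTPResponse: a (status, body) tuple.
Response = tuple[int, str]
Middleware = Callable[[dict], Awaitable[Optional[Response]]]


async def block_bots(request: dict) -> Optional[Response]:
    """Return a response to stop the chain, or None to continue."""
    if "gptbot" in request.get("user-agent", "").lower():
        return (403, "Forbidden")
    return None


async def handler(request: dict) -> Response:
    return (200, "Hello!")


async def run_request_chain(request: dict, middleware: list[Middleware]) -> Response:
    # The first middleware that returns a non-None value wins;
    # the route handler runs only if every middleware returned None.
    for mw in middleware:
        response = await mw(request)
        if response is not None:
            return response
    return await handler(request)


blocked = asyncio.run(run_request_chain({"user-agent": "GPTBot/1.0"}, [block_bots]))
allowed = asyncio.run(run_request_chain({"user-agent": "Mozilla/5.0"}, [block_bots]))
```

Here blocked carries the 403 from the middleware and the handler is never awaited, while allowed comes from the handler because the middleware returned None.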

Protection layers

1. robots.txt: app.static() serves /robots.txt, and the blocker exempts that path so bots can always read the rules
2. noai meta tag: request.ctx.robots is set in the on_request middleware and read by the Jinja2 base template
3. X-Robots-Tag header: @app.on_response adds X-Robots-Tag to outgoing responses
4. Hard 403 block: return text("Forbidden", status=403) from on_request so the route handler never runs

Layer 1: robots.txt

Use app.static() to serve /robots.txt. A static route is an ordinary route as far as middleware is concerned, so on_request handlers still run for it; the blocker in the next section lists /robots.txt in EXEMPT_PATHS so crawlers can always fetch it:

# static/robots.txt

User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: Google-Extended
User-agent: CCBot
User-agent: cohere-ai
User-agent: Bytespider
User-agent: Amazonbot
User-agent: PerplexityBot
User-agent: YouBot
User-agent: Diffbot
User-agent: DeepSeekBot
User-agent: MistralBot
User-agent: xAI-Bot
User-agent: AI2Bot
Disallow: /

# server.py
from sanic import Sanic

app = Sanic("myapp")

# Static files are registered as ordinary routes
app.static('/robots.txt', './static/robots.txt')
app.static('/sitemap.xml', './static/sitemap.xml')

# Registration order between static routes and middleware does not affect
# routing; request middleware still runs for these paths, so exempt them there
@app.on_request
async def block_ai_bots(request):
    ...

Sanic registers static paths as regular routes, so request middleware is still called for them. Exempt /robots.txt inside the handler (see EXEMPT_PATHS in the middleware module) rather than relying on registration order.
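If the bot list changes often, the robots.txt file can be generated instead of edited by hand. The following is a minimal sketch under that assumption; build_robots_txt and AI_BOT_TOKENS are hypothetical names, and the token list is abbreviated for the example.

```python
AI_BOT_TOKENS = ["GPTBot", "ClaudeBot", "CCBot"]  # abbreviated for the example


def build_robots_txt(blocked_agents: list[str]) -> str:
    """Render a robots.txt that allows everyone except the listed agents."""
    lines = ["User-agent: *", "Allow: /", ""]
    # Consecutive User-agent lines form one group sharing the rules below them.
    lines += [f"User-agent: {agent}" for agent in blocked_agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(AI_BOT_TOKENS)
```

The resulting string could be written to static/robots.txt at build or deploy time, so the file served by app.static() stays in step with the list used elsewhere.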

Layers 2, 3 & 4: middleware module

Create middleware/ai_bot_blocker.py with two handlers — one for request phase (block + set context), one for response phase (inject header):

# middleware/ai_bot_blocker.py
from sanic.request import Request
from sanic.response import HTTPResponse, text

AI_BOTS = [
    'gptbot', 'chatgpt-user', 'claudebot', 'anthropic-ai',
    'ccbot', 'cohere-ai', 'bytespider', 'amazonbot',
    'applebot-extended', 'perplexitybot', 'youbot', 'diffbot',
    'google-extended', 'deepseekbot', 'mistralbot', 'xai-bot',
    'ai2bot', 'oai-searchbot', 'duckassistbot',
]

EXEMPT_PATHS = {'/robots.txt', '/sitemap.xml', '/favicon.ico'}


async def block_ai_bots(request: Request) -> HTTPResponse | None:
    """on_request handler — return a response to block, None to pass through."""
    # Always set noai meta context for templates
    request.ctx.robots = 'noai, noimageai'

    # Exempt paths bypass bot blocking (robots.txt must always be accessible)
    if request.path in EXEMPT_PATHS:
        return None

    ua = request.headers.get('user-agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return text('Forbidden: AI crawlers are not permitted.', status=403)

    # Return None to continue to the route handler
    return None


async def inject_robots_tag(request: Request, response: HTTPResponse) -> HTTPResponse | None:
    """on_response handler — add X-Robots-Tag to every outgoing response."""
    response.headers['X-Robots-Tag'] = 'noai, noimageai'
    return None  # Return None to keep the existing response

Register both handlers in your app:

# server.py
from sanic import Sanic
from middleware.ai_bot_blocker import block_ai_bots, inject_robots_tag

app = Sanic("myapp")

app.static('/robots.txt', './static/robots.txt')

# Register middleware
app.on_request(block_ai_bots)
app.on_response(inject_robots_tag)

# Alternatively, use the decorator form:
# @app.on_request
# async def block_ai_bots(request): ...
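The substring loop in block_ai_bots can also be expressed as a single precompiled pattern, which keeps the matching logic in one testable function. This is an optional variation, not part of the module above: is_ai_bot and AI_BOT_PATTERN are hypothetical names, and the list is abbreviated.

```python
import re
from typing import Optional

AI_BOTS = ["gptbot", "claudebot", "ccbot", "anthropic-ai"]  # abbreviated

# re.escape guards tokens that contain regex metacharacters.
AI_BOT_PATTERN = re.compile(
    "|".join(re.escape(bot) for bot in AI_BOTS),
    re.IGNORECASE,
)


def is_ai_bot(user_agent: Optional[str]) -> bool:
    """True if any known AI-crawler token appears in the User-Agent."""
    return bool(user_agent) and AI_BOT_PATTERN.search(user_agent) is not None
```

Inside block_ai_bots, the any(...) check could then become if is_ai_bot(request.headers.get('user-agent')):, with the 403 response returned exactly as before.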

Layer 2: noai meta tag

request.ctx is Sanic's per-request context object — a SimpleNamespace scoped to the current request. It is the correct way to pass data from middleware to route handlers and templates (equivalent to Flask's g, but request-scoped and async-safe):

# In block_ai_bots handler (already set above):
request.ctx.robots = 'noai, noimageai'

# In a Jinja2 template (via sanic-ext or sanic-jinja2):
# base.html
# <meta name="robots" content="{{ request.ctx.robots | default('noai, noimageai') }}">

# Route handler — override per-page if needed:
from sanic_ext import render  # assumes sanic-ext templating is enabled

@app.route('/public-page')
async def public_page(request):
    request.ctx.robots = 'index, follow'  # Override for public content
    return await render('page.html', context={'request': request})

request.ctx vs app.ctx
request.ctx is per-request: it is scoped to one HTTP request and starts empty each time. app.ctx is application-level: it lives for the app lifetime (like a global store). Always use request.ctx for per-request data.
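The difference in scope can be illustrated with plain SimpleNamespace objects. This sketch only models the two contexts; app_ctx, new_request_ctx, and default_robots are hypothetical names, not Sanic APIs.

```python
from types import SimpleNamespace

# Modeled app-level context: created once, shared for the app lifetime.
app_ctx = SimpleNamespace(default_robots="noai, noimageai")


def new_request_ctx() -> SimpleNamespace:
    """Each request starts with its own empty namespace."""
    return SimpleNamespace()


first = new_request_ctx()
first.robots = "index, follow"   # per-page override on one request

second = new_request_ctx()       # a later request sees none of that

# Fall back to the app-level default when the request set nothing.
first_robots = getattr(first, "robots", app_ctx.default_robots)
second_robots = getattr(second, "robots", app_ctx.default_robots)
```

The override on the first namespace does not leak into the second, which is why per-request values belong on request.ctx rather than on app.ctx.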

Blueprint-scoped middleware

Use Blueprint middleware to restrict blocking to specific route groups (e.g., only your API routes). Blueprint on_request runs in addition to any app-level middleware:

from sanic import Sanic, Blueprint
from sanic.response import text

app = Sanic("myapp")

# --- API Blueprint with bot blocking ---
api = Blueprint('api', url_prefix='/api')

AI_BOTS = ['gptbot', 'claudebot', 'ccbot', 'anthropic-ai']  # abbreviated; use the full list from the middleware module

@api.on_request
async def block_bots_on_api(request):
    ua = request.headers.get('user-agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return text('Forbidden', status=403)

@api.route('/data')
async def api_data(request):
    return text('{"results": [...]}')

# --- Public Blueprint — no bot blocking ---
web = Blueprint('web', url_prefix='/')

@web.route('/')
async def index(request):
    return text('Hello!')  # Bots can access this

app.blueprint(api)
app.blueprint(web)

Blueprint middleware runs after app-level middleware. Registration order within a Blueprint matters — handlers are called in the order they are registered.
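Where Blueprints are not in use, a similar scoping effect can be approximated in an app-level handler by checking the path prefix first. The sketch below is an alternative under that assumption, not the Blueprint mechanism itself; should_block and PROTECTED_PREFIXES are hypothetical names.

```python
PROTECTED_PREFIXES = ("/api/",)   # hypothetical: URL prefixes to guard
AI_BOTS = ("gptbot", "claudebot", "ccbot", "anthropic-ai")  # abbreviated


def should_block(path: str, user_agent: str) -> bool:
    """Block only when the path is protected AND the UA matches a known bot."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    if not path.startswith(PROTECTED_PREFIXES):
        return False
    ua = user_agent.lower()
    return any(bot in ua for bot in AI_BOTS)
```

An on_request handler could call should_block(request.path, request.headers.get('user-agent', '')) and return the 403 only when it is true. Blueprint middleware remains the tidier option when the route groups already exist.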

Middleware execution order

Sanic middleware executes in FIFO order — first registered runs first. This differs from Starlette's LIFO add_middleware():

app.on_request(block_ai_bots)   # runs FIRST for requests
app.on_request(log_requests)    # runs SECOND for requests

# For responses, on_response runs in REVERSE order (LIFO)
app.on_response(inject_robots_tag)  # runs SECOND for responses
app.on_response(log_responses)      # runs FIRST for responses

FIFO for requests, LIFO for responses
Sanic request middleware fires in registration order (first in, first out). Response middleware fires in reverse registration order (last in, first out). Always register your bot blocker as the first on_request handler.
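The ordering rule is easy to check with a small model. This is only a sketch of the rule as stated above, not Sanic's scheduler; run_order is a hypothetical helper that works on handler names.

```python
def run_order(request_mw: list[str], response_mw: list[str]) -> list[str]:
    """Return handler names in the order they would fire for one request."""
    order = list(request_mw)              # requests: registration order (FIFO)
    order.append("route_handler")
    order.extend(reversed(response_mw))   # responses: reverse registration (LIFO)
    return order


order = run_order(
    ["block_ai_bots", "log_requests"],
    ["inject_robots_tag", "log_responses"],
)
```

With the registrations from the example, order lists block_ai_bots first and inject_robots_tag last, so the header is added after every other response handler has run.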

Older Sanic syntax (pre-21.12)

If you're on Sanic older than 21.12, use @app.middleware('request') and @app.middleware('response'). The handler signatures are identical:

# Sanic < 21.12 — @app.middleware decorator
@app.middleware('request')
async def block_ai_bots(request):
    request.ctx.robots = 'noai, noimageai'
    if request.path in EXEMPT_PATHS:
        return
    ua = request.headers.get('user-agent', '').lower()
    if any(bot in ua for bot in AI_BOTS):
        return text('Forbidden', status=403)

@app.middleware('response')
async def inject_robots_tag(request, response):
    response.headers['X-Robots-Tag'] = 'noai, noimageai'

Both syntaxes work in Sanic 21.12+. The on_request / on_response form is preferred for new code.

Sanic vs Flask vs Falcon — blocking comparison

Sanic — async, return response

# sanic on_request middleware (async)
@app.on_request
async def block_bots(request):
    ua = request.headers.get('user-agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return text('Forbidden', status=403)  # return stops chain

Flask — sync, return response from before_request

# flask
@app.before_request
def block_bots():
    ua = request.headers.get('User-Agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return Response('Forbidden', 403)  # return stops chain

Falcon — raise exception

# falcon middleware
def process_request(self, req, resp):
    ua = req.get_header('User-Agent') or ''
    if any(b in ua.lower() for b in AI_BOTS):
        raise falcon.HTTPForbidden()  # raise stops chain

FastAPI / Starlette — ASGI middleware

# starlette BaseHTTPMiddleware
async def dispatch(self, request, call_next):
    ua = request.headers.get('user-agent', '').lower()
    if any(b in ua for b in AI_BOTS):
        return Response('Forbidden', status_code=403)
    return await call_next(request)

Sanic and Flask share the return-based pattern. Falcon is exception-based. Starlette/FastAPI uses an ASGI middleware class with call_next().

Testing

Sanic's test client is provided by the separate sanic-testing package (pip install sanic-testing). With it installed, app.test_client is available as a property:

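import pytest
from sanic import Sanic
from server import create_app  # assumes server.py exposes an app factory

@pytest.fixture
def app():
    Sanic.test_mode = True  # allow the same app name to be created per test
    return create_app()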

def test_blocks_ai_bot(app):
    _, response = app.test_client.get(
        '/articles/test',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    assert response.status == 403

def test_allows_browser(app):
    _, response = app.test_client.get(
        '/articles/test',
        headers={'User-Agent': 'Mozilla/5.0 (compatible)'},
    )
    assert response.status == 200
    assert response.headers.get('X-Robots-Tag') == 'noai, noimageai'

def test_robots_txt_accessible_to_bots(app):
    _, response = app.test_client.get(
        '/robots.txt',
        headers={'User-Agent': 'GPTBot/1.0'},
    )
    # /robots.txt is in EXEMPT_PATHS, so the blocker lets it through
    assert response.status == 200

AI bot User-Agent strings (2026)

GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, CCBot, cohere-ai, Bytespider, Amazonbot, Applebot-Extended, PerplexityBot, YouBot, Diffbot, Google-Extended, FacebookBot, omgili, omgilibot, DeepSeekBot, MistralBot, xAI-Bot, AI2Bot

Store the tokens in lowercase and check each one with bot in ua.lower() for case-insensitive matching in Sanic.
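Because the bot list lives in two places (robots.txt and the middleware), the two can drift apart. The sketch below is one way to detect that, assuming a simple robots.txt made of User-agent groups followed by rules; blocked_agents is a hypothetical helper, not a full robots.txt parser, and both lists are abbreviated.

```python
def blocked_agents(robots_txt: str) -> set[str]:
    """Collect lowercase User-agent tokens from groups containing 'Disallow: /'."""
    blocked: set[str] = set()
    group: list[str] = []
    in_rules = False  # True once the current group has reached its rule lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() == "user-agent":
            if in_rules:                      # a User-agent after rules starts a new group
                group, in_rules = [], False
            group.append(value.lower())
        else:
            in_rules = True
            if field.lower() == "disallow" and value == "/":
                blocked.update(group)
    return blocked


ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /
"""
AI_BOTS = ["gptbot", "claudebot", "ccbot"]  # abbreviated middleware list

# Tokens the middleware blocks but robots.txt never mentions.
missing_from_robots = sorted(set(AI_BOTS) - blocked_agents(ROBOTS_TXT))
```

In this example missing_from_robots contains ccbot, which flags a crawler that would receive a 403 without ever having been told to stay away. Some tokens are expected to differ between the two lists, so treat the result as a prompt for review rather than a hard failure.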
