
How to Block AI Bots on Django

Django has a unique built-in setting — DISALLOWED_USER_AGENTS — that blocks specific bots with zero new files. Combined with a robots.txt view, base template noai tags, and custom middleware, Django gives you complete layered control over AI crawler access.

Quick fix — robots.txt view

Add to views.py and wire in urls.py. No static file configuration needed.

# views.py
from django.http import HttpResponse

def robots_txt(request):
    content = """User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /
"""
    return HttpResponse(content, content_type="text/plain")

# urls.py
from django.urls import path
from . import views

urlpatterns = [
    path("robots.txt", views.robots_txt),
    # ... your other urls
]

All Methods

robots.txt view (Recommended)

Easy · All deployments · views.py + urls.py

A simple view function that returns an HttpResponse with content_type='text/plain'. Wire it in urls.py at path('robots.txt', ...). Works on every Django deployment without static file configuration.

Use TemplateResponse with a templates/robots.txt file for cleaner separation of the bot list from code.

DISALLOWED_USER_AGENTS (Built-in)

Easy · All deployments · settings.py → DISALLOWED_USER_AGENTS

Django's built-in setting processed by CommonMiddleware. Add compiled regex patterns to block specific bots — matching requests are rejected with a 403 before any view runs. Zero new files required.

Requires CommonMiddleware in MIDDLEWARE (default in all Django projects). The block is raised as PermissionDenied, rendered as a 403 Forbidden response.

noai meta tag in base template

Easy · All deployments · templates/base.html

Add <meta name="robots" content="noai, noimageai"> to the base Django template's <head> section. All pages that extend this template inherit the tag automatically.

Use {% block robots_meta %} for per-page override capability.

Custom middleware — hard blocking

Easy · All deployments · yourapp/middleware.py

A Django middleware class that checks the User-Agent in __call__ and returns HttpResponseForbidden (HTTP 403) for matched bots. More control than DISALLOWED_USER_AGENTS — custom response body, request logging, dynamic bot lists.

Register in settings.py MIDDLEWARE list above CommonMiddleware for early interception.

nginx — server-level block

Intermediate · nginx only · nginx server block config

Match AI bot user agents in the nginx config and return 403 before Gunicorn/uWSGI and Django are invoked. Most efficient — zero Python overhead for blocked bots.

Requires access to nginx config (VPS, DigitalOcean, AWS EC2). Not available on Platform-as-a-Service (Heroku, Railway, Render).

django-robots package

Easy · All deployments · pip install django-robots

Third-party package for database-driven robots.txt rules. Manage Allow/Disallow rules via Django admin without code changes. Useful if content editors need to manage crawl rules.

Overkill for simple static AI bot blocking. Adds a database model and admin UI. Best for CMS-backed sites.

Method 1: robots.txt View

The most portable Django approach — a view function at path('robots.txt', ...) that returns an HttpResponse with content_type='text/plain'. Works on Heroku, Railway, Render, VPS — any deployment where Django handles the request.

# views.py
import re
from django.http import HttpResponse

AI_BOTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "ClaudeBot", "anthropic-ai", "Google-Extended",
    "Bytespider", "CCBot", "PerplexityBot",
    "meta-externalagent", "Amazonbot", "Applebot-Extended",
    "xAI-Bot", "DeepSeekBot", "MistralBot", "Diffbot",
    "cohere-ai", "AI2Bot", "Ai2Bot-Dolma", "YouBot",
    "DuckAssistBot", "omgili", "omgilibot",
    "webzio-extended", "gemini-deep-research",
]

def robots_txt(request):
    lines = ["User-agent: *", "Allow: /", ""]

    for bot in AI_BOTS:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]

    lines.append(f"Sitemap: {request.build_absolute_uri('/sitemap.xml')}")

    return HttpResponse("\n".join(lines), content_type="text/plain")

# urls.py (project-level)
from django.urls import path
from yourapp.views import robots_txt

urlpatterns = [
    path("robots.txt", robots_txt, name="robots_txt"),
    # ... rest of your urls
]
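
Run outside Django with a shortened bot list, the same line-building loop shows the exact body the view returns (the sitemap URL below is a placeholder):

```python
# Standalone sketch of the view's line-building loop (shortened bot list;
# the sitemap URL is a placeholder, not tied to any real deployment).
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot"]

lines = ["User-agent: *", "Allow: /", ""]
for bot in AI_BOTS:
    lines += [f"User-agent: {bot}", "Disallow: /", ""]
lines.append("Sitemap: https://example.com/sitemap.xml")

body = "\n".join(lines)
print(body)
```

The result is a standard robots.txt: a global Allow, one Disallow stanza per bot, and the sitemap line last.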

For cleaner separation, use a template instead of a hardcoded string. Create templates/robots.txt (plain text, no HTML — just the robots.txt content) and use TemplateResponse:

# views.py (template approach)
from django.template.response import TemplateResponse

def robots_txt(request):
    return TemplateResponse(
        request,
        "robots.txt",   # templates/robots.txt — plain text, not HTML
        content_type="text/plain",
    )

Method 2: DISALLOWED_USER_AGENTS (Built-in Django)

This is Django's hidden gem for bot blocking. The DISALLOWED_USER_AGENTS setting is processed by django.middleware.common.CommonMiddleware — which is in every Django project's default MIDDLEWARE list. Any request matching a pattern is rejected with a 403 Forbidden response before your view runs. No new files, no new views, no new URLs.

Add to settings.py:

# settings.py
import re

DISALLOWED_USER_AGENTS = [
    re.compile(r"GPTBot"),
    re.compile(r"ChatGPT-User"),
    re.compile(r"OAI-SearchBot"),
    re.compile(r"ClaudeBot"),
    re.compile(r"anthropic-ai"),
    re.compile(r"Google-Extended"),
    re.compile(r"Bytespider"),
    re.compile(r"CCBot"),
    re.compile(r"PerplexityBot"),
    re.compile(r"meta-externalagent"),
    re.compile(r"Amazonbot"),
    re.compile(r"Applebot-Extended"),
    re.compile(r"xAI-Bot"),
    re.compile(r"DeepSeekBot"),
    re.compile(r"MistralBot"),
    re.compile(r"Diffbot"),
    re.compile(r"cohere-ai"),
    re.compile(r"AI2Bot"),
    re.compile(r"Ai2Bot-Dolma"),
    re.compile(r"YouBot"),
    re.compile(r"DuckAssistBot"),
    re.compile(r"omgili"),
    re.compile(r"webzio-extended"),
    re.compile(r"gemini-deep-research"),
]

CommonMiddleware must be in MIDDLEWARE

Verify django.middleware.common.CommonMiddleware is in your MIDDLEWARE list. It's included in all projects created with django-admin startproject by default. Without it, DISALLOWED_USER_AGENTS is silently ignored.

Rejects matches with a 403. Django's CommonMiddleware raises PermissionDenied for DISALLOWED_USER_AGENTS matches, which Django renders through your 403 handler. If you need request logging, a custom response body, or a runtime-updatable bot list, use the custom middleware below instead.
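
Note that these are substring patterns, not anchored matches: CommonMiddleware checks each one with re.search, so a short pattern like r"GPTBot" fires anywhere inside the long User-Agent string a crawler actually sends. A standalone sketch (the UA string is illustrative):

```python
import re

# Same style of patterns as DISALLOWED_USER_AGENTS above (shortened list)
patterns = [re.compile(r"GPTBot"), re.compile(r"ClaudeBot"), re.compile(r"CCBot")]

# A realistic (illustrative) crawler User-Agent: the bot token sits in the
# middle of a much longer string, which is why re.search is the right check.
ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
      "compatible; GPTBot/1.1; +https://openai.com/gptbot")

blocked = any(p.search(ua) for p in patterns)
print(blocked)  # True
```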

Method 3: noai Meta Tag in Base Template

Add the noai and noimageai meta tags to your base Django template — typically templates/base.html. All templates that {% extends "base.html" %} inherit the tag automatically:

{# templates/base.html #}
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>{% block title %}My Site{% endblock %}</title>

    {# Block AI training crawlers on every page #}
    {% block robots_meta %}
    <meta name="robots" content="noai, noimageai">
    {% endblock %}

    {% block extra_head %}{% endblock %}
</head>
<body>
    {% block content %}{% endblock %}
</body>
</html>

To allow AI indexing on specific pages (e.g. a public blog), override the block in the child template:

{# templates/blog/post.html — allow AI to see this page #}
{% extends "base.html" %}

{% block robots_meta %}
<meta name="robots" content="index, follow">
{% endblock %}

Method 4: Custom Middleware (Hard Blocking)

For full control — request logging, runtime-updatable bot lists, a custom response body — write a Django middleware class. Create yourapp/middleware.py:

# yourapp/middleware.py
import re
from django.http import HttpResponseForbidden

BLOCKED_UAS = re.compile(
    r"GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai"
    r"|Google-Extended|Bytespider|CCBot|PerplexityBot"
    r"|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot"
    r"|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|Ai2Bot-Dolma"
    r"|YouBot|DuckAssistBot|omgili|omgilibot|webzio-extended"
    r"|gemini-deep-research",
    re.IGNORECASE,
)

class BlockAiBotsMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        ua = request.META.get("HTTP_USER_AGENT", "")
        if BLOCKED_UAS.search(ua):
            return HttpResponseForbidden("Forbidden")
        return self.get_response(request)

Register in settings.py MIDDLEWARE list, before CommonMiddleware:

# settings.py
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "yourapp.middleware.BlockAiBotsMiddleware",  # add here — early in the list
    "django.middleware.common.CommonMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    # ... rest of middleware
]
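
The __call__ flow above can be sanity-checked without a running server by substituting plain objects for Django's request and response types. FakeRequest and the string responses below are test stand-ins, not Django APIs:

```python
import re

BLOCKED_UAS = re.compile(r"GPTBot|ClaudeBot|CCBot", re.IGNORECASE)  # shortened

class BlockAiBotsMiddleware:
    """Same shape as the middleware above, minus the Django imports."""
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        ua = request.META.get("HTTP_USER_AGENT", "")
        if BLOCKED_UAS.search(ua):
            return "403 Forbidden"  # stand-in for HttpResponseForbidden
        return self.get_response(request)

class FakeRequest:
    """Minimal stand-in for HttpRequest: just the META dict the middleware reads."""
    def __init__(self, ua):
        self.META = {"HTTP_USER_AGENT": ua}

mw = BlockAiBotsMiddleware(lambda request: "200 OK")  # the "view"
print(mw(FakeRequest("Mozilla/5.0; compatible; GPTBot/1.1")))    # 403 Forbidden
print(mw(FakeRequest("Mozilla/5.0 (X11; Linux) Firefox/125.0"))) # 200 OK
```

Because the pattern is compiled with re.IGNORECASE, a lowercase "gptbot" User-Agent is caught as well.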

Method 5: nginx (Server-Level Blocking)

On VPS deployments (DigitalOcean, AWS EC2, Hetzner), Django typically runs behind nginx as a reverse proxy to Gunicorn or uWSGI. Adding a user agent block to nginx means matched bots never reach Python — the most efficient approach:

# /etc/nginx/sites-available/yourproject.conf
server {
    listen 80;
    server_name yourdomain.com;

    # Block AI training crawlers at the edge
    if ($http_user_agent ~* "(GPTBot|ClaudeBot|anthropic-ai|CCBot|Bytespider|Google-Extended|PerplexityBot|Diffbot|DeepSeekBot|MistralBot|cohere-ai|meta-externalagent|Amazonbot|xAI-Bot|AI2Bot|omgili|webzio-extended|gemini-deep-research|OAI-SearchBot|ChatGPT-User)") {
        return 403;
    }

    location / {
        proxy_pass http://127.0.0.1:8000;  # Gunicorn
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # Serve robots.txt as a static file (optional — bypasses Django)
    location = /robots.txt {
        alias /var/www/yourproject/static/robots.txt;
    }
}

PaaS deployments (Heroku, Railway, Render)

On platform-as-a-service hosts you have no nginx access. Use the Django-layer methods:

  • DISALLOWED_USER_AGENTS in settings.py — simplest, zero new files
  • BlockAiBotsMiddleware — more control, 403 response
  • Cloudflare in front of your PaaS domain — WAF custom rules for edge blocking without server access
  • WhiteNoise for static file serving — if using whitenoise, set its WHITENOISE_ROOT setting to a directory containing robots.txt and the file is served at the site root
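
For the WhiteNoise route, one extra setting is needed to serve robots.txt at the URL root rather than under /static/. A minimal sketch, assuming the whitenoise package (WHITENOISE_ROOT is its setting for files served at the root; the root_files directory name is arbitrary):

```python
# settings.py (sketch: assumes the whitenoise package is installed and its
# middleware configured; WHITENOISE_ROOT names a directory of files that
# WhiteNoise serves at the URL root)
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent

# Any file placed here is served at the site root,
# so root_files/robots.txt answers GET /robots.txt directly.
WHITENOISE_ROOT = BASE_DIR / "root_files"
```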

AI Bots to Block

25 user agents covering AI training crawlers and AI search bots. The view and middleware patterns above include all of them.

GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, anthropic-ai, Google-Extended, Bytespider, CCBot, PerplexityBot, meta-externalagent, Amazonbot, Applebot-Extended, xAI-Bot, DeepSeekBot, MistralBot, Diffbot, cohere-ai, AI2Bot, Ai2Bot-Dolma, YouBot, DuckAssistBot, omgili, omgilibot, webzio-extended, gemini-deep-research

Frequently Asked Questions

How do I create a robots.txt view in Django?

Create a view function in views.py that returns an HttpResponse with content_type='text/plain'. Wire it in urls.py with path('robots.txt', views.robots_txt). The view can generate content statically or dynamically — for example, checking settings.DEBUG to block all crawlers in non-production environments. Alternatively, use Django's TemplateResponse with a template file at templates/robots.txt (a plain text file, not HTML) and render it with content_type='text/plain'.
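
The DEBUG check mentioned above can be sketched as a plain function; in a real view you would read the flag from django.conf.settings, and the function form here just keeps the sketch runnable anywhere:

```python
def robots_body(debug: bool, blocked_bots: list[str]) -> str:
    """Build a robots.txt body; lock everything down outside production."""
    if debug:  # dev/staging: keep all crawlers out
        return "User-agent: *\nDisallow: /\n"
    lines = ["User-agent: *", "Allow: /", ""]
    for bot in blocked_bots:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    return "\n".join(lines)

print(robots_body(True, ["GPTBot"]))   # blocks every crawler
print(robots_body(False, ["GPTBot"]))  # blocks only the listed bots
```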

What is DISALLOWED_USER_AGENTS in Django?

DISALLOWED_USER_AGENTS is a built-in Django setting processed by django.middleware.common.CommonMiddleware. Set it to a list of compiled regular expressions — any request whose User-Agent matches a pattern is rejected automatically with a 403 Forbidden response, before your view runs. It requires CommonMiddleware to be in your MIDDLEWARE list (it is by default in all Django projects). This is the simplest zero-code approach to blocking specific bots. Example: DISALLOWED_USER_AGENTS = [re.compile(r'GPTBot'), re.compile(r'ClaudeBot'), re.compile(r'CCBot')].

Should I use DISALLOWED_USER_AGENTS or custom middleware for AI bot blocking?

DISALLOWED_USER_AGENTS is simpler — add patterns to settings.py, no new files needed. Its limitations are that the 403 response cannot be customized or logged, and the pattern list lives in settings.py, so it is not easily updatable at runtime. Custom middleware is more flexible: you can customize the response, log blocked requests, load the bot list from a database or cache, and update it without redeploying. For most sites, DISALLOWED_USER_AGENTS is sufficient and the fastest to implement.

How do I serve robots.txt as a static file in Django?

In production Django deployments, static files are typically served by nginx or Apache directly — not by Django. The cleanest approach is to have nginx serve robots.txt from your project's static files directory, bypassing Django entirely. Add the file to your STATICFILES_DIRS (e.g. project/static/robots.txt) and configure nginx to serve /robots.txt from that path. In development (DEBUG=True), Django's staticfiles app serves it automatically at /robots.txt if placed in a static/ directory.

How do I add noai meta tags to every page in Django?

Add the meta tag to your base Django template — typically templates/base.html or templates/base/base.html. Place <meta name="robots" content="noai, noimageai"> inside the {% block head %} section or directly in the <head> before any blocks. All templates that extend this base will inherit the tag. For per-page override, define a {% block robots_meta %} in base.html and override it in specific templates with a different robots directive.

Does blocking AI bots affect Django's SEO or sitemap framework?

No. Blocking GPTBot, ClaudeBot, CCBot, and other AI crawlers does not affect Googlebot or Bingbot. Django's django.contrib.sitemaps framework continues generating sitemap.xml normally. Your search engine rankings are unaffected. If using DISALLOWED_USER_AGENTS, be careful to only include AI training bot names — not 'bot' or 'crawler' as generic patterns, as this would block legitimate search engine crawlers.
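
The warning about generic patterns is easy to demonstrate: a bare "bot" pattern matches Googlebot's User-Agent, while the specific names used throughout this guide do not (the UA string below is illustrative):

```python
import re

too_broad = re.compile(r"bot", re.IGNORECASE)     # what NOT to do
specific = re.compile(r"GPTBot|ClaudeBot|CCBot")  # names from this guide

# Googlebot's (illustrative) User-Agent string
googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(bool(too_broad.search(googlebot)))  # True: Googlebot would be blocked
print(bool(specific.search(googlebot)))   # False: search crawlers unaffected
```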
