
How to Block AI Bots on Hugo

Hugo's fast build times and clean semantic HTML make it a popular target for AI crawlers. Blocking them takes two minutes — a plain robots.txt in your static/ directory and a meta tag in your base template. No server access, no plugins.

Hugo's static/ directory is the easiest path

Hugo has two ways to serve a robots.txt: (1) drop a plain text file in static/robots.txt — Hugo copies it directly to public/robots.txt with zero processing, or (2) create a layouts/robots.txt template and configure an output format in hugo.toml. For AI bot blocking, the static/ approach is strongly recommended — it's simpler, harder to misconfigure, and works on every Hugo hosting platform without extra setup.

Quick fix — create static/robots.txt in your Hugo root

The static/ folder sits at the same level as your content/ and layouts/ directories. Run hugo to build, then verify the result at public/robots.txt.

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
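To confirm the rules parse the way you expect, you can feed them to Python's standard-library robots.txt parser. This is just a local sanity check, nothing Hugo-specific:

```python
from urllib.robotparser import RobotFileParser

# The same rules as in static/robots.txt above.
rules = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Listed bots may not fetch anything; everyone else still can.
print(parser.can_fetch("GPTBot", "/posts/hello/"))     # → False
print(parser.can_fetch("Googlebot", "/posts/hello/"))  # → True
```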

Available Methods

Method 1: static/robots.txt (Easy, recommended)
Where: static/robots.txt, copied as-is to public/robots.txt at build.
Hugo copies everything in static/ to public/ unchanged, so a plain robots.txt here requires no configuration and works on all Hugo deployments and hosting platforms. No Go template syntax needed — just plain text.

Method 2: layouts/robots.txt template (Advanced)
Where: layouts/robots.txt plus an output format config in hugo.toml.
Define robots.txt as a Hugo output format to generate it via Go templates. Useful if you need dynamic rules based on site config, environment, or content params. Requires configuring mediaTypes and outputFormats in hugo.toml; for AI blocking, static/ is simpler.

Method 3: noai meta tag in baseof.html (Easy)
Where: layouts/_default/baseof.html (or layouts/partials/head.html).
Add the noai/noimageai meta tag in your base template's <head> block so it applies to every page, or wrap it in a front matter conditional for per-page control. For Hugo module themes, copy the relevant partial into your own layouts/ to override it.

Method 4: Cloudflare WAF / _headers file (Intermediate)
Where: Cloudflare Dashboard → WAF → Custom Rules, or static/_headers for Cloudflare Pages.
Edge-level blocking before requests reach your Hugo static files, and the only reliable method for bots that ignore robots.txt. The _headers file works natively on Cloudflare Pages (put it in static/ and it is copied to the output root) and on Netlify, which also accepts a [[headers]] block in netlify.toml.

Method 1: static/robots.txt (Recommended)

Hugo copies every file in static/ to public/ verbatim during the build. No Go template processing, no configuration required. This is the simplest and most reliable approach.

  1. In your Hugo project root, create static/robots.txt. The static/ folder is at the same level as content/, layouts/, and hugo.toml.

  2. Paste the full AI bot block list:

User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /
  3. Build and verify locally:

     hugo
     head -n 10 public/robots.txt

  4. Deploy. If using git-based deployment (Netlify, Cloudflare Pages, Vercel, GitHub Pages):

     git add static/robots.txt
     git commit -m "Block AI training bots via robots.txt"
     git push origin main
A note on Hugo's enableRobotsTXT setting: enableRobotsTXT = true in hugo.toml only tells Hugo to generate robots.txt from the layouts/ template system (there is no disableRobotsTXT option). Either way, the setting has no effect on files in static/: a static/robots.txt is always copied to public/ at build.
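With 25 bots in the list, the file is tedious to maintain by hand. A short Python script can generate it from a single list; the script name and the redirect into static/robots.txt below are just one way to wire it up:

```python
# gen_robots.py (hypothetical name): print a robots.txt that allows
# everything by default, then blocks each AI bot in the list.

AI_BOTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot", "anthropic-ai",
    "Google-Extended", "Bytespider", "CCBot", "PerplexityBot",
    "meta-externalagent", "Amazonbot", "Applebot-Extended", "xAI-Bot",
    "DeepSeekBot", "MistralBot", "Diffbot", "cohere-ai", "AI2Bot",
    "Ai2Bot-Dolma", "YouBot", "DuckAssistBot", "omgili", "omgilibot",
    "webzio-extended", "gemini-deep-research",
]

def render_robots(bots):
    # One "User-agent / Disallow" block per bot, blank-line separated.
    blocks = ["User-agent: *\nAllow: /"]
    blocks += [f"User-agent: {bot}\nDisallow: /" for bot in bots]
    return "\n\n".join(blocks) + "\n"

if __name__ == "__main__":
    print(render_robots(AI_BOTS), end="")
```

Run it from the project root and redirect the output: python gen_robots.py > static/robots.txt.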

Method 2: layouts/robots.txt Template (Advanced)

For dynamic robots.txt generation using Go templates — useful if you want to reference site config variables or environment — you can define robots.txt as a Hugo output format.

Step 1: Add output format to hugo.toml

[mediaTypes]
  [mediaTypes."text/plain"]
    suffixes = ["txt"]

[outputFormats]
  [outputFormats.robots]
    mediaType = "text/plain"
    baseName = "robots"
    isPlainText = true
    notAlternative = true

[outputs]
  home = ["HTML", "RSS", "robots"]

Step 2: Create layouts/robots.txt

User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# ... (add full bot list)

Sitemap: {{ .Site.BaseURL }}sitemap.xml
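One reason to prefer the template route is environment-dependent rules. The sketch below assumes Hugo's built-in hugo.Environment variable (it reads "production" for plain hugo builds and "development" under hugo server) and only emits the blocks in production:

```
User-agent: *
Allow: /

{{ if eq hugo.Environment "production" }}
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

# ... (add full bot list)
{{ end }}

Sitemap: {{ .Site.BaseURL }}sitemap.xml
```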

Method 3: noai Meta Tag via Base Template

Add the noai meta tag to every Hugo page by editing your base template or head partial.

Option A: Edit layouts/_default/baseof.html

<!-- In the <head> block, before </head>: -->
<meta name="robots" content="noai, noimageai">

Option B: Edit or create layouts/partials/head.html

<!-- Add to layouts/partials/head.html: -->
<meta name="robots" content="noai, noimageai">

Per-page control via front matter:

{{/* In layouts/_default/baseof.html or head partial: */}}
{{ if .Params.noai }}
  <meta name="robots" content="noai, noimageai">
{{ end }}

# Then in any content file's front matter (YAML):
---
title: "Protected Post"
noai: true
---

# Or TOML:
+++
title = "Protected Post"
noai = true
+++
Using a Hugo theme? If your theme is installed as a Hugo module (go.mod) or a git submodule in themes/, the template files live inside the theme directory, not your project root. To override: create a matching path in your project's layouts/ directory and Hugo will use your version. For example, to override a theme's head partial at themes/mytheme/layouts/partials/head.html, create layouts/partials/head.html in your project root.

Method 4: Cloudflare WAF / _headers File

For edge-level blocking — the only method effective against bots that ignore robots.txt — proxy through Cloudflare or use platform-native header files.

Cloudflare Pages — static/_headers

Create static/_headers — Hugo copies it to public/_headers, which Cloudflare Pages reads natively.

/*
  X-Robots-Tag: noai, noimageai

Sets the HTTP X-Robots-Tag header on all responses — more authoritative than the HTML meta tag.

Netlify — netlify.toml

Netlify also reads a _headers file from the publish directory; alternatively, declare the headers in netlify.toml:

[[headers]]
  for = "/*"
  [headers.values]
    X-Robots-Tag = "noai, noimageai"

Cloudflare WAF custom rule expression:

(http.user_agent contains "GPTBot") or (http.user_agent contains "ClaudeBot") or (http.user_agent contains "CCBot") or (http.user_agent contains "Bytespider") or (http.user_agent contains "Google-Extended") or (http.user_agent contains "Diffbot") or (http.user_agent contains "meta-externalagent") or (http.user_agent contains "DeepSeekBot")

Action: Block. Free plan supports basic string matching. Pro adds regex and rate limiting.
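To sanity-check which user-agent strings the expression would catch, you can mirror its substring logic in a few lines of Python. This is only a local approximation of Cloudflare's contains operator, not a substitute for testing the rule in the dashboard:

```python
# Substrings matched by the WAF expression above.
BLOCKED_UAS = [
    "GPTBot", "ClaudeBot", "CCBot", "Bytespider",
    "Google-Extended", "Diffbot", "meta-externalagent", "DeepSeekBot",
]

def would_block(user_agent):
    # Mirrors `http.user_agent contains "..."`: plain substring match.
    return any(bot in user_agent for bot in BLOCKED_UAS)

print(would_block("Mozilla/5.0 (compatible; GPTBot/1.2)"))     # → True
print(would_block("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # → False
```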

Full AI Bot Reference

All 25 AI bots covered by the robots.txt block list above:

GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, anthropic-ai, Google-Extended, Bytespider, CCBot, PerplexityBot, meta-externalagent, Amazonbot, Applebot-Extended, xAI-Bot, DeepSeekBot, MistralBot, Diffbot, cohere-ai, AI2Bot, Ai2Bot-Dolma, YouBot, DuckAssistBot, omgili, omgilibot, webzio-extended, gemini-deep-research

Frequently Asked Questions

Where do I put robots.txt in a Hugo site?

The simplest approach: put robots.txt in the static/ directory at your Hugo project root. Hugo copies everything in static/ directly to the public/ output directory unchanged. So static/robots.txt becomes public/robots.txt and is served at yourdomain.com/robots.txt. Alternatively, you can create a layouts/robots.txt template file and configure Hugo to output it as plain text — this lets you use Go template logic to generate rules dynamically. Both methods work on Hugo 0.100+.

How do I add a noai meta tag to every Hugo page?

Edit your base template: layouts/_default/baseof.html. Find the <head> section and add <meta name="robots" content="noai, noimageai"> before </head>. If your theme uses a head partial (common pattern), look for layouts/partials/head.html or similar and add the tag there. For themes installed as Hugo modules (go.mod), copy the relevant partial into your own layouts/partials/ directory to override it — Hugo will prefer your local copy over the theme's.

Can Hugo generate robots.txt as a template?

Yes. Create layouts/robots.txt with your rules as a Go template file, then register a robots output format in hugo.toml (or config.toml): a "text/plain" media type with the txt suffix, an [outputFormats.robots] block with mediaType = "text/plain", baseName = "robots", isPlainText = true, and notAlternative = true, plus "robots" added to the home outputs list, exactly as shown in Method 2 above. Hugo then generates robots.txt at build time from your template. For simple static AI bot blocking, putting a plain file in static/ is faster and less error-prone.

How do I control AI bot access per page in Hugo?

Add a custom front matter variable to specific content files (e.g. noai: true in the YAML/TOML front matter), then use a Go template conditional in your base layout or head partial: {{ if .Params.noai }}<meta name="robots" content="noai, noimageai">{{ end }}. This renders the noai meta tag only on pages where noai: true is set, giving you per-page control without affecting the rest of the site.

Will blocking AI bots affect Hugo's sitemap or SEO?

No. Blocking GPTBot, ClaudeBot, CCBot, and other AI training bots has zero effect on Googlebot or Bingbot. Hugo's built-in sitemap generation (public/sitemap.xml) is completely unaffected. Search engine crawlers operate independently of AI training crawlers and read robots.txt separately. Your Hugo site's search rankings, sitemap, and canonical URLs will continue working normally.

Does Hugo work with Cloudflare for bot blocking?

Yes — proxy your domain through Cloudflare (any DNS-based setup works) and create custom WAF rules to block AI crawlers by user agent. This is the most reliable method for bots like Bytespider that have been documented ignoring robots.txt. Hugo static sites are commonly deployed on Cloudflare Pages, Netlify, Vercel, or GitHub Pages — all of which can be fronted with Cloudflare. For Cloudflare Pages specifically, you can also add a _headers file to your static/ directory to set response headers for AI bot blocking.

