
How to Block AI Bots on WordPress

WordPress powers 43% of the web — and AI crawlers know it. This guide covers every method from a 30-second robots.txt edit to server-level .htaccess blocking.

WordPress robots.txt works differently

WordPress generates a virtual robots.txt — there is no physical file by default. This means you can't just FTP in and edit it. You need to either use an SEO plugin (Yoast, AIOSEO) to edit the virtual file, or create a physical robots.txt that overrides it. Both methods work — the physical file takes precedence when it exists.
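If you're comfortable with a little code, WordPress also exposes a robots_txt filter that lets you append rules to the virtual file from a child theme or small plugin. A minimal sketch (the bot list is abbreviated — use the full list from the Quick Block section in practice):

```php
<?php
// In a child theme's functions.php or a small must-use plugin.
// Appends AI-bot rules to WordPress's virtual robots.txt.
// Has no effect if a physical robots.txt exists, since that file
// is served by the web server before WordPress loads.
add_filter( 'robots_txt', function ( $output, $public ) {
    $ai_bots = array( 'GPTBot', 'ClaudeBot', 'Google-Extended', 'CCBot', 'Bytespider' );
    foreach ( $ai_bots as $bot ) {
        $output .= "\nUser-agent: {$bot}\nDisallow: /\n";
    }
    return $output;
}, 10, 2 );
```

This survives plugin changes and keeps the rules in version control if your theme is.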

Quick block — paste into your WordPress robots.txt

Via Yoast: SEO → Tools → File Editor. Via AIOSEO: All in One SEO → Tools → Robots.txt. Via physical file: /public_html/robots.txt.

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /

User-agent: GoogleOther-ImageFetcher
Disallow: /

This does NOT affect Googlebot, Bingbot, or other traditional search crawlers — your SEO rankings are safe.
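Before deploying, you can sanity-check that the rules parse the way you expect with Python's built-in urllib.robotparser (shown here with an abbreviated rule list — paste your full list in practice):

```python
from urllib.robotparser import RobotFileParser

# Abbreviated version of the rules above.
rules = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# AI bots are blocked from the whole site...
print(rp.can_fetch("GPTBot", "/any-page/"))    # False
print(rp.can_fetch("ClaudeBot", "/"))          # False
# ...while agents with no matching rule, like Googlebot, stay allowed.
print(rp.can_fetch("Googlebot", "/"))          # True
```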

5 Methods to Block AI Bots on WordPress

Yoast SEO (Free)

Easy

SEO → Tools → File Editor

Edit virtual robots.txt directly in the WordPress admin panel. Changes are instant, no FTP needed.

A physical robots.txt file, if present, takes precedence over the virtual file Yoast edits. Cannot block bots that ignore robots.txt.

All in One SEO (Free)

Easy

All in One SEO → Tools → Robots.txt

Full robots.txt editor built in. Supports per-rule management. Widely used alternative to Yoast.

Same virtual robots.txt limitation as Yoast.

Physical robots.txt file

Intermediate

FTP / cPanel File Manager → /public_html/robots.txt

Creates a real file that takes precedence over WordPress virtual robots.txt. Requires FTP or hosting control panel access.

Some managed hosts (WP Engine, Pantheon) override physical files. Check with your host.

.htaccess (Apache)

Advanced

FTP → /public_html/.htaccess

Server-level blocking. Stops requests before WordPress loads. Only method that stops bots ignoring robots.txt.

Apache only. Nginx sites need a different config. Requires care — a syntax error can take your site down.

Cloudflare WAF Rules

Advanced

Cloudflare Dashboard → Security → WAF

Block at the network edge before requests reach your server. Handles Bytespider and other robots.txt violators.

Requires Cloudflare (free plan works for basic rules). Paid plan needed for advanced WAF.

Method 1: Yoast SEO (Easiest)

Yoast is installed on over 10 million WordPress sites. It includes a robots.txt editor in the free version — no premium upgrade needed for this.

  1. In your WordPress admin, go to SEO → Tools → File Editor
  2. You'll see the robots.txt editor. It will already have a default block for Yoast-generated sitemaps.
  3. Paste the AI bot rules from the Quick Block section above after the existing rules.
  4. Click Save changes to robots.txt. Done — no FTP required.
  5. Verify by visiting yourdomain.com/robots.txt in your browser.
Note: If you don't see a File Editor under SEO → Tools, it may be disabled. Go to SEO → General → Features and enable the "Advanced settings pages" toggle.

Method 2: All in One SEO

  1. Go to All in One SEO → Tools → Robots.txt Editor
  2. Toggle to Enable Custom Robots.txt if not already on.
  3. Paste the full block rules from above into the editor.
  4. Click Save Changes. Verify at yourdomain.com/robots.txt.

Method 3: Physical robots.txt File

If a physical robots.txt file exists at your domain root, it overrides WordPress's virtual one. This is the most reliable method when you have FTP or hosting file manager access.

  1. Connect via FTP or open your hosting control panel's File Manager.
  2. Navigate to /public_html/ (your WordPress root folder).
  3. Check if robots.txt already exists. If it does, edit it. If not, create a new file named robots.txt.
  4. Paste the rules from the Quick Block section. Make sure to keep any existing Sitemap: lines.
WP Engine / Flywheel / Pantheon users: Some managed hosts intercept robots.txt at the server layer. Check your host documentation — you may need to use their dashboard to set robots.txt instead of uploading a file directly.

Method 4: .htaccess Server Blocking

robots.txt is a convention — bots that ignore it (looking at you, Bytespider) will still crawl your site. .htaccess blocking happens at the Apache server layer before WordPress even loads. It's the strongest protection available on Apache-based hosts.

⚠ Back up your .htaccess before editing

A syntax error in .htaccess will return a 500 error for your entire site. Download the file first.

Add this block near the top of your .htaccess, before the WordPress permalink block:

# Block AI training crawlers
<IfModule mod_setenvif.c>
  SetEnvIfNoCase User-Agent "GPTBot" bad_bot
  SetEnvIfNoCase User-Agent "ChatGPT-User" bad_bot
  SetEnvIfNoCase User-Agent "OAI-SearchBot" bad_bot
  SetEnvIfNoCase User-Agent "ClaudeBot" bad_bot
  SetEnvIfNoCase User-Agent "anthropic-ai" bad_bot
  SetEnvIfNoCase User-Agent "Google-Extended" bad_bot
  SetEnvIfNoCase User-Agent "Bytespider" bad_bot
  SetEnvIfNoCase User-Agent "CCBot" bad_bot
  SetEnvIfNoCase User-Agent "PerplexityBot" bad_bot
  SetEnvIfNoCase User-Agent "meta-externalagent" bad_bot
  SetEnvIfNoCase User-Agent "Amazonbot" bad_bot
  SetEnvIfNoCase User-Agent "Applebot-Extended" bad_bot
  SetEnvIfNoCase User-Agent "xAI-Bot" bad_bot
  SetEnvIfNoCase User-Agent "DeepSeekBot" bad_bot
  SetEnvIfNoCase User-Agent "MistralBot" bad_bot
  SetEnvIfNoCase User-Agent "Diffbot" bad_bot
  SetEnvIfNoCase User-Agent "cohere-ai" bad_bot
  SetEnvIfNoCase User-Agent "AI2Bot" bad_bot
  SetEnvIfNoCase User-Agent "YouBot" bad_bot
  SetEnvIfNoCase User-Agent "DuckAssistBot" bad_bot
  SetEnvIfNoCase User-Agent "omgilibot" bad_bot
  SetEnvIfNoCase User-Agent "webzio-extended" bad_bot
  SetEnvIfNoCase User-Agent "gemini-deep-research" bad_bot
</IfModule>

<IfModule mod_authz_core.c>
  <RequireAll>
    Require all granted
    Require not env bad_bot
  </RequireAll>
</IfModule>

This returns 403 Forbidden to blocked bots. They cannot read your content regardless of what they put in robots.txt. Works on Apache (most shared hosting). Nginx users need a different approach — see below.
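SetEnvIfNoCase matches its pattern case-insensitively anywhere in the User-Agent header, which is why it catches the long compound UA strings real crawlers send. A quick Python illustration of the same matching logic (not Apache itself, just the equivalent regex check):

```python
import re

# Real-world UA strings are long; the token just has to appear
# somewhere in the header, in any letter case.
patterns = ["GPTBot", "ClaudeBot", "Bytespider"]

ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
      "GPTBot/1.2; +https://openai.com/gptbot)")

def is_blocked(user_agent: str) -> bool:
    # Mirrors SetEnvIfNoCase: unanchored, case-insensitive regex search.
    return any(re.search(p, user_agent, re.IGNORECASE) for p in patterns)

print(is_blocked(ua))                          # True  -> would get 403
print(is_blocked("Mozilla/5.0 Chrome/120.0"))  # False -> served normally
```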

On Nginx? Use location blocks instead

# In your Nginx site config
if ($http_user_agent ~* "(GPTBot|ClaudeBot|Google-Extended|Bytespider|CCBot|anthropic-ai|PerplexityBot|meta-externalagent|DeepSeekBot|MistralBot|xAI-Bot|Diffbot|cohere-ai|AI2Bot)") {
    return 403;
}

Add this inside the server block. Test the config with nginx -t, then reload with nginx -s reload (or systemctl reload nginx).
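If the bot list grows, Nginx's map directive is the more idiomatic way to express the same check — a sketch, with the map placed in the http block and the list trimmed to match yours:

```nginx
# In the http { } block: set $ai_bot to 1 when the UA matches.
map $http_user_agent $ai_bot {
    default                                  0;
    "~*(GPTBot|ClaudeBot|Bytespider|CCBot)"  1;
}

# In the server { } block: refuse matched bots.
server {
    # ... existing listen/server_name/root directives ...
    if ($ai_bot) {
        return 403;
    }
}
```

map is evaluated lazily and keeps the regex in one place, which is easier to maintain than a long inline if condition.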

Adding noai Meta Tags to WordPress

The noai and noimageai meta tags tell AI training bots not to use your page content — even when they do visit. It's a belt-and-suspenders approach alongside robots.txt.

Option A: Add via functions.php (all pages)

// Add to your child theme's functions.php
add_action('wp_head', function() {
    echo '<meta name="robots" content="noai, noimageai">' . PHP_EOL;
});
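A complementary option is sending the directive as an HTTP response header, which also covers non-HTML responses such as attachment pages. A sketch using WordPress's send_headers hook (assumes nothing else on your server already sets X-Robots-Tag):

```php
<?php
// Child theme functions.php: emit noai/noimageai as an HTTP header
// on every WordPress-served response.
add_action( 'send_headers', function () {
    if ( ! headers_sent() ) {
        header( 'X-Robots-Tag: noai, noimageai' );
    }
} );
```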

Option B: Add via Yoast SEO (per-page control)

  1. In the WordPress editor, scroll to the Yoast SEO panel below the post content.
  2. Click the Advanced tab.
  3. In the Meta robots section, newer Yoast versions let you add custom robots directives, including noai.

Option C: Rank Math plugin

Rank Math SEO has a built-in noai/noimageai toggle in its Advanced tab for each post/page. Enable "No AI Training" per page, or globally under Rank Math → Titles & Meta → Global Meta.

AI Bot Reference Table

What you're actually blocking — and why each matters for WordPress sites.

| User-Agent | Company | Type | Priority |
| --- | --- | --- | --- |
| GPTBot | OpenAI (ChatGPT) | Training | High |
| ChatGPT-User | OpenAI | Live browsing | High |
| OAI-SearchBot | OpenAI (ChatGPT Search) | AI Search | Medium |
| ClaudeBot | Anthropic | Training | High |
| anthropic-ai | Anthropic | Training | High |
| Google-Extended | Google (Gemini) | Training | High |
| Bytespider | ByteDance (TikTok) | Training | Critical — ignores robots.txt |
| CCBot | Common Crawl | Training (50+ models) | High |
| PerplexityBot | Perplexity AI | AI Search | Medium |
| meta-externalagent | Meta (Llama) | Training | High |
| xAI-Bot | xAI (Grok) | Training | Medium |
| DeepSeekBot | DeepSeek | Training | Medium |
| MistralBot | Mistral AI | Training | Medium |

Will Blocking AI Bots Hurt My WordPress SEO?

Safe to block — no SEO impact

  • ✓ GPTBot (OpenAI training)
  • ✓ ClaudeBot (Anthropic training)
  • ✓ Google-Extended (Gemini training)
  • ✓ CCBot (Common Crawl)
  • ✓ Bytespider (ByteDance training)
  • ✓ meta-externalagent (Meta Llama)
  • ✓ DeepSeekBot, MistralBot, xAI-Bot
  • ✓ All other AI training crawlers

Think before blocking — SEO tradeoff

  • ⚠ OAI-SearchBot → removes from ChatGPT Search
  • ⚠ PerplexityBot → removes from Perplexity AI
  • ⚠ YouBot → removes from You.com AI
  • ⚠ DuckAssistBot → removes from DuckDuckGo AI
  • These are AI search bots, not training bots. Only block if you don't want AI search visibility.

Verify Your WordPress Block Is Working

1. Check robots.txt is live

Visit https://yourdomain.com/robots.txt — you should see your new Disallow rules for each AI bot.

2. Check the file in Google Search Console

Google Search Console → Settings → robots.txt shows the fetched file and any parse errors. Google's standalone robots.txt Tester has been retired, so to test a specific user agent such as GPTBot, use a third-party robots.txt testing tool and confirm it reports the path as blocked.

3. Check server logs for past visits

If you have access to cPanel → Logs → Raw Access, download your access logs and search for GPTBot/ClaudeBot to see historical crawl activity.

grep -i "gptbot\|claudebot\|bytespider" /var/log/apache2/access.log | tail -20
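To get a per-bot hit count instead of raw log lines, pipe the matches through sort and uniq. The sketch below runs against a small synthetic log so the output shape is visible; point the grep at your real access log instead:

```shell
# Build a tiny sample access log (stand-in for /var/log/apache2/access.log)
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 1234 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"
5.6.7.8 - - [01/Jan/2025:00:00:02 +0000] "GET /post/ HTTP/1.1" 200 4321 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"
9.9.9.9 - - [01/Jan/2025:00:00:03 +0000] "GET / HTTP/1.1" 200 987 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"
EOF

# Count hits per AI bot: -o prints only the matched token,
# -i matches case-insensitively; uniq -c tallies each bot.
grep -oiE "gptbot|claudebot|bytespider" /tmp/sample_access.log | sort | uniq -c
```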

4. Run the Open Shadow scanner

The free site scanner checks your robots.txt, detects which bots are blocked, and gives you an AI readiness score. No login required.

Notes for Managed WordPress Hosts

WP Engine

WP Engine intercepts robots.txt at the server layer. Edit via their User Portal → Sites → [your site] → Robots.txt. Do not try to upload a physical file — it will be ignored.

Kinsta

Kinsta respects physical robots.txt files. You can upload one via SFTP to the public root, or use Yoast/AIOSEO normally.

SiteGround / Bluehost / GoDaddy Hosting

Standard Apache hosting — all methods work. Physical robots.txt file via cPanel File Manager is usually the simplest approach without plugins.

Cloudflare (proxied)

If your WordPress site is behind Cloudflare, you can add WAF Custom Rules to block AI bots at the network edge. This is the strongest option — requests are blocked before they reach WordPress.
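In the rule builder, you can switch to the expression editor and use a filter along these lines (field names are Cloudflare's expression language; trim the list to the bots you care about, and set the rule action to Block):

```
(http.user_agent contains "GPTBot") or
(http.user_agent contains "ClaudeBot") or
(http.user_agent contains "Bytespider") or
(http.user_agent contains "CCBot")
```

Note that contains is case-sensitive, so match the casing each bot actually sends (the user-agent strings in the reference table above).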

WordPress.com (hosted)

On WordPress.com hosted plans (not self-hosted), you cannot edit .htaccess. You can edit robots.txt under Tools → Marketing → Traffic. The Business plan and above also allow custom code.

Frequently Asked Questions

Does WordPress automatically block AI bots?

No. WordPress generates a virtual robots.txt that allows all user agents by default. Without action, GPTBot, ClaudeBot, CCBot, and all other AI crawlers can and will crawl your site.

Which method should I use — Yoast, physical file, or .htaccess?

For most WordPress sites: start with Yoast or AIOSEO (easiest, no FTP). If you don't use an SEO plugin, use the physical robots.txt file via cPanel. Add .htaccess blocking only if you want to stop bots that ignore robots.txt (primarily Bytespider). You can use all three together — they don't conflict.

Will blocking AI bots hurt my WordPress SEO?

No. Blocking AI training bots (GPTBot, ClaudeBot, CCBot, Google-Extended, Bytespider) has zero effect on Google Search, Bing, or any traditional search rankings. The only tradeoff is with AI search bots like OAI-SearchBot or PerplexityBot — blocking those removes you from AI-powered search results.

Can Bytespider bypass my WordPress robots.txt?

Yes. Bytespider (ByteDance's crawler) has been documented ignoring robots.txt on some sites. For Bytespider specifically, use .htaccess or Cloudflare WAF rules for server-level blocking. robots.txt is a convention, not enforcement.

How do I add a noai meta tag to all my WordPress posts?

Add to your child theme's functions.php: add_action('wp_head', function() { echo '<meta name="robots" content="noai, noimageai">'; }); This adds the tag to every page. Use a child theme so it survives theme updates.

I'm on a multisite install — does this affect all subsites?

For WordPress Multisite, robots.txt at the network root applies to all subsites by default. If subsites are on subdomains, each subdomain needs its own robots.txt. For per-subsite robots.txt, use a plugin that handles multisite — Yoast Multisite or AIOSEO Pro both support this.

Is your site protected from AI bots?

Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.

Scan My Site Free →

Related Guides