Applebot-Extended is Apple's AI training user agent, separate from the Applebot that powers Siri and Spotlight. Here's how to opt out of Apple Intelligence training without losing your search presence.
Apple introduced Applebot-Extended in 2024 alongside Apple Intelligence, the suite of on-device and cloud AI features that arrived with iOS 18 and macOS Sequoia. The original Applebot has been crawling the web since 2015 for Siri, Spotlight, and App Store indexing. Applebot-Extended exists for one purpose: deciding whether your content is used to train Apple's AI models. Apple's own documentation describes it as a secondary user agent that does not fetch pages itself; it governs how the data Applebot collects may be used, which is why robots.txt is the main control point.
The key difference: regular Applebot helps your content get found (Siri suggestions, Spotlight results, Safari Reader). Applebot-Extended does something different — it consumes your content as model training data, which means your writing, product descriptions, and original content could end up shaping Apple's AI outputs without any attribution or compensation.
User agent string
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15 (Applebot-Extended/1.0)
The token Applebot-Extended is what robots.txt and server rules match against.
These are two distinct user agents. Blocking one does not block the other, with one exception: if you block the parent (Applebot), Applebot-Extended inherits that block.
| Property | Applebot | Applebot-Extended |
|---|---|---|
| Purpose | Siri, Spotlight, App Store, Safari Reader | Apple Intelligence AI model training |
| Active since | 2015 | 2024 (Apple Intelligence launch) |
| User agent token | Applebot | Applebot-Extended |
| Respects robots.txt | Yes | Yes (inherits from Applebot rules) |
| Blocking affects Siri/Spotlight | Yes | No |
| Blocking affects AI training | Yes (parent) | Yes (direct) |
Add a dedicated block for Applebot-Extended while leaving Applebot allowed. Your Siri and Spotlight presence is unaffected.
# robots.txt
User-agent: Applebot-Extended
Disallow: /
Place this in robots.txt at your domain root (e.g. https://yoursite.com/robots.txt).
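Before deploying, you can sanity-check the rule locally. The sketch below parses the same two lines with Python's standard `urllib.robotparser` and asks whether each token may fetch the site root. It shows how a generic parser reads the rule, on the assumption that Apple's handling agrees; `ROBOTS_TXT` and `check_access` are names made up for this example, and `https://yoursite.com/` is the placeholder domain used above.

```python
from urllib.robotparser import RobotFileParser

# The rule from above, held in a string so nothing is fetched over the network.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /
"""


def check_access(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Applebot", "Applebot-Extended"):
        allowed = check_access(ROBOTS_TXT, agent, "https://yoursite.com/")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

One caveat: `urllib.robotparser` applies the first group whose name matches the agent, so results for files that contain both an Applebot group and an Applebot-Extended group may depend on the order of the groups.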
If you want to fully exclude Apple's crawlers, block both. Note: this will remove your site from Siri suggestions and Spotlight indexing.
# robots.txt — block all Apple crawlers
User-agent: Applebot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
Protect original writing and premium content from AI training while allowing Apple to crawl marketing pages.
# robots.txt — protect specific paths
User-agent: Applebot-Extended
Disallow: /blog/
Disallow: /articles/
Disallow: /members/
Allow: /
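The trailing `Allow: /` does not cancel the `Disallow` lines. Under the Robots Exclusion Protocol (RFC 9309), the longest matching rule decides, and Allow wins only when two matching rules are equally long. Below is a simplified sketch of that precedence, assuming plain path prefixes with no `*` or `$` wildcards; `RULES` and `is_allowed` are illustrative names, and this is not Apple's actual matcher.

```python
# (directive, path-prefix) pairs copied from the Applebot-Extended group above.
RULES = [
    ("disallow", "/blog/"),
    ("disallow", "/articles/"),
    ("disallow", "/members/"),
    ("allow", "/"),
]


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Longest-prefix match: the most specific matching rule decides.

    Simplified: plain prefixes only, no '*' or '$' wildcards.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        # A longer prefix wins; on equal length, Allow wins the tie.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


if __name__ == "__main__":
    for path in ("/", "/pricing", "/blog/my-post", "/members/account"):
        print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

Note that `/blog` without the trailing slash is not covered by `Disallow: /blog/`, so if your blog index lives at that exact path you may want to disallow `/blog` instead.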
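For stronger enforcement, add a server-level block. robots.txt relies on the crawler honoring it, while a server rule returns 403 to any request whose user agent contains the token. Apple's documentation says Applebot-Extended does not fetch pages itself, so these rules only catch requests that actually carry the token; keep the robots.txt rule as your primary opt-out.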
# nginx: block Applebot-Extended (place inside a server or location block)
if ($http_user_agent ~* "Applebot-Extended") {
    return 403;
}
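# Apache 2.4 .htaccess: block Applebot-Extended (case-insensitive match)
# Order/Allow/Deny is the deprecated 2.2 syntax; 2.4 uses Require.
BrowserMatchNoCase "Applebot-Extended" bad_bot
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>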
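If you cannot edit the web server configuration, the same check can live in the application. The following is a minimal sketch for a Python WSGI app, assuming the request's User-Agent header carries the token; `block_applebot_extended` and `BLOCKED_TOKEN` are hypothetical names, and the match is case-insensitive like the nginx `~*` rule.

```python
BLOCKED_TOKEN = "applebot-extended"  # compared in lowercase


def block_applebot_extended(app):
    """Wrap a WSGI app; answer 403 when the User-Agent contains the token."""

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "")
        if BLOCKED_TOKEN in agent.lower():
            body = b"Forbidden"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Every other request passes through to the wrapped app unchanged.
        return app(environ, start_response)

    return middleware
```

Wrap your existing WSGI callable once at startup, for example `application = block_applebot_extended(application)`.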
⚠️ Applebot-Extended inherits from Applebot
If you block Applebot from a path, Applebot-Extended is also blocked from that path, even if you have no explicit Applebot-Extended rule. This means that if you already have a blanket Disallow: / for Applebot, Applebot-Extended cannot access your site at all.
The reverse is not true: blocking Applebot-Extended does not affect Applebot. The inheritance is one-way — child inherits from parent, not the other way around.
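The one-way inheritance can be written down as a small truth function. This is a sketch of the behavior described above for a single path, not Apple's implementation; `apple_access` and its parameter names are invented for illustration.

```python
def apple_access(applebot_blocked: bool, extended_blocked: bool) -> dict[str, bool]:
    """Model the one-way inheritance described above for a single path.

    Each input says whether robots.txt disallows the path for that token.
    """
    search = not applebot_blocked
    # Training needs both: the parent must be able to crawl the path,
    # and the Applebot-Extended token must not be disallowed.
    training = (not applebot_blocked) and (not extended_blocked)
    return {"search": search, "training": training}


if __name__ == "__main__":
    for parent in (False, True):
        for child in (False, True):
            print(f"Applebot blocked={parent}, Extended blocked={child}:",
                  apple_access(parent, child))
```

Only the combination with Applebot allowed and Applebot-Extended blocked keeps search visibility while opting out of training.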
After updating robots.txt, verify it's correct before Applebot-Extended's next crawl:
Check your robots.txt is live
Visit https://yoursite.com/robots.txt in a browser. You should see the Applebot-Extended Disallow rule you added.
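Use a robots.txt testing tool
Google retired Search Console's standalone robots.txt tester in late 2023, and its replacement report does not test arbitrary user agents. Use any tester or parser that follows the Robots Exclusion Protocol (RFC 9309) instead: enter your robots.txt, test the path "/" with user agent "Applebot-Extended", and confirm the result is "Blocked".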
Check server logs
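Search your access logs for "Applebot-Extended". With a server-level block in place, any request carrying the token should show a 403 response. With robots.txt alone, a compliant crawler leaves no trace, so an absence of matches is the expected result. Apple's documentation also states that Applebot-Extended does not fetch pages itself, so you may never see the token in your logs at all.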
Use Open Shadow's bot checker
Run your domain through Open Shadow's free bot check at /check — it scans your robots.txt and reports which AI bots are allowed, disallowed, or unaddressed.
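The log check can be scripted. Here is a rough sketch that counts HTTP status codes for requests whose user agent contains a given token, assuming your server writes the common "combined" access log format. `count_statuses`, `LOG_LINE`, and the sample lines are made up for illustration; adjust the pattern if your log format differs.

```python
import re
from collections import Counter

# Combined log format: ... "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample lines; replace with lines read from your own access log.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /blog/ HTTP/1.1" 403 153 '
    '"-" "Mozilla/5.0 (Macintosh) Safari/605.1.15 (Applebot-Extended/1.0)"',
    '203.0.113.9 - - [10/Oct/2024:13:56:01 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Macintosh) Safari/605.1.15 (Applebot/0.1)"',
    "this line is not in combined log format",
]


def count_statuses(lines, token="Applebot-Extended"):
    """Count HTTP status codes for requests whose user agent contains `token`."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and token.lower() in match.group("agent").lower():
            counts[match.group("status")] += 1
    return counts


if __name__ == "__main__":
    print(count_statuses(SAMPLE_LINES))
```

To run it against a real file, pass an open file object instead of the sample list, for example `count_statuses(open(path))`, where `path` points at your access log.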
There's a legitimate tradeoff. Apple Intelligence generates answers, summaries, and suggestions from web content — blocking Applebot-Extended means your brand, products, and content are less likely to influence those outputs. For some publishers that's a feature; for others it's a cost.
Block if you…
Allow if you…