Angular's `index.html` is a real, directly editable file — unlike Vue's Vite-processed template, you can drop a `<meta name="robots" content="noai, noimageai">` tag there and it reaches every crawler, no JavaScript required. For `robots.txt`, add the file to the `assets` array in `angular.json` (Angular 16 and below) or use the `public/` folder (Angular 17+). Hard blocking by user agent requires nginx or the Angular Universal Express server.
| Method | When to use |
|---|---|
| robots.txt (angular.json assets / public/) | Always — foundation layer |
| noai in index.html (global, no JS required) | Every Angular app — fastest win |
| Angular Meta service (per-route) | Angular Universal (SSR) for reliable delivery |
| Angular Universal Express middleware | Angular Universal (SSR) apps |
| X-Robots-Tag response header | nginx / hosting platform headers config |
| nginx hard block | nginx serving Angular dist/ |
| Netlify / Vercel / Firebase Hosting | Static Angular SPA on hosting platforms |
Angular CLI does not serve files from your project root — only files explicitly listed in the assets array of angular.json are copied to the output directory. The location and config differ by Angular version.
Angular 17 introduced a public/ folder at the project root. Files here are automatically copied to dist/ — no angular.json change needed.
```
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: meta-externalagent
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: xAI-Bot
Disallow: /
User-agent: DeepSeekBot
Disallow: /
User-agent: MistralBot
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: cohere-ai
Disallow: /
User-agent: AI2Bot
Disallow: /
User-agent: Ai2Bot-Dolma
Disallow: /
User-agent: YouBot
Disallow: /
User-agent: DuckAssistBot
Disallow: /
User-agent: omgili
Disallow: /
User-agent: omgilibot
Disallow: /
User-agent: webzio-extended
Disallow: /
User-agent: gemini-deep-research
Disallow: /

User-agent: *
Allow: /
```
For Angular 16 and below, place robots.txt in `src/robots.txt` and add it to the assets array:
```json
{
  "projects": {
    "your-app": {
      "architect": {
        "build": {
          "options": {
            "assets": [
              "src/favicon.ico",
              "src/assets",
              "src/robots.txt"  // ← add this line
            ]
          }
        }
      }
    }
  }
}
```

Without the assets entry: Angular CLI will silently ignore `src/robots.txt` — it won't appear in your `dist/` folder and won't be served. Always verify by running `ng build` and checking that `dist/your-app/browser/robots.txt` exists.
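That verification step can be scripted. The sketch below is a small Node helper, not part of Angular's tooling; the `dist/your-app/browser` path assumes a project named `your-app`, so adjust it to your own project name:

```typescript
// Sketch: post-build check that robots.txt made it into the build output.
// The output path below assumes a project named "your-app" (adjust as needed).
import { existsSync } from 'node:fs';
import { join } from 'node:path';

export function robotsDeployed(distDir: string): boolean {
  // True only if robots.txt exists inside the given build output folder.
  return existsSync(join(distDir, 'robots.txt'));
}

const dist = join('dist', 'your-app', 'browser');
console.log(
  robotsDeployed(dist)
    ? 'robots.txt present in build output'
    : 'robots.txt missing: check the assets array in angular.json'
);
```

Run it after `ng build` (for example with `npx tsx check-robots.ts`, a hypothetical filename) as a cheap CI guard against the silently-ignored-file failure mode described above.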
Angular's src/index.html is your app shell — the actual HTML file served to every visitor. Add the noai meta tag here and it is present in the initial HTTP response, visible to every crawler regardless of whether they execute JavaScript.
```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Your App</title>
  <base href="/">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <!-- Block AI training on all pages — no JavaScript required -->
  <meta name="robots" content="noai, noimageai">
  <link rel="icon" type="image/x-icon" href="favicon.ico">
</head>
<body>
  <app-root></app-root>
</body>
</html>
```
Angular vs Vue SPA: Vue's index.html is typically a Vite template processed at build time. Angular's is a plain HTML file you can edit directly. This means the noai tag in index.html is the single most reliable way to signal AI training opt-out in an Angular SPA — no server configuration needed.
Angular's built-in Meta service lets you set different robots tags per route. In a standard SPA this is JavaScript-only (meta tags are injected after Angular bootstraps). In Angular Universal (SSR), meta tags are server-rendered and present in the initial HTML.
```typescript
import { Component } from '@angular/core';
import { Meta } from '@angular/platform-browser';

@Component({
  selector: 'app-root',
  templateUrl: './app.component.html',
})
export class AppComponent {
  constructor(private meta: Meta) {
    // Global noai — applies to every route unless overridden
    this.meta.addTag({ name: 'robots', content: 'noai, noimageai' });
  }
}
```

```typescript
import { Component, OnDestroy, OnInit } from '@angular/core';
import { Meta } from '@angular/platform-browser';

@Component({
  selector: 'app-blog-post',
  templateUrl: './blog-post.component.html',
})
export class BlogPostComponent implements OnInit, OnDestroy {
  constructor(private meta: Meta) {}

  ngOnInit(): void {
    // This page: opt out of image training only
    this.meta.updateTag({ name: 'robots', content: 'noimageai' });
  }

  ngOnDestroy(): void {
    // Restore the default when navigating away
    this.meta.updateTag({ name: 'robots', content: 'noai, noimageai' });
  }
}
```

SPA caveat: In a standard Angular SPA, the Meta service injects tags client-side. AI training crawlers like GPTBot and CCBot typically do not run JavaScript — they never see these tags. For reliable delivery to all bots, combine with the index.html approach above. The Meta service is most useful in Angular Universal (SSR) where tags are rendered server-side.
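If several routes need different directives, the per-route values can be centralized in one lookup instead of scattering `updateTag` calls. This is a sketch, not the app's actual code: the route paths and the `ROUTE_ROBOTS` map are hypothetical, and each component would call the helper from `ngOnInit` via the Meta service.

```typescript
// Sketch: centralize per-route robots directives in a single map.
// Route paths and values here are illustrative, not from a real app.
const DEFAULT_ROBOTS = 'noai, noimageai';

const ROUTE_ROBOTS: Record<string, string> = {
  '/blog': 'noimageai',       // hypothetical: block image training only
  '/press': 'index, follow',  // hypothetical: fully open press pages
};

export function robotsFor(path: string): string {
  // Fall back to the strict default for any unlisted route.
  return ROUTE_ROBOTS[path] ?? DEFAULT_ROBOTS;
}

// In a component, roughly:
//   this.meta.updateTag({ name: 'robots', content: robotsFor(this.router.url) });
console.log(robotsFor('/blog'));     // noimageai
console.log(robotsFor('/pricing'));  // noai, noimageai
```

Keeping the policy in one place makes it auditable and avoids a forgotten `ngOnDestroy` leaving a permissive tag in place after navigation.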
Angular Universal renders pages server-side using an Express server. Add middleware in server.ts before the Angular rendering handler to check the User-Agent header and return a 403 before Angular renders anything.
```typescript
import { APP_BASE_HREF } from '@angular/common';
import { CommonEngine } from '@angular/ssr';
import express from 'express';
import { fileURLToPath } from 'node:url';
import { dirname, join, resolve } from 'node:path';
import bootstrap from './src/main.server';

const AI_BOT_PATTERN = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|YouBot|DuckAssistBot|omgili|webzio-extended|gemini-deep-research/i;

export function app(): express.Express {
  const server = express();
  const serverDistFolder = dirname(fileURLToPath(import.meta.url));
  const browserDistFolder = resolve(serverDistFolder, '../browser');
  const indexHtml = join(serverDistFolder, 'index.server.html');
  const commonEngine = new CommonEngine();

  // Serve static files (robots.txt is served here — before the bot blocker)
  server.get('*.*', express.static(browserDistFolder, { maxAge: '1y' }));

  // Block AI bots before Angular renders any page
  server.use((req, res, next) => {
    const ua = req.headers['user-agent'] ?? '';
    if (AI_BOT_PATTERN.test(ua)) {
      res.status(403).type('text').send('Forbidden');
      return;
    }
    next();
  });

  // Angular SSR rendering — only runs for allowed requests
  server.get('**', (req, res, next) => {
    const { protocol, originalUrl, baseUrl, headers } = req;
    commonEngine
      .render({
        bootstrap,
        documentFilePath: indexHtml,
        url: `${protocol}://${headers.host}${originalUrl}`,
        publicPath: browserDistFolder,
        providers: [{ provide: APP_BASE_HREF, useValue: baseUrl }],
      })
      .then((html) => res.send(html))
      .catch((err) => next(err));
  });

  return server;
}

function run(): void {
  const port = process.env['PORT'] || 4000;
  const server = app();
  server.listen(port, () => {
    console.log(`Node Express server listening on http://localhost:${port}`);
  });
}

run();
```

Middleware order: The bot-blocking middleware is placed after `express.static()` so that robots.txt and other static files are still served to blocked bots. This lets them read your opt-out signal before the block kicks in for rendered pages.
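The user-agent pattern is easy to sanity-check in isolation before deploying. This sketch lifts the same regex out of `server.ts` into a standalone function; the sample UA strings are illustrative, not real crawler headers:

```typescript
// Sketch: the middleware's matching logic, extracted for standalone testing.
const AI_BOT_PATTERN = /GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|YouBot|DuckAssistBot|omgili|webzio-extended|gemini-deep-research/i;

export function isAiBot(userAgent: string): boolean {
  // Case-insensitive substring match, same as the Express middleware above.
  return AI_BOT_PATTERN.test(userAgent);
}

// Illustrative user-agent strings:
console.log(isAiBot('Mozilla/5.0 (compatible; GPTBot/1.0)'));        // true
console.log(isAiBot('Mozilla/5.0 (Windows NT 10.0) Chrome/120.0'));  // false
```

A quick check like this catches typos in the alternation (a stray space or a missing `|`) that would otherwise silently let a bot through.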
Add `X-Robots-Tag: noai, noimageai` to all HTTP responses. Unlike the `<meta>` tag, this header is delivered with every asset — including JavaScript bundles, CSS, and images — and doesn't depend on the HTML being parsed.
```nginx
server {
  listen 80;
  server_name yourapp.com;
  root /var/www/yourapp/dist/your-app/browser;
  index index.html;

  # Add X-Robots-Tag to all responses
  add_header X-Robots-Tag "noai, noimageai" always;

  location / {
    try_files $uri $uri/ /index.html;
  }
}
```

On Firebase Hosting, the same header goes in `firebase.json`:

```json
{
  "hosting": {
    "public": "dist/your-app/browser",
    "ignore": ["firebase.json", "**/.*", "**/node_modules/**"],
    "rewrites": [
      { "source": "**", "destination": "/index.html" }
    ],
    "headers": [
      {
        "source": "**",
        "headers": [
          {
            "key": "X-Robots-Tag",
            "value": "noai, noimageai"
          }
        ]
      }
    ]
  }
}
```

nginx serves Angular's static `dist/` folder directly. Use a `map` block to flag AI bot user agents and return 403 before any file is served.
```nginx
map $http_user_agent $block_ai_bot {
  default 0;
  ~*GPTBot 1;
  ~*ChatGPT-User 1;
  ~*OAI-SearchBot 1;
  ~*ClaudeBot 1;
  ~*anthropic-ai 1;
  ~*Google-Extended 1;
  ~*Bytespider 1;
  ~*CCBot 1;
  ~*PerplexityBot 1;
  ~*meta-externalagent 1;
  ~*Amazonbot 1;
  ~*Applebot-Extended 1;
  ~*xAI-Bot 1;
  ~*DeepSeekBot 1;
  ~*MistralBot 1;
  ~*Diffbot 1;
  ~*cohere-ai 1;
  ~*AI2Bot 1;
  ~*YouBot 1;
  ~*DuckAssistBot 1;
  ~*omgili 1;
  ~*webzio-extended 1;
  ~*gemini-deep-research 1;
}

server {
  listen 443 ssl;
  server_name yourapp.com;
  root /var/www/yourapp/dist/your-app/browser;
  index index.html;

  add_header X-Robots-Tag "noai, noimageai" always;

  # Always serve robots.txt so bots can read your opt-out
  location = /robots.txt {
    try_files $uri =404;
  }

  location / {
    if ($block_ai_bot) {
      return 403 "Forbidden";
    }
    try_files $uri $uri/ /index.html;
  }
}
```

Static hosting platforms don't support user-agent-conditional responses natively — they serve fixed headers to all visitors. Use their headers config for X-Robots-Tag and put Cloudflare WAF in front for hard bot blocking.
Add a _headers file to your public/ or src/ folder (whichever is in your assets config), or use netlify.toml:
```toml
[build]
  publish = "dist/your-app/browser"
  command = "ng build --configuration=production"

[[headers]]
  for = "/*"
  [headers.values]
    X-Robots-Tag = "noai, noimageai"
    X-Frame-Options = "DENY"
    X-Content-Type-Options = "nosniff"
```

Or the equivalent `_headers` file:

```
/*
  X-Robots-Tag: noai, noimageai
```

For Vercel, configure the header in `vercel.json`:

```json
{
  "buildCommand": "ng build --configuration=production",
  "outputDirectory": "dist/your-app/browser",
  "rewrites": [
    { "source": "/((?!.*\\.).*)", "destination": "/index.html" }
  ],
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        {
          "key": "X-Robots-Tag",
          "value": "noai, noimageai"
        }
      ]
    }
  ]
}
```
Hard blocking on static hosts: Netlify, Vercel, and Firebase Hosting don't support user-agent-based 403 responses in their config files. To hard-block specific AI bots by user agent, put Cloudflare in front of your deployment and create a WAF rule: User Agent contains GPTBot → Block. This works regardless of your hosting provider.
| Method |
|---|
| robots.txt |
| index.html noai meta |
| Meta service tags |
| Express middleware hard block |
| X-Robots-Tag header |
| nginx hard block |
**Where does robots.txt go in an Angular app?**

Angular 17+: put it in the `public/` folder at the project root — no config needed. Angular 16 and below: put it in `src/robots.txt` and add `"src/robots.txt"` to the `assets` array in `angular.json` under build options. Without the assets entry, Angular CLI ignores the file and it won't be in your `dist/` output.
**How do I add a noai meta tag to an Angular app?**

Open `src/index.html` and add `<meta name="robots" content="noai, noimageai">` inside `<head>`. This file is your app shell — it's served in the initial HTTP response, before any JavaScript runs. Every crawler sees it. Takes 30 seconds.
**Do Meta service tags reach AI crawlers?**

Only in Angular Universal (SSR). In a standard SPA, the Meta service injects tags client-side after Angular bootstraps — AI training crawlers that don't execute JavaScript never see them. In Universal, meta tags are rendered server-side and present in the initial HTML, so they're visible to all bots.
**How do I block AI bots on Firebase Hosting?**

Add a `headers` array to `firebase.json` with `X-Robots-Tag: noai, noimageai` for the `"**"` source pattern. For hard bot blocking by user agent, Firebase Hosting doesn't support conditional responses — use Cloudflare WAF in front of Firebase or switch to Angular Universal with Express middleware.
**Can Angular route guards block AI bots?**

No. Angular route guards run in the browser after the JS bundle loads. AI training crawlers that don't execute JavaScript never reach the router. Client-side guards, `canActivate`, and resolvers are ineffective against non-JS crawlers. Use server-level blocking (Universal Express middleware, nginx, or Cloudflare WAF).