ASP.NET Core's UseStaticFiles() middleware serves robots.txt from wwwroot/ automatically — no controller needed. For hard blocking, implement a custom IMiddleware class registered before UseRouting() in Program.cs. For IIS and Azure App Service, use web.config URL rewrite rules to block at the IIS layer before Kestrel.
| Method | When to use |
|---|---|
| robots.txt in wwwroot/ | Always — UseStaticFiles() serves it automatically |
| Minimal API endpoint (dynamic rules) | Need staging vs production rules |
| IMiddleware hard block (403) | Kestrel-hosted apps |
| X-Robots-Tag response header | Complement to robots.txt |
| noai meta tag in _Layout.cshtml | Razor Pages or MVC views |
| web.config IIS rewrite rule | IIS / Azure App Service hosting |
| nginx reverse proxy block | nginx in front of Kestrel (Linux) |
| Bot |
|---|
| GPTBot |
| ChatGPT-User |
| OAI-SearchBot |
| ClaudeBot |
| anthropic-ai |
| Google-Extended |
| Bytespider |
| CCBot |
| PerplexityBot |
| meta-externalagent |
| Amazonbot |
| Applebot-Extended |
| xAI-Bot |
| DeepSeekBot |
| MistralBot |
| Diffbot |
| cohere-ai |
| AI2Bot |
| Ai2Bot-Dolma |
| YouBot |
| DuckAssistBot |
| omgili |
| omgilibot |
| webzio-extended |
| gemini-deep-research |
Every ASP.NET Core template calls app.UseStaticFiles() in Program.cs — this middleware serves all files in wwwroot/ at the root URL. No controller, no endpoint, no configuration needed. Just create the file.
```
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /

User-agent: *
Allow: /
```
Build action: Set wwwroot/robots.txt to Content with Copy to Output Directory: Copy if newer in Visual Studio, or add <Content Include="wwwroot/robots.txt" CopyToOutputDirectory="PreserveNewest" /> in your .csproj if it's not being copied on publish.
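For reference, here is where that entry sits in the project file; a minimal sketch (the project file name and target framework are assumptions — only the ItemGroup entry matters, and it is usually unnecessary because the Web SDK includes wwwroot/ content by default):

```xml
<!-- YourApp.csproj -->
<Project Sdk="Microsoft.NET.Sdk.Web">
  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <!-- Only needed if a custom glob excluded the file from publish -->
    <Content Include="wwwroot/robots.txt" CopyToOutputDirectory="PreserveNewest" />
  </ItemGroup>
</Project>
```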
Use a Minimal API endpoint when you need different rules per environment — block all crawlers in staging, block only AI bots in production. Register the endpoint before app.Run() in Program.cs.
```csharp
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddRazorPages(); // or AddControllersWithViews()

var app = builder.Build();

app.UseStaticFiles();

// Dynamic robots.txt — environment-aware
app.MapGet("/robots.txt", (IHostEnvironment env) =>
{
    const string aiBotsDisallow = @"User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: *
Allow: /";

    const string blockAll = @"User-agent: *
Disallow: /";

    var body = env.IsProduction() ? aiBotsDisallow : blockAll;
    return Results.Text(body, "text/plain");
});

app.UseRouting();
app.MapRazorPages();
app.Run();
```

Static vs endpoint: If both wwwroot/robots.txt and a MapGet("/robots.txt", ...) endpoint exist, the static file middleware runs first and serves the static file — the endpoint is never reached. Remove one or the other.
Custom middleware intercepts every request before routing. Return a 403 immediately when a known AI bot user agent is detected. Register it after UseStaticFiles() so bots can still fetch robots.txt, but before UseRouting() so no controller ever runs.
```csharp
using System.Text.RegularExpressions;

namespace YourApp.Middleware;

public class AiBotBlockingMiddleware : IMiddleware
{
    private static readonly Regex BlockedUAs = new(
        @"GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|" +
        @"Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|" +
        @"Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|" +
        @"cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|" +
        @"webzio-extended|gemini-deep-research",
        RegexOptions.IgnoreCase | RegexOptions.Compiled
    );

    public async Task InvokeAsync(HttpContext context, RequestDelegate next)
    {
        // Always allow robots.txt through so bots can read your opt-out
        if (context.Request.Path.StartsWithSegments("/robots.txt"))
        {
            await next(context);
            return;
        }

        var userAgent = context.Request.Headers.UserAgent.ToString();
        if (!string.IsNullOrEmpty(userAgent) && BlockedUAs.IsMatch(userAgent))
        {
            context.Response.StatusCode = StatusCodes.Status403Forbidden;
            context.Response.ContentType = "text/plain";
            await context.Response.WriteAsync("Forbidden");
            return;
        }

        await next(context);
    }
}
```

Register the middleware and order the pipeline in Program.cs:

```csharp
using YourApp.Middleware;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddRazorPages();

// Register as transient so IMiddleware DI injection works
builder.Services.AddTransient<AiBotBlockingMiddleware>();

var app = builder.Build();

// Pipeline order matters:
app.UseStaticFiles();                         // 1. Serve wwwroot/ (robots.txt gets through)
app.UseMiddleware<AiBotBlockingMiddleware>(); // 2. Block AI bots
app.UseRouting();                             // 3. Route the request
app.UseAuthentication();
app.UseAuthorization();
app.MapRazorPages();                          // 4. Controllers / Razor Pages
app.Run();
```
Pipeline order is critical. If you call UseMiddleware<AiBotBlockingMiddleware>() after UseRouting(), routing has already matched the request to an endpoint — you pay for endpoint selection on every request you are about to reject. Always place bot-blocking middleware before UseRouting().
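The user-agent pattern can also be sanity-checked in isolation, without spinning up the app. A minimal console sketch — the sample UA strings are illustrative, and only a subset of the full pattern is shown:

```csharp
using System;
using System.Text.RegularExpressions;

// Subset of the BlockedUAs pattern used in AiBotBlockingMiddleware
var blockedUAs = new Regex(
    @"GPTBot|ChatGPT-User|ClaudeBot|anthropic-ai|Google-Extended|" +
    @"Bytespider|CCBot|PerplexityBot|Diffbot",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);

// Real crawler UAs embed the product token somewhere in a longer string,
// so an unanchored substring match is enough
string bot = "Mozilla/5.0 AppleWebKit/537.36; compatible; GPTBot/1.2; +https://openai.com/gptbot";
string browser = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/120.0 Safari/537.36";

Console.WriteLine(blockedUAs.IsMatch(bot));     // True
Console.WriteLine(blockedUAs.IsMatch(browser)); // False
```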
For quick prototyping, use an inline delegate instead of a class. Less testable, same effect:
```csharp
using System.Text.RegularExpressions;

// After app.UseStaticFiles(), before app.UseRouting()
app.Use(async (context, next) =>
{
    if (!context.Request.Path.StartsWithSegments("/robots.txt"))
    {
        var ua = context.Request.Headers.UserAgent.ToString();
        if (Regex.IsMatch(ua,
            @"GPTBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|" +
            @"PerplexityBot|meta-externalagent|Diffbot|xAI-Bot|DeepSeekBot",
            RegexOptions.IgnoreCase))
        {
            context.Response.StatusCode = 403;
            await context.Response.WriteAsync("Forbidden");
            return;
        }
    }
    await next(context);
});
```

The X-Robots-Tag: noai, noimageai header tells compliant crawlers not to use your content for AI training. Add it globally via middleware, or per-endpoint via a filter.
```csharp
// Add after UseStaticFiles, before UseRouting
app.Use(async (context, next) =>
{
    context.Response.OnStarting(() =>
    {
        // Only add to HTML responses
        if (context.Response.ContentType?.Contains("text/html") == true)
        {
            context.Response.Headers["X-Robots-Tag"] = "noai, noimageai";
        }
        return Task.CompletedTask;
    });
    await next(context);
});
```

For MVC, the same header can be set per-endpoint with an action filter:

```csharp
using Microsoft.AspNetCore.Mvc.Filters;

public class NoAiHeaderFilter : IActionFilter
{
    public void OnActionExecuting(ActionExecutingContext context) { }

    public void OnActionExecuted(ActionExecutedContext context)
    {
        context.HttpContext.Response.Headers["X-Robots-Tag"] = "noai, noimageai";
    }
}

// Apply to a controller — ServiceFilter resolves the filter from DI,
// so also register it: builder.Services.AddScoped<NoAiHeaderFilter>();
[ServiceFilter(typeof(NoAiHeaderFilter))]
public class HomeController : Controller { ... }

// Or register globally in Program.cs:
builder.Services.AddControllersWithViews(options =>
    options.Filters.Add<NoAiHeaderFilter>()
);
```

Add the noai and noimageai meta tags to your shared layout. Use an @section Head override for pages where you want different behaviour (e.g., public blog posts with different rules than private admin pages).
```cshtml
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>@ViewData["Title"]</title>
    @* Default: block AI training on all pages *@
    @if (IsSectionDefined("Head"))
    {
        @RenderSection("Head", required: false)
    }
    else
    {
        <meta name="robots" content="noai, noimageai" />
    }
    <link rel="stylesheet" href="~/css/site.css" asp-append-version="true" />
</head>
<body>
    @RenderBody()
</body>
</html>
```

An individual view can then override the default via @section Head:

```cshtml
@{
    ViewData["Title"] = "Home";
}

@section Head {
    @* This page allows AI indexing but not training *@
    <meta name="robots" content="noimageai" />
}

<h1>Welcome</h1>
```

Same pattern works for Razor Pages. The layout file lives at Pages/Shared/_Layout.cshtml. Override per-page via @section Head {} in any .cshtml page file.
```cshtml
@page
@model IndexModel

@section Head {
    <meta name="robots" content="index, follow" />
    @* No noai — this page allows AI discovery *@
}

<h1>Home</h1>
```

When hosting on IIS or Azure App Service, use a web.config URL rewrite rule to block at the IIS layer — before the request ever reaches Kestrel. The IIS URL Rewrite Module must be installed (it is available by default on Azure App Service).
```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Block AI bots — return 403 before Kestrel -->
        <rule name="BlockAiBots" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <!-- Skip robots.txt — let bots read your opt-out -->
            <add input="{REQUEST_URI}" pattern="^/robots\.txt$" negate="true" />
            <add input="{HTTP_USER_AGENT}" pattern="GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|YouBot|DuckAssistBot|omgili|webzio-extended|gemini-deep-research" />
          </conditions>
          <action type="CustomResponse" statusCode="403" statusReason="Forbidden" statusDescription="Forbidden" />
        </rule>
      </rules>
    </rewrite>
    <!-- Required for ASP.NET Core on IIS -->
    <handlers>
      <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModuleV2" resourceType="Unspecified" />
    </handlers>
    <aspNetCore processPath="dotnet" arguments=".\YourApp.dll"
                stdoutLogEnabled="false" stdoutLogFile=".\logs\stdout"
                hostingModel="inprocess" />
  </system.webServer>
</configuration>
```

Azure App Service: the platform runs IIS/ARR in front of your app, so the web.config rewrite rule blocks at that layer. Alternatively, deploy Azure Front Door (WAF rules) or put Cloudflare in front of App Service for managed, no-code bot blocking.
On Linux (self-hosted or containers), nginx typically sits in front of Kestrel as a reverse proxy. Add user-agent blocking in the nginx config — the request never reaches your .NET process.
```nginx
# Map user-agent to a block flag
map $http_user_agent $block_ai_bot {
    default 0;
    ~*GPTBot 1;
    ~*ChatGPT-User 1;
    ~*OAI-SearchBot 1;
    ~*ClaudeBot 1;
    ~*anthropic-ai 1;
    ~*Google-Extended 1;
    ~*Bytespider 1;
    ~*CCBot 1;
    ~*PerplexityBot 1;
    ~*meta-externalagent 1;
    ~*Amazonbot 1;
    ~*Applebot-Extended 1;
    ~*xAI-Bot 1;
    ~*DeepSeekBot 1;
    ~*MistralBot 1;
    ~*Diffbot 1;
    ~*cohere-ai 1;
    ~*AI2Bot 1;
    ~*YouBot 1;
    ~*DuckAssistBot 1;
    ~*omgili 1;
    ~*webzio-extended 1;
    ~*gemini-deep-research 1;
}

server {
    listen 443 ssl;
    server_name yourapp.com;
    ssl_certificate /etc/letsencrypt/live/yourapp.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yourapp.com/privkey.pem;

    # Allow robots.txt even for blocked bots
    location = /robots.txt {
        proxy_pass http://127.0.0.1:5000;
    }

    location / {
        # Block matched AI bots
        if ($block_ai_bot) {
            return 403 "Forbidden";
        }
        proxy_pass http://127.0.0.1:5000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection keep-alive;
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

Kestrel port: ASP.NET Core apps listen on port 5000 (HTTP) and 5001 (HTTPS) by default. Set the port via ASPNETCORE_URLS or in appsettings.json under Kestrel:Endpoints.
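A minimal sketch of the appsettings.json route — the endpoint name is arbitrary, and the port is chosen here to match the nginx proxy_pass target:

```json
{
  "Kestrel": {
    "Endpoints": {
      "Http": {
        "Url": "http://127.0.0.1:5000"
      }
    }
  }
}
```

Binding to 127.0.0.1 rather than 0.0.0.0 keeps Kestrel reachable only through the proxy, so crawlers cannot bypass the nginx block by hitting the port directly.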
| Hosting | Recommended blocking layer |
|---|---|
| IIS (Windows) | web.config rewrite rule |
| Azure App Service | web.config rewrite rule (or Azure Front Door / Cloudflare) |
| Linux + nginx | nginx map + 403 in the proxy |
| Linux / self-hosted (no nginx) | IMiddleware hard block in the app |
| Docker / Kubernetes | IMiddleware hard block (or block at the ingress) |
| Azure Container Apps | IMiddleware hard block (or Azure Front Door) |
In wwwroot/. The UseStaticFiles() middleware (called in every default project template) automatically serves everything in wwwroot/ at the root URL — robots.txt in wwwroot/ is available at /robots.txt with zero additional config.
UseStaticFiles() → UseMiddleware<AiBotBlockingMiddleware>() → UseRouting() → UseAuthentication() → UseAuthorization() → MapControllers(). Static files run first (so robots.txt is served to bots), then your blocker runs, then routing. Never put the blocker after UseRouting() — routing will have already matched the request.
Add a web.config with a <rewrite> rule under <system.webServer> matching AI bot user agents and returning a 403. Azure App Service has the IIS URL Rewrite Module pre-installed. Alternatively, put Azure Front Door (WAF policy) or Cloudflare in front of your App Service.
Use both for defence in depth. robots.txt is a polite signal — compliant bots respect it, but bad actors ignore it. The IMiddleware hard block returns 403 regardless of whether the bot respects robots.txt. Together they cover compliant bots and ignore-the-rules scrapers.
Yes, with caveats. Blazor Server runs on ASP.NET Core — all methods apply. For Blazor WebAssembly, the app is pure client-side; host-level methods (wwwroot/robots.txt, nginx, web.config) work as normal. IMiddleware blocks the initial HTML request but not subsequent SignalR/WebSocket frames for Blazor Server.