
How to Block AI Bots on ASP.NET Core: Complete 2026 Guide

ASP.NET Core's UseStaticFiles() middleware serves robots.txt from wwwroot/ automatically — no controller needed. For hard blocking, implement a custom IMiddleware class registered before UseRouting() in Program.cs. For IIS and Azure App Service, use web.config URL rewrite rules to block at the IIS layer before Kestrel.

9 min read·Updated April 2026·ASP.NET Core 8+ / .NET 8+

Methods overview

Method: when to use it

robots.txt in wwwroot/: always; UseStaticFiles() serves it automatically
Minimal API endpoint (dynamic rules): when staging and production need different rules
IMiddleware hard block (403): Kestrel-hosted apps
X-Robots-Tag response header: complement to robots.txt
noai meta tag in _Layout.cshtml: Razor Pages or MVC views
web.config IIS rewrite rule: IIS / Azure App Service hosting
nginx reverse proxy block: nginx in front of Kestrel (Linux)

AI bots to block

Bot
GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
anthropic-ai
Google-Extended
Bytespider
CCBot
PerplexityBot
meta-externalagent
Amazonbot
Applebot-Extended
xAI-Bot
DeepSeekBot
MistralBot
Diffbot
cohere-ai
AI2Bot
Ai2Bot-Dolma
YouBot
DuckAssistBot
omgili
omgilibot
webzio-extended
gemini-deep-research
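Writing a Disallow group by hand for each of the 25 user agents above is tedious and error-prone. One way to generate the file is a short shell loop; this is a sketch, with the bots variable holding only a sample of the list and /tmp/robots-generated.txt standing in for wwwroot/robots.txt:

```shell
# Generate robots.txt Disallow groups from a bot list, then append
# the catch-all group that allows everyone else.
bots="GPTBot ChatGPT-User ClaudeBot CCBot PerplexityBot Bytespider"

{
  for bot in $bots; do
    printf 'User-agent: %s\nDisallow: /\n\n' "$bot"
  done
  printf 'User-agent: *\nAllow: /\n'
} > /tmp/robots-generated.txt

# 6 bot groups + 1 catch-all group
grep -c '^User-agent:' /tmp/robots-generated.txt
```

Extend the bots variable to the full list above and redirect the output into wwwroot/robots.txt in a real project.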

1. robots.txt in wwwroot/

Every ASP.NET Core template calls app.UseStaticFiles() in Program.cs — this middleware serves all files in wwwroot/ at the root URL. No controller, no endpoint, no configuration needed. Just create the file.

wwwroot/robots.txt (served at /robots.txt automatically)
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: Ai2Bot-Dolma
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: webzio-extended
Disallow: /

User-agent: gemini-deep-research
Disallow: /

User-agent: *
Allow: /

Build action: Files under wwwroot/ are published by default in SDK-style web projects. If robots.txt is not being copied on publish, set it to Content with Copy to Output Directory: Copy if newer in Visual Studio, or add <Content Include="wwwroot/robots.txt" CopyToOutputDirectory="PreserveNewest" /> to your .csproj.
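Before relying on the file, a quick structural sanity check helps: every User-agent line should be paired with at least one Allow or Disallow rule in its group. A minimal sketch, assuming a POSIX shell; the sample file here is a trimmed stand-in for the full one above, which also uses exactly one rule per group:

```shell
# Write a trimmed sample robots.txt, then check that the number of
# User-agent lines matches the number of Allow/Disallow rules.
cat > /tmp/robots.txt <<'EOF'
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
EOF

agents=$(grep -c '^User-agent:' /tmp/robots.txt)
rules=$(grep -cE '^(Disallow|Allow):' /tmp/robots.txt)
echo "agents=$agents rules=$rules"
```

If the two counts drift apart after an edit, some group is missing its rule line.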

2. Minimal API endpoint (dynamic rules)

Use a Minimal API endpoint when you need different rules per environment — block all crawlers in staging, block only AI bots in production. Register the endpoint before app.Run() in Program.cs.

Program.cs (Minimal API)
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddRazorPages(); // or AddControllersWithViews()

var app = builder.Build();

app.UseStaticFiles();

// Dynamic robots.txt — environment-aware
app.MapGet("/robots.txt", (IHostEnvironment env) =>
{
    const string aiBotsDisallow = @"User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: *
Allow: /";

    const string blockAll = @"User-agent: *
Disallow: /";

    var body = env.IsProduction() ? aiBotsDisallow : blockAll;
    return Results.Text(body, "text/plain");
});

app.UseRouting();
app.MapRazorPages();

app.Run();

Static vs endpoint: If both wwwroot/robots.txt and a MapGet("/robots.txt", ...) endpoint exist, the static file middleware runs first and serves the static file — the endpoint is never reached. Remove one or the other.

3. IMiddleware hard block (403)

Custom middleware intercepts every request before routing. Return a 403 immediately when a known AI bot user agent is detected. Register it after UseStaticFiles() so bots can still fetch robots.txt, but before UseRouting() so no controller ever runs.

Step 1 — write the middleware class

Middleware/AiBotBlockingMiddleware.cs
using System.Text.RegularExpressions;

namespace YourApp.Middleware;

public class AiBotBlockingMiddleware : IMiddleware
{
    private static readonly Regex BlockedUAs = new(
        @"GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|" +
        @"Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|" +
        @"Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|" +
        @"cohere-ai|AI2Bot|Ai2Bot-Dolma|YouBot|DuckAssistBot|omgili|omgilibot|" +
        @"webzio-extended|gemini-deep-research",
        RegexOptions.IgnoreCase | RegexOptions.Compiled
    );

    public async Task InvokeAsync(HttpContext context, RequestDelegate next)
    {
        // Always allow robots.txt through so bots can read your opt-out
        if (context.Request.Path.StartsWithSegments("/robots.txt"))
        {
            await next(context);
            return;
        }

        var userAgent = context.Request.Headers.UserAgent.ToString();
        if (!string.IsNullOrEmpty(userAgent) && BlockedUAs.IsMatch(userAgent))
        {
            context.Response.StatusCode = StatusCodes.Status403Forbidden;
            context.Response.ContentType = "text/plain";
            await context.Response.WriteAsync("Forbidden");
            return;
        }

        await next(context);
    }
}

Step 2 — register in Program.cs

Program.cs
using YourApp.Middleware;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddRazorPages();

// Register as transient so IMiddleware DI injection works
builder.Services.AddTransient<AiBotBlockingMiddleware>();

var app = builder.Build();

// Pipeline order matters:
app.UseStaticFiles();                              // 1. Serve wwwroot/ (robots.txt gets through)
app.UseMiddleware<AiBotBlockingMiddleware>();      // 2. Block AI bots
app.UseRouting();                                  // 3. Route the request
app.UseAuthentication();
app.UseAuthorization();
app.MapRazorPages();                               // 4. Controllers / Razor Pages

app.Run();

Pipeline order is critical. If you call UseMiddleware<AiBotBlockingMiddleware>() after UseRouting(), the request has already been matched to an endpoint and some work has been done. Always place bot-blocking middleware before UseRouting().
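You can check which user-agent strings the alternation above will catch without spinning up the app, by testing sample strings against the same pattern with grep. A sketch: the pattern is abbreviated from the middleware's full list, and grep -i mirrors RegexOptions.IgnoreCase:

```shell
# Abbreviated version of the middleware's alternation; -i mirrors IgnoreCase.
pattern='GPTBot|ChatGPT-User|ClaudeBot|Google-Extended|CCBot|PerplexityBot'

check() {
  if printf '%s' "$1" | grep -qiE "$pattern"; then
    echo "blocked: $1"
  else
    echo "allowed: $1"
  fi
}

check "Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"   # blocked
check "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"              # allowed
```

Note the second check: ordinary Googlebot stays allowed, because the pattern names Google-Extended (the AI-training agent), not Googlebot.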

Inline middleware (no class)

For quick prototyping, use an inline delegate instead of a class. Less testable, same effect:

Program.cs (inline)
using System.Text.RegularExpressions;

// After app.UseStaticFiles(), before app.UseRouting()
app.Use(async (context, next) =>
{
    if (!context.Request.Path.StartsWithSegments("/robots.txt"))
    {
        var ua = context.Request.Headers.UserAgent.ToString();
        if (Regex.IsMatch(ua,
            @"GPTBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|" +
            @"PerplexityBot|meta-externalagent|Diffbot|xAI-Bot|DeepSeekBot",
            RegexOptions.IgnoreCase))
        {
            context.Response.StatusCode = 403;
            await context.Response.WriteAsync("Forbidden");
            return;
        }
    }
    await next(context);
});

4. X-Robots-Tag response header

The X-Robots-Tag: noai, noimageai header asks crawlers not to use your content for AI training. Note that noai and noimageai are informal directives rather than part of the Robots Exclusion standard, so support varies by crawler; treat the header as an additional signal, not a guarantee. Add it globally via middleware, or per-endpoint via a filter.

Global — via middleware

Program.cs
// Add after UseStaticFiles, before UseRouting
app.Use(async (context, next) =>
{
    context.Response.OnStarting(() =>
    {
        // Only add to HTML responses
        if (context.Response.ContentType?.Contains("text/html") == true)
        {
            context.Response.Headers["X-Robots-Tag"] = "noai, noimageai";
        }
        return Task.CompletedTask;
    });
    await next(context);
});

Per-controller — via ActionFilter (MVC)

Filters/NoAiHeaderFilter.cs
using Microsoft.AspNetCore.Mvc.Filters;

namespace YourApp.Filters;

public class NoAiHeaderFilter : IActionFilter
{
    public void OnActionExecuting(ActionExecutingContext context) { }

    public void OnActionExecuted(ActionExecutedContext context)
    {
        context.HttpContext.Response.Headers["X-Robots-Tag"] = "noai, noimageai";
    }
}

// Apply to a controller. ServiceFilter resolves the filter from DI,
// so also register it: builder.Services.AddScoped<NoAiHeaderFilter>();
[ServiceFilter(typeof(NoAiHeaderFilter))]
public class HomeController : Controller { ... }

// Or register globally in Program.cs (no per-controller attribute needed):
builder.Services.AddControllersWithViews(options =>
    options.Filters.Add<NoAiHeaderFilter>()
);

5. noai meta tag in _Layout.cshtml

Add the noai and noimageai meta tags to your shared layout. Use an @section Head override for pages where you want different behaviour (e.g., public blog posts with different rules than private admin pages).

_Layout.cshtml — global noai in <head>

Views/Shared/_Layout.cshtml
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>@ViewData["Title"]</title>

    @* Default: block AI training on all pages *@
    @if (IsSectionDefined("Head"))
    {
        @RenderSection("Head", required: false)
    }
    else
    {
        <meta name="robots" content="noai, noimageai" />
    }

    <link rel="stylesheet" href="~/css/site.css" asp-append-version="true" />
</head>
<body>
    @RenderBody()
</body>
</html>

Per-page override — allow indexing on specific pages

Views/Home/Index.cshtml
@{
    ViewData["Title"] = "Home";
}

@section Head {
    @* This page allows AI indexing but not training *@
    <meta name="robots" content="noimageai" />
}

<h1>Welcome</h1>

Razor Pages — Pages/Shared/_Layout.cshtml

Same pattern works for Razor Pages. The layout file lives at Pages/Shared/_Layout.cshtml. Override per-page via @section Head {} in any .cshtml page file.

Pages/Index.cshtml (Razor Pages)
@page
@model IndexModel

@section Head {
    <meta name="robots" content="index, follow" />
    @* No noai — this page allows AI discovery *@
}

<h1>Home</h1>

6. web.config IIS rewrite rule

When hosting on IIS or Azure App Service, use a web.config URL rewrite rule to block at the IIS layer — before the request ever reaches Kestrel. The IIS URL Rewrite Module must be installed (it is available by default on Azure App Service).

web.config
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>

    <rewrite>
      <rules>
        <!-- Block AI bots — return 403 before Kestrel -->
        <rule name="BlockAiBots" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <!-- Skip robots.txt — let bots read your opt-out -->
            <add input="{REQUEST_URI}" pattern="^/robots\.txt$" negate="true" />
            <add input="{HTTP_USER_AGENT}" pattern="GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|anthropic-ai|Google-Extended|Bytespider|CCBot|PerplexityBot|meta-externalagent|Amazonbot|Applebot-Extended|xAI-Bot|DeepSeekBot|MistralBot|Diffbot|cohere-ai|AI2Bot|YouBot|DuckAssistBot|omgili|webzio-extended|gemini-deep-research" />
          </conditions>
          <action type="CustomResponse" statusCode="403" statusReason="Forbidden" statusDescription="Forbidden" />
        </rule>
      </rules>
    </rewrite>

    <!-- Required for ASP.NET Core on IIS -->
    <handlers>
      <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModuleV2" resourceType="Unspecified" />
    </handlers>
    <aspNetCore processPath="dotnet" arguments=".\YourApp.dll"
                stdoutLogEnabled="false" stdoutLogFile=".\logs\stdout"
                hostingModel="inprocess" />

  </system.webServer>
</configuration>

Azure App Service: Azure App Service runs IIS/ARR in front of your app. The web.config rewrite rule blocks at the ARR layer. Alternatively, deploy Azure Front Door (WAF rules) or put Cloudflare in front of App Service for managed, no-code bot blocking.
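A malformed web.config takes the whole site down on IIS, so it is worth checking that an edited file is still well-formed XML before deploying. A quick offline sketch, using python3's standard library from the shell; the trimmed config written below stands in for your real file:

```shell
# Write a trimmed web.config, then parse it and count rewrite rules.
cat > /tmp/web.config <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <rule name="BlockAiBots" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <add input="{HTTP_USER_AGENT}" pattern="GPTBot|ClaudeBot|CCBot" />
          </conditions>
          <action type="CustomResponse" statusCode="403" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
EOF

# ET.parse raises on malformed XML, so this doubles as a validity check.
python3 - <<'EOF'
import xml.etree.ElementTree as ET
root = ET.parse('/tmp/web.config').getroot()
rules = root.findall('.//rewrite/rules/rule')
print('rules found:', len(rules))
EOF
```

Point the script at your real web.config path before a deploy; a parse error here is far cheaper than a 500.30 in production.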

7. nginx reverse proxy block

On Linux (self-hosted or containers), nginx typically sits in front of Kestrel as a reverse proxy. Add user-agent blocking in the nginx config — the request never reaches your .NET process.

/etc/nginx/sites-available/yourapp
# Map user-agent to a block flag
map $http_user_agent $block_ai_bot {
    default                 0;
    ~*GPTBot                1;
    ~*ChatGPT-User          1;
    ~*OAI-SearchBot         1;
    ~*ClaudeBot             1;
    ~*anthropic-ai          1;
    ~*Google-Extended       1;
    ~*Bytespider            1;
    ~*CCBot                 1;
    ~*PerplexityBot         1;
    ~*meta-externalagent    1;
    ~*Amazonbot             1;
    ~*Applebot-Extended     1;
    ~*xAI-Bot               1;
    ~*DeepSeekBot           1;
    ~*MistralBot            1;
    ~*Diffbot               1;
    ~*cohere-ai             1;
    ~*AI2Bot                1;
    ~*YouBot                1;
    ~*DuckAssistBot         1;
    ~*omgili                1;
    ~*webzio-extended       1;
    ~*gemini-deep-research  1;
}

server {
    listen 443 ssl;
    server_name yourapp.com;

    ssl_certificate     /etc/letsencrypt/live/yourapp.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yourapp.com/privkey.pem;

    # Allow robots.txt even for blocked bots
    location = /robots.txt {
        proxy_pass http://127.0.0.1:5000;
    }

    location / {
        # Block matched AI bots
        if ($block_ai_bot) {
            return 403 "Forbidden";
        }

        proxy_pass         http://127.0.0.1:5000;
        proxy_http_version 1.1;
        proxy_set_header   Upgrade $http_upgrade;
        proxy_set_header   Connection keep-alive;
        proxy_set_header   Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
    }
}

Kestrel port: ASP.NET Core apps listen on ports 5000 (HTTP) and 5001 (HTTPS) by default when run locally; official .NET 8 container images default to port 8080 instead. Set the port explicitly via the ASPNETCORE_URLS environment variable or in appsettings.json under Kestrel:Endpoints.
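Setting the binding via the environment variable looks like this in practice. A sketch: YourApp.dll stands in for your published assembly, and the dotnet line is commented out so the snippet is safe to paste:

```shell
# Point Kestrel at a custom port before starting the app.
export ASPNETCORE_URLS="http://127.0.0.1:5002"
echo "Kestrel will bind: $ASPNETCORE_URLS"

# dotnet YourApp.dll    # run your published app with the new binding
```

Make sure the port here matches the proxy_pass target in the nginx config above, or every request will 502.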

Hosting comparison

Hosting: recommended method

IIS (Windows): web.config rewrite rule (method 6)
Azure App Service: web.config rewrite rule (method 6), or Azure Front Door / Cloudflare in front
Linux + nginx: nginx reverse proxy block (method 7)
Linux / self-hosted (no nginx): IMiddleware hard block (method 3)
Docker / Kubernetes: IMiddleware hard block (method 3), or block at the ingress
Azure Container Apps: IMiddleware hard block (method 3)

Frequently asked questions

Where does robots.txt go in an ASP.NET Core project?

In wwwroot/. The UseStaticFiles() middleware (called in every default project template) automatically serves everything in wwwroot/ at the root URL — robots.txt in wwwroot/ is available at /robots.txt with zero additional config.

What is the correct middleware order for AI bot blocking?

UseStaticFiles() → UseMiddleware<AiBotBlockingMiddleware>() → UseRouting() → UseAuthentication() → UseAuthorization() → MapControllers(). Static files run first (so robots.txt is served to bots), then your blocker runs, then routing. Never put the blocker after UseRouting() — routing will have already matched the request.

How do I block AI bots on Azure App Service?

Add a web.config with a <rewrite> rule under <system.webServer> matching AI bot user agents and returning a 403. Azure App Service has the IIS URL Rewrite Module pre-installed. Alternatively, put Azure Front Door (WAF policy) or Cloudflare in front of your App Service.

Do I need both robots.txt and the middleware?

Use both for defence in depth. robots.txt is a polite signal — compliant bots respect it, but bad actors ignore it. The IMiddleware hard block returns 403 regardless of whether the bot respects robots.txt. Together they cover compliant bots and ignore-the-rules scrapers.

Does this work for Blazor Server and Blazor WebAssembly?

Yes, with caveats. Blazor Server runs on ASP.NET Core — all methods apply. For Blazor WebAssembly, the app is pure client-side; host-level methods (wwwroot/robots.txt, nginx, web.config) work as normal. IMiddleware blocks the initial HTML request but not subsequent SignalR/WebSocket frames for Blazor Server.
