Kotlin · Ktor · 9 min read

How to Block AI Bots on Kotlin (Ktor): Complete 2026 Guide

Ktor is JetBrains' Kotlin-first async web framework — coroutine-native, lightweight, and built around a pipeline of interceptable phases. This guide covers every method from resources/static/robots.txt to a production-ready ApplicationPlugin that blocks AI crawlers in the Plugins phase before any route handler runs.

Ktor 2.x and 3.x

This guide targets Ktor 2.3+. The createApplicationPlugin API and the ApplicationCallPipeline.Plugins phase are available in both 2.x and 3.x. Ktor 2.3 introduced staticResources / staticFiles and deprecated the older static { resources(...) } routing DSL; both variants are shown where they differ.

Methods at a glance

Method	What it does	Enforcement
resources/static/robots.txt	Signals crawlers to stay out	Signal only
GET /robots.txt route	Dynamic robots.txt with env-based rules	Signal only
noai in Thymeleaf/FreeMarker	Opt out of AI training per page	Signal only (server-rendered, no JS needed)
X-Robots-Tag header	noai via HTTP header (all responses)	Signal only (sent with every response)
ApplicationPlugin intercept()	Hard 403 in Plugins phase, before routing	Hard block
nginx map block	Hard 403 at reverse proxy	Hard block

1. robots.txt — static resource

Create src/main/resources/static/robots.txt and register the directory with staticResources in your routing block. Ktor then serves every file under static/ at the root path, with no per-file routes needed.

src/main/resources/static/robots.txt

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: DuckAssistBot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: omgili
Disallow: /

User-agent: Webzio
Disallow: /

User-agent: AI2Bot
Disallow: /

User-agent: DeepSeekBot
Disallow: /

User-agent: MistralAI
Disallow: /

User-agent: xAI-Bot
Disallow: /

User-agent: gemini-deep-research
Disallow: /

User-agent: *
Allow: /

Application.kt — wire the static plugin

// Ktor 2.3+: staticResources DSL
import io.ktor.server.application.*
import io.ktor.server.http.content.*
import io.ktor.server.routing.*

fun Application.configureRouting() {
    routing {
        // Serves src/main/resources/static/ at /
        // robots.txt → accessible at /robots.txt
        staticResources("/", "static")

        // ... your other routes
    }
}

// Ktor 2.0–2.2: legacy static DSL (deprecated since 2.3).
// Static serving is part of ktor-server-core, so there is nothing to install.
// routing {
//     static("/") {
//         resources("static")
//     }
// }

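Don't serve robots.txt from two places

staticResources("/", "static") registers a catch-all route, and Ktor's routing prefers the most specific match, so an explicit get("/robots.txt") {} route is expected to win over the static file. Relying on that ordering makes it easy to serve stale rules by accident. If you later add a dynamic route, for example to override robots.txt in staging, delete the static file so that only one source remains.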

2. Dynamic robots.txt with env-based rules

If you want different rules in staging vs production — blocking all bots in staging to prevent accidental crawling — serve robots.txt from a route handler instead of a static file.

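// Application.kt
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.response.*
import io.ktor.server.routing.*

fun Application.configureRouting() {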
    val isProd = System.getenv("APP_ENV") == "production"

    routing {
        get("/robots.txt") {
            val content = if (isProd) {
                buildString {
                    appendLine("User-agent: GPTBot")
                    appendLine("Disallow: /")
                    appendLine()
                    appendLine("User-agent: ClaudeBot")
                    appendLine("Disallow: /")
                    appendLine()
                    // ... other AI bots ...
                    appendLine("User-agent: *")
                    appendLine("Allow: /")
                }
            } else {
                // Block everything in non-production
                "User-agent: *\nDisallow: /"
            }

            call.respondText(content, ContentType.Text.Plain)
        }
    }
}
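
To keep the production and staging branches from drifting apart, the robots.txt text can also be generated by a small pure function. The following is a minimal sketch and not part of Ktor: buildRobotsTxt and its parameters are hypothetical names introduced here for illustration.

```kotlin
// Hypothetical helper: builds the robots.txt body from one list of bot tokens.
fun buildRobotsTxt(blockedAgents: List<String>, isProduction: Boolean): String {
    // Outside production, keep every crawler out.
    if (!isProduction) return "User-agent: *\nDisallow: /\n"

    return buildString {
        // One "User-agent / Disallow" group per blocked token.
        for (agent in blockedAgents) {
            append("User-agent: ").append(agent).append('\n')
            append("Disallow: /\n\n")
        }
        // Everyone else may crawl.
        append("User-agent: *\nAllow: /\n")
    }
}

fun main() {
    print(buildRobotsTxt(listOf("GPTBot", "ClaudeBot"), isProduction = true))
}
```

Inside the route handler you would pass the result to call.respondText(..., ContentType.Text.Plain). Because the function has no Ktor dependency, it can be unit-tested without starting the test engine.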

3. ApplicationPlugin — hard 403 in the Plugins phase

The idiomatic Ktor way to intercept every request is createApplicationPlugin. Installing it with install(BlockAiBots) adds it to the ApplicationCallPipeline.Plugins phase — this runs before routing resolves, so no route handler executes for a blocked bot.

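Always exempt /robots.txt so blocked bots can still read your disallow rules; a crawler that honors robots.txt can only back off if it is allowed to fetch the file. Note that Google-Extended and Applebot-Extended are robots.txt control tokens rather than User-Agent strings: they never appear in request headers, so the matching patterns below are harmless but will not fire.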

// plugins/BlockAiBots.kt
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.request.*
import io.ktor.server.response.*

private val BLOCKED_UA = Regex(
    "GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai" +
    "|Google-Extended|Bytespider|CCBot|meta-externalagent|Amazonbot" +
    "|Applebot-Extended|PerplexityBot|cohere-ai|YouBot|DuckAssistBot" +
    "|Diffbot|omgilibot|omgili|Webzio|AI2Bot|DeepSeekBot|MistralAI" +
    "|xAI-Bot|gemini-deep-research",
    RegexOption.IGNORE_CASE
)

val BlockAiBots = createApplicationPlugin(name = "BlockAiBots") {
    onCall { call ->
        // Always let /robots.txt through
        if (call.request.path() == "/robots.txt") return@onCall

        val ua = call.request.userAgent() ?: return@onCall

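        if (BLOCKED_UA.containsMatchIn(ua)) {
            // Responding commits the call; the matched route handler is then skipped.
            // finish() is not exposed inside createApplicationPlugin hooks.
            call.respond(HttpStatusCode.Forbidden, "Forbidden")
        }
    }
}

// Application.kt
fun Application.module() {
    install(BlockAiBots)          // install before routing
    configureRouting()
    configureTemplating()
    // ...
}

What happens after call.respond()

Once the plugin has sent the 403, the call counts as handled (call.isHandled is true) and Ktor's route handlers skip it, so your application code never runs for the blocked request. Interceptors that other plugins registered for later phases can still execute, though. The hook contexts created by createApplicationPlugin do not expose finish(); if you need the pipeline to stop outright, use the intercept() variant in the next section, where finish() is available on the pipeline context.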

Regex: compile once, reuse forever

Declaring BLOCKED_UA at file level (outside the plugin body) means the regex is compiled exactly once when the class is loaded. If you declared it inside onCall { }, it would recompile on every request — a measurable overhead under load.
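
As an illustration of the compile-once idea, the pattern can also be built from a list of tokens at file level. This is a sketch using only the Kotlin standard library; AI_BOT_TOKENS, BLOCKED_UA_PATTERN and isBlockedUserAgent are hypothetical names, and the token list is abbreviated.

```kotlin
// Abbreviated, illustrative token list; extend it with the full set from this guide.
val AI_BOT_TOKENS: List<String> = listOf(
    "GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "CCBot", "xAI-Bot",
)

// Top-level vals are initialised once, so the regex is compiled a single time.
// Regex.escape quotes each token so characters like '-' or '.' are matched literally.
val BLOCKED_UA_PATTERN: Regex = AI_BOT_TOKENS
    .joinToString(separator = "|") { Regex.escape(it) }
    .toRegex(RegexOption.IGNORE_CASE)

// A missing User-Agent header is treated as "not blocked", matching the plugin above.
fun isBlockedUserAgent(userAgent: String?): Boolean =
    userAgent != null && BLOCKED_UA_PATTERN.containsMatchIn(userAgent)

fun main() {
    val samples = listOf(
        "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)",
        "mozilla/5.0 (compatible; claudebot/1.0)",
        "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0",
    )
    for (ua in samples) println("${isBlockedUserAgent(ua)}  $ua")
}
```

Keeping the tokens in a list means the same source can feed both the regex and a generated robots.txt, at the cost of one extra indirection compared with a hand-written pattern.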

4. Direct intercept() — inline variant

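If you prefer not to create a named plugin file, you can intercept the pipeline directly in your Application.module() function. The interceptor runs in the same phase, and because its receiver is the PipelineContext you can call finish() to stop the remaining phases once the 403 has been sent.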

// Application.kt — inline intercept variant
fun Application.module() {
    intercept(ApplicationCallPipeline.Plugins) {
        val path = call.request.path()
        if (path == "/robots.txt") return@intercept

        val ua = call.request.userAgent() ?: return@intercept
        if (BLOCKED_UA.containsMatchIn(ua)) {
            call.respond(HttpStatusCode.Forbidden, "Forbidden")
            finish()
        }
    }

    configureRouting()
    // ...
}

Named plugin vs inline intercept

Named plugins (createApplicationPlugin) are testable in isolation with Ktor's test engine and can be installed conditionally at startup. The inline intercept() approach is simpler but harder to unit-test. For production use, the named plugin is preferred.

5. X-Robots-Tag response header

Add X-Robots-Tag: noai, noimageai to every response with a small plugin that uses the onCallRespond hook, which runs in the response (send) pipeline each time a handler calls call.respond().

// plugins/XRobotsTag.kt
val XRobotsTag = createApplicationPlugin(name = "XRobotsTag") {
    onCallRespond { call, _ ->
        call.response.headers.append("X-Robots-Tag", "noai, noimageai")
    }
}

// Application.kt
fun Application.module() {
    install(XRobotsTag)
    install(BlockAiBots)
    configureRouting()
    // ...
}

Header vs meta tag

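X-Robots-Tag applies to every response type (HTML, JSON, PDF, images) without touching your templates. A <meta name="robots"> tag only signals for HTML pages and only if the bot processes HTML. For comprehensive coverage, use both. Keep in mind that noai and noimageai are not part of any formal robots standard, so treat them as a request rather than a guarantee.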

6. noai meta tag in server-rendered templates

If your Ktor app renders HTML with Thymeleaf or FreeMarker, add the noai meta tag globally in your base layout. Because Ktor renders templates server-side, the tag appears in every HTML response — bots see it without executing JavaScript.

Thymeleaf — base.html

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
  <meta charset="UTF-8"/>
  <meta name="viewport" content="width=device-width, initial-scale=1.0"/>

  <!-- AI training opt-out: server-rendered, visible to all crawlers.
       Per-page override: put robotsContent in the model. -->
  <meta name="robots" content="noai, noimageai"
        th:content="${robotsContent} ?: 'noai, noimageai'"/>

  <title th:text="${pageTitle} ?: 'My App'">My App</title>
</head>
<body>
  <!-- content -->
</body>
</html>

FreeMarker — base.ftl

<#-- Defined as a macro so that <#nested> is valid. Child templates use:
     <#import "base.ftl" as layout> and <@layout.page>content</@layout.page> -->
<#macro page>
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8"/>
  <meta name="viewport" content="width=device-width, initial-scale=1.0"/>

  <!-- Global noai: override per-page by passing robotsContent to model -->
  <meta name="robots" content="${robotsContent!"noai, noimageai"}"/>

  <title>${pageTitle!"My App"}</title>
</head>
<body>
  <#nested>
</body>
</html>
</#macro>

Per-page override

Pass robotsContent = "index, follow" in the route's model map to allow AI training on specific pages (e.g., a public blog post you want indexed). Pages that don't set the variable inherit the global noai default.

7. nginx — hard block at reverse proxy

Running nginx in front of Ktor means blocked bots never reach the JVM process. This is the most efficient option — especially useful if you want to block bots without redeploying your application.

# /etc/nginx/conf.d/myapp.conf
map $http_user_agent $blocked_bot {
    default          0;
    ~*GPTBot         1;
    ~*ChatGPT-User   1;
    ~*OAI-SearchBot  1;
    ~*ClaudeBot      1;
    ~*Claude-Web     1;
    ~*anthropic-ai   1;
    ~*Google-Extended 1;
    ~*Bytespider     1;
    ~*CCBot          1;
    ~*meta-externalagent 1;
    ~*Amazonbot      1;
    ~*Applebot-Extended  1;
    ~*PerplexityBot  1;
    ~*cohere-ai      1;
    ~*YouBot         1;
    ~*DuckAssistBot  1;
    ~*Diffbot        1;
    ~*omgilibot      1;
    ~*omgili         1;
    ~*Webzio         1;
    ~*AI2Bot         1;
    ~*DeepSeekBot    1;
    ~*MistralAI      1;
    ~*xAI-Bot        1;
    ~*gemini-deep-research 1;
}

server {
    listen 80;
    server_name example.com;

    # Always allow robots.txt through
    location = /robots.txt {
        proxy_pass http://127.0.0.1:8080;
    }

    location / {
        if ($blocked_bot) {
            return 403 "Forbidden";
        }

        proxy_pass http://127.0.0.1:8080;  # Ktor default port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

X-Forwarded-For and real IP in Ktor

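When nginx proxies to Ktor, call.request.origin.remoteHost returns the proxy's address (127.0.0.1 in the config above) unless Ktor is told to trust the forwarded headers. Because this nginx config sends X-Forwarded-For and X-Forwarded-Proto, install the XForwardedHeaders plugin (install(XForwardedHeaders)); ForwardedHeaders, from the same artifact, handles the standardized Forwarded header instead. After that, call.request.origin.remoteHost returns the real client IP.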
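
A minimal wiring sketch for the proxy setup above, assuming the ktor-server-forwarded-header artifact is on the classpath. The nginx config sets X-Forwarded-* headers, which XForwardedHeaders reads. The function name configureProxyHeaders and the /debug/client-ip route are hypothetical and only there to show what Ktor sees.

```kotlin
import io.ktor.server.application.*
import io.ktor.server.plugins.*
import io.ktor.server.plugins.forwardedheaders.*
import io.ktor.server.response.*
import io.ktor.server.routing.*

fun Application.configureProxyHeaders() {
    // Trusts X-Forwarded-For / X-Forwarded-Proto / X-Forwarded-Host from the proxy.
    // Only install this when every request really arrives through a proxy you control,
    // otherwise clients can spoof their address.
    install(XForwardedHeaders)

    routing {
        // Hypothetical debug route: echoes the client address as Ktor resolves it.
        get("/debug/client-ip") {
            call.respondText(call.request.origin.remoteHost)
        }
    }
}
```

Remove the debug route once you have confirmed that the address matches the real client rather than 127.0.0.1.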

Deployment comparison

Environment	robots.txt	Hard 403	Notes
VPS + nginx	static resource ✓	nginx map + ApplicationPlugin	Most flexible — nginx blocks before JVM
Docker (fat JAR)	static resource ✓	ApplicationPlugin ✓	No nginx by default — plugin is your only guard
Docker + nginx sidecar	static resource ✓	nginx map ✓	nginx as reverse proxy in same Compose stack
Heroku / Railway	static resource ✓	ApplicationPlugin ✓	No nginx layer — plugin is required
Fly.io	static resource ✓	ApplicationPlugin ✓	Add nginx via fly.toml [processes] for proxy layer
Kubernetes + nginx Ingress	static resource ✓	nginx Ingress snippets ✓	map block in the http-snippet ConfigMap key; if/return in a server-snippet annotation (snippets must be enabled)
AWS App Runner	static resource ✓	ApplicationPlugin ✓	Add CloudFront + WAF rule for edge blocking
Google Cloud Run	static resource ✓	ApplicationPlugin ✓	Add Cloud Armor security policy for edge blocking

build.gradle.kts — relevant dependencies

No external libraries are needed for the blocking plugin — it uses only Ktor core. The regex is Kotlin stdlib. Add templating engines only if you are rendering HTML server-side.

// build.gradle.kts
val ktor_version = "2.3.10"  // or a current 3.x release
val kotlin_version = "1.9.24"

dependencies {
    implementation("io.ktor:ktor-server-core-jvm:$ktor_version")
    implementation("io.ktor:ktor-server-netty-jvm:$ktor_version")

    // Static file serving
    // No extra dependency — included in ktor-server-core

    // Forwarded headers (real IP behind nginx proxy)
    implementation("io.ktor:ktor-server-forwarded-header-jvm:$ktor_version")

    // Templating — choose one
    implementation("io.ktor:ktor-server-thymeleaf-jvm:$ktor_version")
    // implementation("io.ktor:ktor-server-freemarker-jvm:$ktor_version")

    testImplementation("io.ktor:ktor-server-test-host-jvm:$ktor_version")
    testImplementation("org.jetbrains.kotlin:kotlin-test-junit:$kotlin_version")
}

Testing the plugin

Use Ktor's test engine to verify the plugin in unit tests without starting a real server:

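import io.ktor.client.request.*
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.response.*
import io.ktor.server.routing.*
import io.ktor.server.testing.*
import kotlin.test.*

class BlockAiBotsTest {
    @Test
    fun blocksKnownAiUserAgent() = testApplication {
        application {
            install(BlockAiBots)
            routing { get("/") { call.respondText("OK") } }
        }
        // A request carrying a blocked User-Agent must get 403, not the route's "OK"
        val response = client.get("/") { header(HttpHeaders.UserAgent, "GPTBot/1.0") }
        assertEquals(HttpStatusCode.Forbidden, response.status)
    }
}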

FAQ

How do I serve robots.txt in Ktor?

Place robots.txt in src/main/resources/static/ and call staticResources("/", "static") in your routing block. Ktor serves every file in the static/ resource directory at the root path — no additional routing configuration is required.

What is the idiomatic way to block AI bots in Ktor?

Create a named ApplicationPlugin with createApplicationPlugin("BlockAiBots") and check the User-Agent in its onCall hook. The hook runs in the Plugins phase, before routing resolves, and responding with 403 there marks the call as handled, so the route handler is skipped for matched bots.

What is the difference between ApplicationCallPipeline.Plugins and ApplicationCallPipeline.Call?

The Plugins phase runs before routing and is the right place for cross-cutting concerns like bot blocking. The Call phase is where the Routing plugin resolves the request and executes the matched handler. Intercepting in Plugins means your bot check runs for every request regardless of which route handles it.

Does robots.txt work to block AI bots on Ktor?

robots.txt tells crawlers which paths to avoid, but compliance is voluntary: the major AI crawlers document that they honor it, while others may ignore it. For guaranteed blocking you need server-level enforcement: either the Ktor ApplicationPlugin approach (returning 403) or an nginx map block in front of Ktor.

Should I use createApplicationPlugin or intercept() directly?

createApplicationPlugin is preferred in production because named plugins are independently testable with Ktor's test engine and can be conditionally installed at startup. The direct intercept() call in module() achieves the same result but is harder to test in isolation.

Does the Kotlin Regex need any flags for case-insensitive matching?

Yes — pass RegexOption.IGNORE_CASE when constructing the Regex. Without it, Kotlin regex is case-sensitive by default and a bot sending its User-Agent as 'gptbot' instead of 'GPTBot' would slip through. Most AI bot crawlers send their names in the documented casing, but defensive matching with IGNORE_CASE costs nothing.
