How to Block AI Bots on Kotlin (Ktor): Complete 2026 Guide
Ktor is JetBrains' Kotlin-first async web framework — coroutine-native, lightweight, and built around a pipeline of interceptable phases. This guide covers every method from resources/static/robots.txt to a production-ready ApplicationPlugin that blocks AI crawlers in the Plugins phase before any route handler runs.
Ktor 2.x and 3.x
This guide targets Ktor 2.3+. The createApplicationPlugin API and the ApplicationCallPipeline.Plugins phase are available in both 2.x and 3.x. Ktor 2.3 introduced the staticResources / staticFiles functions and deprecated the older static { } routing DSL; both variants are shown where they differ.
Methods at a glance
| Method | What it does | Blocks JS-less bots? |
|---|---|---|
| resources/static/robots.txt | Signals crawlers to stay out | Signal only |
| GET /robots.txt route | Dynamic robots.txt with env-based rules | Signal only |
| noai in Thymeleaf/FreeMarker | Opt out of AI training per page | ✓ (server-rendered) |
| X-Robots-Tag header | noai via HTTP header (all responses) | ✓ (header) |
| ApplicationPlugin intercept() | Hard 403 in Plugins phase — before routing | ✓ |
| nginx map block | Hard 403 at reverse proxy | ✓ |
1. robots.txt — static resource
Create src/main/resources/static/robots.txt and serve the static/ resource directory from your routing block. With staticResources("/", "static"), Ktor serves every file under static/ at the root path, so robots.txt needs no route of its own.
src/main/resources/static/robots.txt
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: meta-externalagent
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: cohere-ai
Disallow: /
User-agent: YouBot
Disallow: /
User-agent: DuckAssistBot
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: omgilibot
Disallow: /
User-agent: omgili
Disallow: /
User-agent: Webzio
Disallow: /
User-agent: AI2Bot
Disallow: /
User-agent: DeepSeekBot
Disallow: /
User-agent: MistralAI
Disallow: /
User-agent: xAI-Bot
Disallow: /
User-agent: gemini-deep-research
Disallow: /
User-agent: *
Allow: /
Application.kt — wire up static serving
// Ktor 2.3+: staticResources DSL
import io.ktor.server.application.*
import io.ktor.server.http.content.*
import io.ktor.server.routing.*

fun Application.configureRouting() {
    routing {
        // Serves src/main/resources/static/ at /
        // robots.txt → accessible at /robots.txt
        staticResources("/", "static")
        // ... your other routes
    }
}
// Ktor 2.0–2.2: older static { } DSL (deprecated since 2.3). There is no plugin to install.
// routing {
//     static("/") {
//         resources("static")
//     }
// }
Static file and dynamic route on the same path
When static/ is served at / and you also register get("/robots.txt") {}, two handlers claim the same URL. Ktor's routing picks between them by match quality, and an exact path segment normally outranks the wildcard that static serving relies on, so do not assume the static file wins. If you later add a dynamic route to override robots.txt in staging, delete the static file so that only one source remains.
2. Dynamic robots.txt with env-based rules
If you want different rules in staging vs production — blocking all bots in staging to prevent accidental crawling — serve robots.txt from a route handler instead of a static file.
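The rule text itself can come from a plain function, which keeps the bot list in one place and lets you test it without starting Ktor. The following is a minimal sketch, not part of Ktor: the names AI_AGENTS and buildRobotsTxt are hypothetical, and the agent list is shortened for the example.

```kotlin
// Hypothetical helper (not a Ktor API): builds the robots.txt body from a list of agents.
// Shortened list; extend it with the agents from section 1.
val AI_AGENTS = listOf("GPTBot", "ClaudeBot", "CCBot")

fun buildRobotsTxt(isProd: Boolean, blockedAgents: List<String> = AI_AGENTS): String {
    // Outside production, ask every crawler to stay out.
    if (!isProd) return "User-agent: *\nDisallow: /\n"
    return buildString {
        for (agent in blockedAgents) {
            append("User-agent: ").append(agent).append('\n')
            append("Disallow: /\n\n")
        }
        // All other crawlers may fetch the whole site.
        append("User-agent: *\nAllow: /\n")
    }
}
```

Inside the route handler, the result would be passed straight to call.respondText(buildRobotsTxt(isProd), ContentType.Text.Plain).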
// Application.kt
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.response.*
import io.ktor.server.routing.*

fun Application.configureRouting() {
    val isProd = System.getenv("APP_ENV") == "production"
    routing {
        get("/robots.txt") {
            val content = if (isProd) {
                buildString {
                    appendLine("User-agent: GPTBot")
                    appendLine("Disallow: /")
                    appendLine()
                    appendLine("User-agent: ClaudeBot")
                    appendLine("Disallow: /")
                    appendLine()
                    // ... other AI bots ...
                    appendLine("User-agent: *")
                    appendLine("Allow: /")
                }
            } else {
                // Block everything in non-production
                "User-agent: *\nDisallow: /"
            }
            call.respondText(content, ContentType.Text.Plain)
        }
    }
}
3. ApplicationPlugin — hard 403 in the Plugins phase
The idiomatic Ktor way to intercept every request is createApplicationPlugin. Installing it with install(BlockAiBots) adds it to the ApplicationCallPipeline.Plugins phase — this runs before routing resolves, so no route handler executes for a blocked bot.
Always exempt /robots.txt so blocked bots can still read your disallow rules — some will respect them even if they ignore HTTP 403.
// plugins/BlockAiBots.kt
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.request.*
import io.ktor.server.response.*

private val BLOCKED_UA = Regex(
    "GPTBot|ChatGPT-User|OAI-SearchBot|ClaudeBot|Claude-Web|anthropic-ai" +
        "|Google-Extended|Bytespider|CCBot|meta-externalagent|Amazonbot" +
        "|Applebot-Extended|PerplexityBot|cohere-ai|YouBot|DuckAssistBot" +
        "|Diffbot|omgilibot|omgili|Webzio|AI2Bot|DeepSeekBot|MistralAI" +
        "|xAI-Bot|gemini-deep-research",
    RegexOption.IGNORE_CASE
)

val BlockAiBots = createApplicationPlugin(name = "BlockAiBots") {
    onCall { call ->
        // Always let /robots.txt through
        if (call.request.path() == "/robots.txt") return@onCall
        val ua = call.request.userAgent() ?: return@onCall
        if (BLOCKED_UA.containsMatchIn(ua)) {
            // onCall has no finish(); sending the response marks the call as handled
            call.respond(HttpStatusCode.Forbidden, "Forbidden")
        }
    }
}

// Application.kt
fun Application.module() {
    install(BlockAiBots) // install before routing
    configureRouting()
    configureTemplating()
    // ...
}
finish() and the onCall hook
finish() is a member of PipelineContext, so it exists only inside a raw intercept() block such as the inline variant in section 4. There, Ktor's pipeline would otherwise continue to the next phase after call.respond(), and finish() terminates the remaining phases immediately. The onCall hook of createApplicationPlugin does not expose finish(). Once call.respond() has sent the 403, the call counts as handled (call.isHandled is true), and routing should skip the route handler for it. Hooks registered by other plugins can still run, so confirm the behaviour with a test on your Ktor version.
Regex: compile once, reuse forever
Declaring BLOCKED_UA at file level (outside the plugin body) means the regex is compiled exactly once when the class is loaded. If you declared it inside onCall { }, it would recompile on every request — a measurable overhead under load.
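The same idea can be sketched without Ktor, which makes the matching rule easy to unit-test on its own. The names below (BLOCKED_AGENTS, BLOCKED_UA_SKETCH, shouldBlock) are hypothetical and the agent list is shortened; the regex is built once as a top-level property, and Regex.escape keeps any special characters in an agent name from being read as regex syntax.

```kotlin
// Hypothetical names; shortened agent list for illustration.
val BLOCKED_AGENTS = listOf("GPTBot", "ClaudeBot", "CCBot", "xAI-Bot")

// Top-level property: constructed once when the file's class is initialised,
// then reused by every call to shouldBlock().
val BLOCKED_UA_SKETCH = Regex(
    BLOCKED_AGENTS.joinToString("|") { Regex.escape(it) },
    RegexOption.IGNORE_CASE
)

fun shouldBlock(path: String, userAgent: String?): Boolean {
    if (path == "/robots.txt") return false   // crawlers must always be able to read the rules
    if (userAgent == null) return false       // no User-Agent header: pass through, as the plugin does
    return BLOCKED_UA_SKETCH.containsMatchIn(userAgent)
}
```

With a helper like this, the plugin body reduces to a single check of shouldBlock(call.request.path(), call.request.userAgent()).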
4. Direct intercept() — inline variant
If you prefer not to create a named plugin file, you can intercept the pipeline directly in your Application.module() function. The interceptor runs in the same Plugins phase, and because it receives a PipelineContext it can call finish() after responding.
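// Application.kt: inline intercept variant
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.request.*
import io.ktor.server.response.*

// BLOCKED_UA is the Regex from section 3; declare it in this file if BlockAiBots.kt is not used
fun Application.module() {
    intercept(ApplicationCallPipeline.Plugins) {
        val path = call.request.path()
        if (path == "/robots.txt") return@intercept
        val ua = call.request.userAgent() ?: return@intercept
        if (BLOCKED_UA.containsMatchIn(ua)) {
            call.respond(HttpStatusCode.Forbidden, "Forbidden")
            finish() // stop the pipeline: no later phase runs for this call
        }
    }
    configureRouting()
    // ...
}
Named plugin vs inline intercept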
Named plugins (createApplicationPlugin) are testable in isolation with Ktor's test engine and can be installed conditionally at startup. The inline intercept() approach is simpler but harder to unit-test. For production use, the named plugin is preferred.
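Conditional installation can be driven by a small pure function. This is a sketch under stated assumptions: BLOCK_AI_BOTS is a made-up environment variable name, not a Ktor setting, and blockingEnabled is a hypothetical helper.

```kotlin
// Hypothetical switch: BLOCK_AI_BOTS is an assumed variable name, not something Ktor reads.
fun blockingEnabled(env: Map<String, String>): Boolean {
    // Default to blocking when the variable is absent.
    val raw = env["BLOCK_AI_BOTS"]?.trim()?.lowercase() ?: return true
    // Only an explicit "false" or "0" switches blocking off.
    return raw != "false" && raw != "0"
}
```

In Application.module() this would read if (blockingEnabled(System.getenv())) install(BlockAiBots), and the decision itself can be tested with a plain map.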
5. X-Robots-Tag response header
Add X-Robots-Tag: noai, noimageai to every response with a plugin whose onCallRespond hook runs in the response pipeline, just before each response is sent.
// plugins/XRobotsTag.kt
import io.ktor.server.application.*

val XRobotsTag = createApplicationPlugin(name = "XRobotsTag") {
    // Runs as each response is sent
    onCallRespond { call, _ ->
        call.response.headers.append("X-Robots-Tag", "noai, noimageai")
    }
}

// Application.kt
fun Application.module() {
    install(XRobotsTag)
    install(BlockAiBots)
    configureRouting()
    // ...
}
Header vs meta tag
X-Robots-Tag applies to every response type (HTML, JSON, PDF, images) without touching your templates. A <meta name="robots"> tag only signals for HTML pages and only if the bot processes HTML. For comprehensive coverage, use both.
6. noai meta tag in server-rendered templates
If your Ktor app renders HTML with Thymeleaf or FreeMarker, add the noai meta tag globally in your base layout. Because Ktor renders templates server-side, the tag appears in every HTML response — bots see it without executing JavaScript.
Thymeleaf — base.html
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
    <meta charset="UTF-8"/>
    <meta name="viewport" content="width=device-width, initial-scale=1.0"/>
    <!-- AI training opt-out: server-rendered, visible to all crawlers -->
    <!-- Per-page override: put robotsContent in the model; otherwise the default applies -->
    <meta name="robots" content="noai, noimageai" th:content="${robotsContent} ?: 'noai, noimageai'"/>
    <title th:text="${pageTitle} ?: 'My App'">My App</title>
</head>
<body>
    <!-- content -->
</body>
</html>
FreeMarker — base.ftl
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8"/>
    <meta name="viewport" content="width=device-width, initial-scale=1.0"/>
    <!-- Global noai: override per page by passing robotsContent in the model -->
    <meta name="robots" content="${robotsContent!"noai, noimageai"}"/>
    <title>${pageTitle!"My App"}</title>
</head>
<body>
    <#nested>
</body>
</html>
Per-page override
Pass robotsContent = "index, follow" in the route's model map to allow AI training on specific pages (e.g., a public blog post you want indexed). Pages that don't set the variable inherit the global noai default.
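One way to keep that default behaviour explicit is to build the model map in a helper that leaves the key out unless an override is given. This is a minimal sketch; pageModel is a hypothetical name, and the keys match the variables used in the templates above.

```kotlin
// Hypothetical helper: builds the model map handed to the template engine.
fun pageModel(pageTitle: String, robotsContent: String? = null): Map<String, Any> =
    buildMap<String, Any> {
        put("pageTitle", pageTitle)
        // Omit the key when there is no override so the template's own default applies.
        if (robotsContent != null) put("robotsContent", robotsContent)
    }
```

A route that wants a page indexed would pass pageModel("Blog", "index, follow") as the model of its FreeMarkerContent or ThymeleafContent response; every other route calls pageModel(title) and inherits noai.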
7. nginx — hard block at reverse proxy
Running nginx in front of Ktor means blocked bots never reach the JVM process. This is the most efficient option — especially useful if you want to block bots without redeploying your application.
# /etc/nginx/conf.d/myapp.conf
map $http_user_agent $blocked_bot {
    default                  0;
    ~*GPTBot                 1;
    ~*ChatGPT-User           1;
    ~*OAI-SearchBot          1;
    ~*ClaudeBot              1;
    ~*Claude-Web             1;
    ~*anthropic-ai           1;
    ~*Google-Extended        1;
    ~*Bytespider             1;
    ~*CCBot                  1;
    ~*meta-externalagent     1;
    ~*Amazonbot              1;
    ~*Applebot-Extended      1;
    ~*PerplexityBot          1;
    ~*cohere-ai              1;
    ~*YouBot                 1;
    ~*DuckAssistBot          1;
    ~*Diffbot                1;
    ~*omgilibot              1;
    ~*omgili                 1;
    ~*Webzio                 1;
    ~*AI2Bot                 1;
    ~*DeepSeekBot            1;
    ~*MistralAI              1;
    ~*xAI-Bot                1;
    ~*gemini-deep-research   1;
}

server {
    listen 80;
    server_name example.com;

    # Always allow robots.txt through
    location = /robots.txt {
        proxy_pass http://127.0.0.1:8080;
    }

    location / {
        if ($blocked_bot) {
            return 403 "Forbidden";
        }
        proxy_pass http://127.0.0.1:8080; # Ktor default port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
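}
X-Forwarded-For and real IP in Ktor
When nginx proxies to Ktor, call.request.origin.remoteHost returns the proxy's address rather than the client's. The nginx config above sends X-Forwarded-* headers, so install Ktor's XForwardedHeaders plugin (install(XForwardedHeaders)) to have call.request.origin.remoteHost report the client address from X-Forwarded-For. The ForwardedHeaders plugin from the same artifact handles the standard Forwarded header instead. Enable either plugin only when Ktor is reachable solely through the proxy, because clients can forge these headers.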
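To see why the header has to be read with care, it helps to look at its format. nginx's $proxy_add_x_forwarded_for appends the connecting address to whatever X-Forwarded-For value the request already carried, so only the rightmost entries come from proxies you control. The sketch below illustrates that rule; it is not Ktor's implementation, and clientIpFrom and trustedProxies are hypothetical names.

```kotlin
// Illustration of the header format only; this is not how Ktor's plugin is implemented.
// trustedProxies = number of proxies you operate in front of the app (1 for a single nginx).
fun clientIpFrom(xForwardedFor: String?, trustedProxies: Int = 1): String? {
    if (xForwardedFor == null || trustedProxies < 1) return null
    val hops = xForwardedFor.split(',').map { it.trim() }.filter { it.isNotEmpty() }
    // Each trusted proxy appends the address it saw, so count back from the right.
    // Entries further left were supplied by the client and can be forged.
    return hops.getOrNull(hops.size - trustedProxies)
}
```

With a single nginx in front, a forged value such as "10.9.9.9" sent by the client arrives as "10.9.9.9, 203.0.113.7", and the function returns the address nginx actually saw.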
Deployment comparison
| Environment | robots.txt | Hard 403 | Notes |
|---|---|---|---|
| VPS + nginx | static resource ✓ | nginx map + ApplicationPlugin | Most flexible — nginx blocks before JVM |
| Docker (fat JAR) | static resource ✓ | ApplicationPlugin ✓ | No nginx by default — plugin is your only guard |
| Docker + nginx sidecar | static resource ✓ | nginx map ✓ | nginx as reverse proxy in same Compose stack |
| Heroku / Railway | static resource ✓ | ApplicationPlugin ✓ | No nginx layer — plugin is required |
| Fly.io | static resource ✓ | ApplicationPlugin ✓ | Add nginx via fly.toml [processes] for proxy layer |
| Kubernetes + nginx Ingress | static resource ✓ | nginx Ingress snippets ✓ | map block in the controller's http-snippet (map is only valid at http level); if/return in a server-snippet annotation |
| AWS App Runner | static resource ✓ | ApplicationPlugin ✓ | Add CloudFront + WAF rule for edge blocking |
| Google Cloud Run | static resource ✓ | ApplicationPlugin ✓ | Add Cloud Armor security policy for edge blocking |
build.gradle.kts — relevant dependencies
No external libraries are needed for the blocking plugin — it uses only Ktor core. The regex is Kotlin stdlib. Add templating engines only if you are rendering HTML server-side.
// build.gradle.kts
val ktor_version = "2.3.10" // or a 3.x release
val kotlin_version = "1.9.24"

dependencies {
    implementation("io.ktor:ktor-server-core-jvm:$ktor_version")
    implementation("io.ktor:ktor-server-netty-jvm:$ktor_version")
    // Static file serving
    // No extra dependency: included in ktor-server-core
    // Forwarded headers (real IP behind nginx proxy)
    implementation("io.ktor:ktor-server-forwarded-header-jvm:$ktor_version")
    // Templating: choose one
    implementation("io.ktor:ktor-server-thymeleaf-jvm:$ktor_version")
    // implementation("io.ktor:ktor-server-freemarker-jvm:$ktor_version")
    testImplementation("io.ktor:ktor-server-test-host-jvm:$ktor_version")
    testImplementation("org.jetbrains.kotlin:kotlin-test-junit:$kotlin_version")
}
Testing the plugin
Use Ktor's test engine to verify the plugin in unit tests without starting a real server:
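import io.ktor.client.request.*
import io.ktor.http.*
import io.ktor.server.application.*
import io.ktor.server.response.*
import io.ktor.server.routing.*
import io.ktor.server.testing.*
import kotlin.test.*

class BlockAiBotsTest {
    @Test
    fun blocksKnownAiCrawler() = testApplication {
        application {
            install(BlockAiBots)
            routing { get("/") { call.respondText("OK") } }
        }
        val response = client.get("/") {
            header(HttpHeaders.UserAgent, "GPTBot/1.0")
        }
        assertEquals(HttpStatusCode.Forbidden, response.status)
    }
}
FAQ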
How do I serve robots.txt in Ktor?
Place robots.txt in src/main/resources/static/ and call staticResources("/", "static") in your routing block. Ktor serves every file in the static/ resource directory at the root path — no additional routing configuration is required.
What is the idiomatic way to block AI bots in Ktor?
Create a named ApplicationPlugin with createApplicationPlugin("BlockAiBots") and use the onCall hook to respond with 403 for matched bots. The hook runs in the Plugins phase, before routing resolves, so the check applies to every request.
What is the difference between ApplicationCallPipeline.Plugins and ApplicationCallPipeline.Call?
The Plugins phase runs before routing and is the right place for cross-cutting concerns such as bot blocking. The Call phase is where routing itself executes: it resolves the route and invokes the matched handler. Intercepting in Plugins means your bot check runs for every request regardless of which route handles it.
Does robots.txt work to block AI bots on Ktor?
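robots.txt tells crawlers which paths to avoid, but compliance is voluntary. Several major AI crawlers state that they honor it, while others ignore it or disguise their User-Agent. For enforcement you need a server-level block: either the Ktor ApplicationPlugin approach (returning 403) or an nginx map block in front of Ktor. Both match on the User-Agent header, so a crawler that spoofs a browser User-Agent gets past them as well.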
Should I use createApplicationPlugin or intercept() directly?
createApplicationPlugin is preferred in production because named plugins are independently testable with Ktor's test engine and can be conditionally installed at startup. The direct intercept() call in module() achieves the same result but is harder to test in isolation.
Does the Kotlin Regex need any flags for case-insensitive matching?
Yes — pass RegexOption.IGNORE_CASE when constructing the Regex. Without it, Kotlin regex is case-sensitive by default and a bot sending its User-Agent as 'gptbot' instead of 'GPTBot' would slip through. Most AI bot crawlers send their names in the documented casing, but defensive matching with IGNORE_CASE costs nothing.