How to Block AI Bots on F# Giraffe: Complete 2026 Guide
Giraffe is the functional F# web framework on ASP.NET Core. Handlers compose with >=> (the fish operator). To block an AI bot: call the 403-response pipeline with earlyReturn ctx — the next handler is never invoked. Place public routes (robots.txt, /health) before botBlocker in the choose list.
The >=> fish operator — how Giraffe composes handlers
h1 >=> h2 creates a new handler. When called, h1 receives h2 as its next function. If h1 calls next ctx, h2 runs. If h1 uses earlyReturn ctx, h2 never runs — the chain terminates.
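The mechanics can be shown with a stripped-down sketch. This is a simplification for intuition, not Giraffe's actual source; the real compose in Giraffe.Core additionally skips the second handler once the response has started:

```fsharp
// Simplified model of Giraffe's handler types and compose operator.
open System.Threading.Tasks
open Microsoft.AspNetCore.Http

type HttpFunc = HttpContext -> Task<HttpContext option>
type HttpHandler = HttpFunc -> HttpContext -> Task<HttpContext option>

// A simplified (>=>): h2 partially applied to the final continuation
// becomes an HttpFunc, and that HttpFunc is handed to h1 as its next.
// So "next ctx" inside h1 runs h2; "earlyReturn ctx" skips it.
let (>=>) (h1: HttpHandler) (h2: HttpHandler) : HttpHandler =
    fun (final: HttpFunc) (ctx: HttpContext) ->
        h1 (h2 final) ctx
```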
type HttpHandler = HttpFunc -> HttpContext -> Task<HttpContext option>
Protection layers
Step 1 — Bot detection (AiBots.fs)
String.IsNullOrEmpty guard before any comparison. StringComparison.OrdinalIgnoreCase in Contains — no manual lowercasing needed. List.exists short-circuits on first match.
// AiBots.fs — bot detection module
module AiBots
open System
// Known AI bot UA substrings — lowercase for comparison.
let private patterns = [
    // OpenAI
    "gptbot"; "chatgpt-user"; "oai-searchbot"
    // Anthropic
    "claudebot"; "claude-web"
    // Common Crawl
    "ccbot"
    // Bytedance
    "bytespider"
    // Meta
    "meta-externalagent"
    // Perplexity
    "perplexitybot"
    // Google AI
    "google-extended"; "googleother"
    // Cohere
    "cohere-ai"
    // Amazon
    "amazonbot"
    // Diffbot
    "diffbot"
    // AI2
    "ai2bot"
    // DeepSeek
    "deepseekbot"
    // Mistral
    "mistralai-user"
    // xAI
    "xai-bot"
    // You.com
    "youbot"
    // DuckDuckGo AI
    "duckassistbot"
]
/// isAiBot: returns true if ua contains any known AI bot pattern.
/// Case-insensitive — uses StringComparison.OrdinalIgnoreCase.
let isAiBot (ua: string) =
    if String.IsNullOrEmpty ua then false
    else
        patterns |> List.exists (fun p ->
            ua.Contains(p, StringComparison.OrdinalIgnoreCase))
Step 2 — Bot-blocker handler (BotBlocker.fs)
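Before the handler in this step consumes it, the Step 1 matcher can be sanity-checked from an F# script with dotnet fsi (the UA strings below are illustrative):

```fsharp
// sanity-check.fsx — run with: dotnet fsi sanity-check.fsx
#load "AiBots.fs"

printfn "%b" (AiBots.isAiBot "Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)") // true
printfn "%b" (AiBots.isAiBot "Mozilla/5.0 AppleWebKit/537.36 Chrome/131.0 Safari/537.36")        // false
printfn "%b" (AiBots.isAiBot "")                                                                  // false, IsNullOrEmpty guard
```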
ctx.Request.Headers["User-Agent"].ToString() — StringValues.ToString() returns "" (not null) when the header is absent. No null check needed. The blocked branch calls the response pipeline with earlyReturn ctx — not with next ctx.
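That StringValues guarantee is easy to verify directly; the indexer on IHeaderDictionary returns StringValues.Empty for an absent key, and Microsoft.Extensions.Primitives ships with ASP.NET Core:

```fsharp
// Demonstrates the StringValues guarantee relied on below.
open Microsoft.Extensions.Primitives

let missing = StringValues.Empty             // what Headers["User-Agent"] yields when absent
printfn "%b" (missing.ToString() = "")       // true: empty string
printfn "%b" (isNull (missing.ToString()))   // false: never null
```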
// BotBlocker.fs — Giraffe HttpHandler
module BotBlocker
open System
open Microsoft.AspNetCore.Http
open Giraffe
open AiBots
// HttpHandler type: HttpFunc -> HttpContext -> Task<HttpContext option>
// HttpFunc type: HttpContext -> Task<HttpContext option>
//
// >=> (fish operator): composes two HttpHandlers.
// earlyReturn: terminal HttpFunc that returns Some ctx immediately.
//
// To short-circuit: call handler pipeline with earlyReturn ctx
// — next (the inner app) is never called.
// To pass through: call next ctx after optionally modifying context.
let botBlocker : HttpHandler =
    fun next ctx ->
        // ctx.Request.Headers["User-Agent"] returns StringValues.
        // .ToString() returns "" if the header is absent — never null.
        let ua = ctx.Request.Headers["User-Agent"].ToString()
        if isAiBot ua then
            // Short-circuit: compose 403 response handlers and call with earlyReturn.
            // earlyReturn is the terminal HttpFunc — chain stops here.
            // next is never called — no downstream handlers run.
            ( setStatusCode 403
              >=> setHttpHeader "X-Robots-Tag" "noai, noimageai"
              >=> setHttpHeader "Content-Type" "text/plain; charset=utf-8"
              >=> text "Forbidden"
            ) earlyReturn ctx
        else
            // Pass through: add X-Robots-Tag, then hand off to next.
            // Note: next is an HttpFunc, not an HttpHandler, so it is passed
            // directly as the continuation rather than composed with >=>.
            setHttpHeader "X-Robots-Tag" "noai, noimageai" next ctx
Step 3 — Router with choose
choose tries each handler in list order. The first that doesn't return None wins. robots.txt and /health are listed before botBlocker — they're served to all crawlers without any bot check.
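Roughly, choose can be sketched like this (a simplification for intuition; the real implementation lives in Giraffe.Core):

```fsharp
// Simplified sketch of Giraffe's choose combinator.
open System.Threading.Tasks
open Microsoft.AspNetCore.Http
open Giraffe

let rec chooseSketch (handlers: HttpHandler list) : HttpHandler =
    fun (next: HttpFunc) (ctx: HttpContext) ->
        task {
            match handlers with
            | [] ->
                // Nothing matched: return None so an outer choose
                // (or a 404 fallback) can take over.
                return None
            | h :: rest ->
                match! h next ctx with
                | Some _ as result -> return result   // first non-None result wins
                | None -> return! chooseSketch rest next ctx
        }
```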
// App.fs — Giraffe router with bot-blocker middleware
module App
open Giraffe
let robotsTxt = """User-agent: *
Allow: /
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: Meta-ExternalAgent
Disallow: /
"""
// choose tries each handler in order.
// The first handler that does NOT return None wins.
// Place robots.txt and /health BEFORE botBlocker so they are
// served to all crawlers without hitting the bot check.
let webApp : HttpHandler =
    choose [
        // Public routes — bypass botBlocker
        route "/robots.txt" >=>
            setHttpHeader "Content-Type" "text/plain; charset=utf-8" >=>
            text robotsTxt
        route "/health" >=> text "ok"
        // Protected routes — botBlocker applied to all of these
        BotBlocker.botBlocker >=>
            choose [
                route "/" >=> htmlView Views.homeView
                route "/api/data" >=> json {| data = "protected" |}
                setStatusCode 404 >=> text "Not Found"
            ]
    ]
Step 4 — Host setup (Program.fs)
UseStaticFiles() before UseGiraffe — if robots.txt is in wwwroot/, ASP.NET Core serves it without entering the Giraffe pipeline at all.
// Program.fs — ASP.NET Core host with Giraffe
module Program
open Microsoft.AspNetCore.Builder
open Microsoft.AspNetCore.Hosting
open Microsoft.Extensions.Hosting
open Microsoft.Extensions.DependencyInjection
open Giraffe
open App
[<EntryPoint>]
let main argv =
    Host.CreateDefaultBuilder(argv)
        .ConfigureWebHostDefaults(fun webHost ->
            webHost
                .ConfigureServices(fun services ->
                    // Register Giraffe services (required)
                    services.AddGiraffe() |> ignore
                )
                .Configure(fun app ->
                    // UseStaticFiles serves wwwroot/ — including robots.txt if placed there.
                    // Runs BEFORE Giraffe, so /robots.txt is handled without entering the Giraffe pipeline.
                    app.UseStaticFiles() |> ignore
                    // UseGiraffe wires the Giraffe app as the terminal middleware.
                    app.UseGiraffe(webApp)
                ) |> ignore
        )
        .Build()
        .Run()
    0
Saturn and Falco variants
Saturn is built on Giraffe — the same HttpHandler type and >=> composition work unchanged. Falco uses a different handler model but the same HttpContext access pattern.
// Saturn variant — Saturn is built on Giraffe.
// Giraffe HttpHandlers (including botBlocker) work unchanged in Saturn.
open Saturn
open App
open BotBlocker
// Saturn router — uses a computation expression
let apiRouter = router {
    get "/data" (json {| data = "protected" |})
}
// Apply botBlocker before the Saturn router. Saturn routers are plain
// Giraffe HttpHandlers, so >=> composes them directly.
let appRouter = router {
    get "/" (htmlView Views.homeView)
    get "/robots.txt" (text robotsTxt)
    get "/health" (text "ok")
    // Forward /api/* through botBlocker first
    forward "/api" (botBlocker >=> apiRouter)
}
let app = application {
    use_router appRouter
    use_static "wwwroot"
}
run app
// Falco variant — different HttpHandler model
// open Falco
//
// let botBlockerFalco : HttpHandler =
//     fun ctx ->
//         let ua = ctx.Request.Headers["User-Agent"].ToString()
//         if AiBots.isAiBot ua then
//             Response.withStatusCode 403
//             >> Response.ofPlainText "Forbidden"
//             <| ctx
//         else
//             ctx.Response.Headers.Append("X-Robots-Tag", "noai, noimageai")
//             // Falco handlers are terminal; there is no next to call here.
//             // Chain this check ahead of your routes instead.
//             Task.CompletedTask
F# Giraffe vs C# ASP.NET Core vs Falco vs Saturn
| Feature | F# Giraffe | C# ASP.NET Core | F# Falco | F# Saturn |
|---|---|---|---|---|
| Handler type | HttpHandler = HttpFunc -> HttpContext -> Task<HttpContext option> | RequestDelegate = HttpContext -> Task (imperative, no return value) | HttpHandler = HttpContext -> Task<unit> — simpler, no option wrapper | Same as Giraffe — Saturn is built on top of Giraffe |
| Short-circuit | (setStatusCode 403 >=> text "Forbidden") earlyReturn ctx — next never called | ctx.Response.StatusCode = 403; await ctx.Response.WriteAsync("Forbidden"); return; — do NOT call next() | Response.withStatusCode 403 >> Response.ofPlainText "Forbidden" — returns Task<unit> | Same as Giraffe earlyReturn — Saturn uses Giraffe under the hood |
| >=> composition | h1 >=> h2 — Kleisli composition; h2 only runs if h1 calls next | app.Use(async (ctx, next) => { ... await next(ctx); ... }) — imperative | No >=> — Falco uses pipe-forward (\|>) and explicit composition | Same >=> as Giraffe — Saturn adds router { } CE on top |
| UA header access | ctx.Request.Headers["User-Agent"].ToString() — StringValues, .ToString() safe | ctx.Request.Headers["User-Agent"].ToString() — identical StringValues access | ctx.Request.Headers["User-Agent"].ToString() — same HttpContext | Same as Giraffe — same HttpContext |
| robots.txt | route "/robots.txt" before botBlocker in choose [...] list, or UseStaticFiles() | app.UseStaticFiles() in Configure before app.UseMiddleware<BotBlocker>() — wwwroot/robots.txt | get "/robots.txt" handler before botBlockerFalco in router, or UseStaticFiles() | use_static "wwwroot" in application CE — or route before botBlocker in router |
| Choose / routing | choose [r1; r2; r3] — tries handlers in order, first non-None wins | app.MapGet("/", ...) endpoint routing — pattern matching on path | Router.get "/" handler — similar route matching, different API | router { get "/" handler } computation expression — wraps Giraffe routing |
Summary
- earlyReturn ctx short-circuits — call the 403 pipeline with earlyReturn instead of next. The chain terminates; no downstream handlers run.
- >=> composes left-to-right — h1 >=> h2 runs h1 first; h2 only if h1 calls next. Build your pipeline like a data pipeline.
- StringValues.ToString() is always safe, never null. Returns "" when the header is absent.
- choose list order matters — robots.txt and public routes before botBlocker. First match wins.
- Saturn uses the same HttpHandler — Giraffe middleware works unchanged in Saturn apps. No port needed.
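To check the whole setup end to end, a quick HttpClient script works. The port (5000) and UA strings here are assumptions; adjust to your host configuration:

```fsharp
// smoke-test.fsx — run with: dotnet fsi smoke-test.fsx
// Assumes the app is listening on http://localhost:5000.
open System.Net.Http

let check (ua: string) (path: string) =
    use client = new HttpClient()
    client.DefaultRequestHeaders.UserAgent.ParseAdd(ua)
    let resp = client.GetAsync("http://localhost:5000" + path).Result
    printfn "%-12s %-12s -> %d" ua path (int resp.StatusCode)

check "GPTBot/1.1"  "/"           // expect 403: blocked by botBlocker
check "GPTBot/1.1"  "/robots.txt" // expect 200: public route listed before botBlocker
check "Mozilla/5.0" "/"           // expect 200: normal browser UA passes through
```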
Is your site protected from AI bots?
Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.