How to Block AI Bots in Ruby Roda
Roda is a routing-tree web framework for Ruby built on Rack. Unlike Rails or Sinatra, Roda evaluates routes as a tree: each r.on / r.get call consumes path segments progressively. To block bots, use the before block provided by plugin :hooks, which runs on every request before routing begins. request.halt([status, headers, body]) throws :halt; Roda catches it and returns the given Rack response immediately, bypassing the entire routing tree. The User-Agent header is read from request.env['HTTP_USER_AGENT'], because Rack stores HTTP request headers uppercased, with an HTTP_ prefix and underscores in place of hyphens.
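The header naming rule is easy to get wrong, so here is a minimal sketch of it in plain Ruby. The helper name rack_env_key is hypothetical (it is not part of Rack or Roda), and the env hash is built by hand to stand in for what a Rack server would pass to the app. One caveat: Content-Type and Content-Length are stored as CONTENT_TYPE and CONTENT_LENGTH without the HTTP_ prefix, which this sketch does not model.

```ruby
# Hypothetical helper (not part of Rack or Roda): converts an HTTP request
# header name to the key under which Rack stores it in the env hash.
def rack_env_key(header_name)
  "HTTP_#{header_name.upcase.tr('-', '_')}"
end

# A hand-built env hash standing in for the one a Rack server would supply.
env = { rack_env_key('User-Agent') => 'Mozilla/5.0 (compatible; GPTBot/1.0)' }

puts rack_env_key('User-Agent')   # prints HTTP_USER_AGENT
puts env['HTTP_USER_AGENT']       # prints the User-Agent value set above
puts env['HTTP_ACCEPT'].inspect   # prints nil, because the header is absent
```

The nil on the last line is why the examples in this guide default the lookup to an empty string before matching.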
1. Bot detection
Pure Ruby, no gems. String#include? for literal substring matching — no regex, no external dependencies. Enumerable#any? short-circuits on first match.
# bot_utils.rb: AI bot detection, no gems required
AI_BOT_PATTERNS = %w[
  gptbot
  chatgpt-user
  claudebot
  anthropic-ai
  ccbot
  google-extended
  cohere-ai
  meta-externalagent
  bytespider
  omgili
  diffbot
  imagesiftbot
  magpie-crawler
  amazonbot
  dataprovider
  netcraft
].freeze

# Returns true if the User-Agent string matches a known AI crawler.
# String#include? does a literal substring match, no regex.
def ai_bot?(ua)
  return false if ua.nil? || ua.empty?

  lower = ua.downcase
  AI_BOT_PATTERNS.any? { |p| lower.include?(p) }
end
2. Before hook: plugin :hooks
plugin :hooks must be declared before the before block, because hooks are not part of Roda's minimal core. Inside the hook, use request (not r, which exists only as the route block parameter). Use next to exit the hook early and continue to routing; use request.halt to terminate immediately with a Rack response array.
# app.rb: Roda application with before hook bot blocking
require 'roda'
require_relative 'bot_utils'

class App < Roda
  # plugin :hooks must be declared before using before/after blocks.
  # It is not part of Roda's minimal core; omitting it raises NoMethodError.
  plugin :hooks

  before do
    # Pass robots.txt through: crawlers read it to discover Disallow rules.
    # Ruby next exits the before block; routing continues normally.
    next if request.path == '/robots.txt'

    # Rack env: headers are stored with an HTTP_ prefix, uppercased, with
    # hyphens replaced by underscores. request.env is the raw Rack
    # environment hash; the lookup returns nil when the header is absent.
    ua = request.env['HTTP_USER_AGENT'] || ''

    if ai_bot?(ua)
      # request.halt takes a Rack-compatible response array: [status, headers, body].
      # The body must respond to each and yield strings (Rack spec); an array
      # of strings is the simplest form.
      # halt throws :halt, caught by Roda, which stops all routing immediately.
      # Inside before hooks use request, not r (r is the route block parameter).
      # Header names are lowercase because the Rack 3 spec requires it;
      # lowercase names are also valid on Rack 2.
      request.halt [
        403,
        {
          'content-type' => 'text/plain',
          'x-robots-tag' => 'noai, noimageai',
        },
        ['Forbidden'],
      ]
    else
      # Non-bot: set X-Robots-Tag, then fall through to routing.
      # response[]= sets a header on the outgoing response.
      response['X-Robots-Tag'] = 'noai, noimageai'
    end
  end
  route do |r|
    r.get 'robots.txt' do
      response['Content-Type'] = 'text/plain'
      <<~TXT
        User-agent: *
        Allow: /
        User-agent: GPTBot
        Disallow: /
        User-agent: ClaudeBot
        Disallow: /
        User-agent: CCBot
        Disallow: /
        User-agent: Google-Extended
        Disallow: /
      TXT
    end

    r.root do
      response['Content-Type'] = 'application/json'
      '{"message":"Hello"}'
    end

    r.on 'api' do
      r.get 'data' do
        response['Content-Type'] = 'application/json'
        '{"data":"value"}'
      end
    end
  end
end
3. config.ru
Roda is a Rack application. Run it with bundle exec rackup, which uses whichever supported server is installed (for example Puma, Falcon, or WEBrick), or pass it to any Rack-compatible server.
# config.ru — Rack entry point
require_relative 'app'
run App
4. Routing tree guard: r.halt inside route
If you prefer not to use plugin :hooks, put the check at the top of the route block. Inside the route block, r is the request object — r.halt and request.halt are the same method. This approach is simpler when you only need one check point.
# Alternative: guard inside the routing tree instead of a before hook.
# No plugin :hooks needed; r.halt short-circuits the routing tree directly.
# Use this when you want path-based scoping (e.g., only block under /api).
require 'roda'
require_relative 'bot_utils'

class App < Roda
  route do |r|
    # Serve robots.txt unconditionally; it comes before the bot check.
    r.get 'robots.txt' do
      response['Content-Type'] = 'text/plain'
      "User-agent: GPTBot\nDisallow: /\n"
    end

    # Bot check near the top of the routing tree: fires for all remaining paths.
    ua = request.env['HTTP_USER_AGENT'] || ''
    if ai_bot?(ua)
      # Lowercase header names satisfy the Rack 3 spec and also work on Rack 2.
      r.halt [
        403,
        { 'content-type' => 'text/plain', 'x-robots-tag' => 'noai, noimageai' },
        ['Forbidden'],
      ]
    end

    response['X-Robots-Tag'] = 'noai, noimageai'

    r.root { '{"message":"Hello"}' }

    r.on 'api' do
      r.get('data') { '{"data":"value"}' }
    end
  end
end
5. Rack middleware via plugin :middleware
plugin :middleware lets you use a Roda app as Rack middleware with use BotBlockerMiddleware in config.ru. When the route block returns without halting, Roda calls the downstream app. This is useful for inserting bot blocking in front of a non-Roda Rack application (Rails, Sinatra, Hanami, etc.).
# Roda as Rack middleware via plugin :middleware.
# Useful when embedding bot blocking in front of another Rack application.
# The middleware passes through to the downstream app when no halt is thrown.
require 'roda'
require_relative 'bot_utils'

class BotBlockerMiddleware < Roda
  # plugin :middleware enables use as Rack middleware: use BotBlockerMiddleware
  plugin :middleware

  route do |r|
    ua = request.env['HTTP_USER_AGENT'] || ''
    if ai_bot?(ua) && request.path != '/robots.txt'
      # Lowercase header names satisfy the Rack 3 spec and also work on Rack 2.
      r.halt [
        403,
        { 'content-type' => 'text/plain', 'x-robots-tag' => 'noai, noimageai' },
        ['Forbidden'],
      ]
    end
    # No explicit match: Roda middleware passes the request to the next app.
  end
end

# config.ru with a downstream Rack app:
#
# require_relative 'bot_blocker_middleware'
# require_relative 'main_app'
#
# use BotBlockerMiddleware
# run MainApp
6. robots.txt
# public/robots.txt
User-agent: *
Allow: /
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Google-Extended
Disallow: /
Key points
- plugin :hooks is required: Roda's core is minimal, so before and after hooks are not built in. Declare plugin :hooks at the top of your class body. Without it, before raises NoMethodError.
- Use request, not r, in before hooks: r is the block parameter of the route do |r| block and is not in scope inside before. Both refer to the same Roda request object; use request.halt in hooks and either one in routes.
- request.halt body must be an Array: Rack requires the response body to respond to each and yield strings. Pass ['Forbidden'] (an array of strings), not the bare string 'Forbidden'. String does not respond to each, so a bare string fails at runtime when the server tries to iterate the body.
- request.env['HTTP_USER_AGENT'] follows Rack's CGI naming: Rack stores HTTP request headers in the env hash with an HTTP_ prefix, uppercased, with hyphens replaced by underscores. User-Agent becomes HTTP_USER_AGENT. The lookup returns nil when the header is absent, so always provide a default: || ''.
- next exits the before block: Standard Ruby next leaves the before block and continues to the routing tree. Use it to pass specific paths (robots.txt) through without halting.
- Before hooks fire for all paths including 404s: The routing tree only matches defined routes, and unmatched paths get a 404. Before hooks still run for every request, including paths that will ultimately 404, so the bot check fires before routing determines whether the path exists.
- response['X-Robots-Tag'] = '...' sets response headers: response[]= writes to the outgoing response headers hash. For pass-through requests, set it before the route handler writes the body. For blocked requests, put the header in the array passed to request.halt: an explicit Rack array is returned as given and does not pick up headers set on response.
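The halt and body points above can be illustrated without Roda. The following is a simplified stand-in written in plain Ruby, not Roda's actual implementation: handle_request is a hypothetical name, and the "hook" and "routing" steps are reduced to a few lines each. It shows the catch/throw pattern that lets a hook return a complete Rack response early, and why the body is wrapped in an array.

```ruby
# Simplified stand-in for request handling, NOT Roda's actual code.
# handle_request is a hypothetical name used only in this sketch.
def handle_request(user_agent)
  catch(:halt) do
    # "Before hook": throwing :halt with a Rack response array stops
    # everything below and makes catch return that array.
    if user_agent.to_s.downcase.include?('gptbot')
      throw :halt, [403, { 'content-type' => 'text/plain' }, ['Forbidden']]
    end

    # "Routing tree": only reached when nothing was thrown.
    [200, { 'content-type' => 'text/plain' }, ['Hello']]
  end
end

status, _headers, body = handle_request('GPTBot/1.0')
puts status                         # prints 403
body.each { |chunk| puts chunk }    # prints Forbidden

puts handle_request(nil).first      # prints 200

# Why the body is an array: Array responds to each, String does not.
puts ['Forbidden'].respond_to?(:each)   # prints true
puts 'Forbidden'.respond_to?(:each)     # prints false
```

Because the thrown array becomes the whole response, any header that should appear on a blocked response has to be inside that array.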
Framework comparison — Ruby web frameworks
| Framework | Hook / Filter | Block | UA header |
|---|---|---|---|
| Roda | plugin :hooks; before { } | request.halt([403, h, ['Forbidden']]) | request.env['HTTP_USER_AGENT'] |
| Sinatra | before do ... end (built-in) | halt 403, 'Forbidden' | request.user_agent |
| Grape | before do ... end (built-in) | error!('Forbidden', 403) | headers['User-Agent'] |
| Rails | before_action :check_bot | head :forbidden | request.user_agent |
Roda's request.halt takes a full Rack response array, giving precise control over status, headers, and body. Sinatra's halt accepts a bare status and string body — more convenient but less explicit. The key Roda-specific requirement is plugin :hooks; Sinatra and Grape include before-hook support in their cores.
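All four frameworks in the table run on Rack, so the same check can also be sketched as framework-agnostic Rack middleware that needs no gems. This is an illustration under stated assumptions rather than a replacement for the Roda examples: PlainBotBlocker is a hypothetical class name, the pattern list is shortened, and the downstream app is a stand-in lambda. The env hashes contain only the two keys the sketch reads; a real Rack server supplies many more.

```ruby
# Minimal framework-agnostic Rack middleware (hypothetical class name).
# A Rack app is any object that responds to call(env) and returns
# [status, headers, body].
class PlainBotBlocker
  # Shortened pattern list for the sketch; reuse the full list in practice.
  PATTERNS = %w[gptbot claudebot ccbot].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    # to_s turns a missing header (nil) into an empty string.
    ua = env['HTTP_USER_AGENT'].to_s.downcase
    blocked = PATTERNS.any? { |pattern| ua.include?(pattern) }

    if blocked && env['PATH_INFO'] != '/robots.txt'
      return [403, { 'content-type' => 'text/plain' }, ['Forbidden']]
    end

    # Not a bot, or a bot fetching robots.txt: hand off to the wrapped app.
    @app.call(env)
  end
end

# Stand-in downstream app; a real one would be a Rails, Sinatra, Grape, or Roda app.
downstream = ->(_env) { [200, { 'content-type' => 'text/plain' }, ['OK']] }
stack = PlainBotBlocker.new(downstream)

bot_env     = { 'HTTP_USER_AGENT' => 'GPTBot/1.0', 'PATH_INFO' => '/' }
robots_env  = { 'HTTP_USER_AGENT' => 'GPTBot/1.0', 'PATH_INFO' => '/robots.txt' }
browser_env = { 'HTTP_USER_AGENT' => 'Mozilla/5.0', 'PATH_INFO' => '/' }

puts stack.call(bot_env).first       # prints 403
puts stack.call(robots_env).first    # prints 200
puts stack.call(browser_env).first   # prints 200
```

In a config.ru this would be wired in with use PlainBotBlocker ahead of run, the same way BotBlockerMiddleware is used earlier.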
Dependencies & running
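# Gemfile
source 'https://rubygems.org'
gem 'roda'
gem 'puma'    # recommended production server
gem 'rackup'  # provides the rackup command on Rack 3+, where it was split out of the rack gem
# Install
bundle install
# Run
bundle exec rackup          # picks an installed server (Puma with this Gemfile)
bundle exec rackup -s puma  # select Puma explicitly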
# Roda version: 3.x (hooks plugin stable since 3.0)
# Ruby: 2.5+ supported; 3.1+ recommended