How to Block AI Bots in PHP Laminas (MVC + Mezzio)
Laminas is the open-source continuation of Zend Framework, a long-established PHP framework that remains common in enterprise applications. The Laminas project includes two distinct frameworks: Laminas MVC, the classic event-driven MVC stack with modules, a shared event manager, and MvcEvent lifecycle hooks, and Mezzio (formerly Zend Expressive), a lightweight PSR-15 middleware pipeline. The two need different approaches to bot blocking. Laminas MVC uses an event listener on MvcEvent::EVENT_ROUTE that short-circuits the request with a 403 response; Mezzio uses a standard MiddlewareInterface implementation with PSR-7 immutable responses.
1. Bot detection helper
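A static helper class is shared by the MVC and Mezzio implementations. strpos() performs literal substring matching, so there is no regex overhead, and strtolower() is applied once before iterating. The pattern list covers widely seen AI training crawlers along with a few general-purpose scrapers; treat it as a starting point and adjust it to your own traffic. Mezzio projects that use the App namespace would place the same class under App\Service.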
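To make the matching rule concrete, here is a standalone sketch of the same lowercase-then-strpos() check outside any framework. The function name uaMatchesAny() and the sample user-agent strings are assumptions made up for this illustration; they are not part of Laminas or Mezzio.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration; not part of Laminas or Mezzio.
function uaMatchesAny(string $ua, array $patterns): bool
{
    if ($ua === '') {
        return false;
    }
    $lower = strtolower($ua); // lowercase once, then compare against lowercase patterns
    foreach ($patterns as $pattern) {
        if (strpos($lower, $pattern) !== false) {
            return true;
        }
    }
    return false;
}

$patterns = ['gptbot', 'claudebot', 'ccbot'];

// Sample user-agent strings are illustrative, not copied from vendor documentation.
var_dump(uaMatchesAny('Mozilla/5.0 (compatible; GPTBot/1.0)', $patterns));          // bool(true)
var_dump(uaMatchesAny('Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0', $patterns)); // bool(false)
var_dump(uaMatchesAny('', $patterns));                                              // bool(false)
```

Because this is plain substring matching, any user agent that happens to contain one of the patterns is treated as a bot, so short or generic patterns deserve a second look before they go into the list.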
<?php
// module/Application/src/Service/AiBotDetector.php
declare(strict_types=1);

namespace Application\Service;

class AiBotDetector
{
    // All lowercase; matched against strtolower($ua)
    private const PATTERNS = [
        'gptbot',
        'chatgpt-user',
        'claudebot',
        'anthropic-ai',
        'ccbot',
        'google-extended',
        'cohere-ai',
        'meta-externalagent',
        'bytespider',
        'omgili',
        'diffbot',
        'imagesiftbot',
        'magpie-crawler',
        'amazonbot',
        'dataprovider',
        'netcraft',
    ];

    public static function isAiBot(string $ua): bool
    {
        if ($ua === '') {
            return false;
        }
        $lower = strtolower($ua);
        // strpos(): literal substring match, no regex engine
        foreach (self::PATTERNS as $pattern) {
            if (strpos($lower, $pattern) !== false) {
                return true;
            }
        }
        return false;
    }
}
2. Laminas MVC: ListenerAggregate
Implement ListenerAggregateInterface and attach to MvcEvent::EVENT_ROUTE. The Laminas-specific detail: getHeader() returns a header object (not a string) when the header is present, or false when absent, so you must call ->getFieldValue() to extract the string value. To short-circuit, call $event->setResponse($response), then $event->stopPropagation(true), and return the response from the listener: Application::run() treats a response returned by a route listener as the signal to skip dispatch.
<?php
// module/Application/src/Listener/AiBotListener.php
declare(strict_types=1);

namespace Application\Listener;

use Application\Service\AiBotDetector;
use Laminas\EventManager\EventManagerInterface;
use Laminas\EventManager\ListenerAggregateInterface;
use Laminas\EventManager\ListenerAggregateTrait;
use Laminas\Http\Response;
use Laminas\Mvc\MvcEvent;

class AiBotListener implements ListenerAggregateInterface
{
    use ListenerAggregateTrait;

    public function attach(EventManagerInterface $events, $priority = 1): void
    {
        // EVENT_ROUTE fires before dispatch. Higher priorities run first, so
        // -1 runs after the built-in route listener (priority 1) has matched
        // the route. The $priority argument is deliberately ignored here.
        $this->listeners[] = $events->attach(
            MvcEvent::EVENT_ROUTE,
            [$this, 'onRoute'],
            -1
        );
    }

    public function onRoute(MvcEvent $event): ?Response
    {
        $request = $event->getRequest();
        // Path guard: in standard setups, robots.txt is served by
        // Apache/Nginx from public/ before PHP runs. This guard handles
        // edge cases where robots.txt is served via a PHP route.
        if ($request->getUri()->getPath() === '/robots.txt') {
            return null;
        }
        // getHeader() returns the header object or false when absent.
        // Never call getFieldValue() without checking first.
        $headerObj = $request->getHeader('User-Agent');
        $ua = is_object($headerObj) ? $headerObj->getFieldValue() : '';
        if (!AiBotDetector::isAiBot($ua)) {
            return null;
        }
        // Build a 403 response
        /** @var Response $response */
        $response = $event->getResponse();
        $response->setStatusCode(Response::STATUS_CODE_403);
        $response->setContent('Forbidden');
        $response->getHeaders()->addHeaderLine('X-Robots-Tag', 'noai, noimageai');
        // Attach the response to the event, then stop propagation.
        $event->setResponse($response);
        $event->stopPropagation(true);
        // Returning the response is what makes Application::run() skip dispatch.
        return $response;
    }
}
3. Module.php: register the listener
Register the listener aggregate in the module's onBootstrap() method. The aggregate pattern is preferred over anonymous closures because it groups all event registrations in one class, supports unit testing, and can be loaded via the service container for dependency injection.
<?php
// module/Application/src/Module.php
declare(strict_types=1);

namespace Application;

use Application\Listener\AiBotListener;
use Laminas\Mvc\MvcEvent;

class Module
{
    public function onBootstrap(MvcEvent $event): void
    {
        $application    = $event->getApplication();
        $eventManager   = $application->getEventManager();
        // Not needed by AiBotListener; available for listeners with dependencies.
        $serviceManager = $application->getServiceManager();

        // Attach the listener aggregate to the application's event manager.
        // attach() registers onRoute() for MvcEvent::EVENT_ROUTE on that manager.
        $botListener = new AiBotListener();
        $botListener->attach($eventManager);
    }

    public function getConfig(): array
    {
        return include __DIR__ . '/../config/module.config.php';
    }
}
4. SharedEventManager variant (inline closure)
For quick prototypes or small applications, attach directly via SharedEventManager. The target identifier 'Laminas\Mvc\Application' scopes the listener to application-level events. The priority argument (-1) controls ordering — lower values run later in the stack.
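The ordering rule can be seen without Laminas at all. The sketch below uses PHP's built-in SplPriorityQueue as a stand-in for a listener stack; that substitution is an assumption for illustration (laminas-eventmanager has its own implementation), and the listener names are hypothetical. Higher priorities come out first, so a listener at -1 runs after one at the default priority of 1.

```php
<?php
declare(strict_types=1);

// Stand-in for a listener stack; NOT the Laminas implementation.
$stack = new SplPriorityQueue();

// Hypothetical listener names paired with the priority they were attached at.
$stack->insert('botListener', -1);
$stack->insert('routeListener', 1);
$stack->insert('earlyListener', 100);

$order = [];
while (!$stack->isEmpty()) {
    $order[] = $stack->extract(); // highest priority first
}

echo implode(' -> ', $order), PHP_EOL;
// earlyListener -> routeListener -> botListener
```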
<?php
// Alternative: attach via SharedEventManager directly (no aggregate)
// Useful for quick prototypes; prefer the aggregate pattern in production.
// In Module::onBootstrap():
$sharedEvents = $application->getEventManager()->getSharedManager();
$sharedEvents->attach(
    'Laminas\Mvc\Application', // target identifier
    MvcEvent::EVENT_ROUTE,
    function (MvcEvent $event) {
        $request = $event->getRequest();
        if ($request->getUri()->getPath() === '/robots.txt') {
            return null;
        }
        $headerObj = $request->getHeader('User-Agent');
        $ua = is_object($headerObj) ? $headerObj->getFieldValue() : '';
        if (!\Application\Service\AiBotDetector::isAiBot($ua)) {
            return null;
        }
        $response = $event->getResponse();
        $response->setStatusCode(403);
        $response->setContent('Forbidden');
        $response->getHeaders()->addHeaderLine('X-Robots-Tag', 'noai, noimageai');
        $event->setResponse($response);
        $event->stopPropagation(true);
        // Returning the response is what makes Application::run() skip dispatch.
        return $response;
    },
    -1 // priority: negative = runs late in the listener stack
);
5. Mezzio: PSR-15 middleware
Mezzio uses the PSR-15 MiddlewareInterface. PSR-7 header access is simpler than Laminas MVC: getHeaderLine() returns an empty string when the header is absent — no object check needed. PSR-7 responses are immutable — withHeader() returns a new instance; always capture the return value. Blocking means returning a response without calling $handler->handle().
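The capture-the-return-value rule is easy to get wrong, so here is a minimal sketch of it using a toy class. ToyResponse is a hypothetical stand-in written for this example, not a PSR-7 implementation; it only mimics the one behaviour that matters here, namely that withHeader() leaves the original object untouched and returns a modified copy.

```php
<?php
declare(strict_types=1);

// Toy stand-in for an immutable response; NOT a PSR-7 implementation.
final class ToyResponse
{
    /** @var array<string, string> */
    private array $headers = [];

    public function withHeader(string $name, string $value): self
    {
        $copy = clone $this;            // never mutate the original
        $copy->headers[$name] = $value;
        return $copy;
    }

    public function getHeaderLine(string $name): string
    {
        return $this->headers[$name] ?? '';
    }
}

$response = new ToyResponse();

// Wrong: the new instance is discarded, so $response is unchanged.
$response->withHeader('X-Robots-Tag', 'noai, noimageai');
$lost = $response->getHeaderLine('X-Robots-Tag');

// Right: capture the returned instance.
$response = $response->withHeader('X-Robots-Tag', 'noai, noimageai');
$kept = $response->getHeaderLine('X-Robots-Tag');

var_dump($lost); // empty string
var_dump($kept); // "noai, noimageai"
```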
<?php
// src/App/Middleware/AiBotMiddleware.php (Mezzio, PSR-15)
declare(strict_types=1);

namespace App\Middleware;

// The detector from section 1, placed under the App namespace for Mezzio.
use App\Service\AiBotDetector;
use Psr\Http\Message\ResponseFactoryInterface;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;

class AiBotMiddleware implements MiddlewareInterface
{
    public function __construct(
        private readonly ResponseFactoryInterface $responseFactory
    ) {}

    public function process(
        ServerRequestInterface $request,
        RequestHandlerInterface $handler
    ): ResponseInterface {
        // Path guard: same pattern as Laminas MVC
        if ($request->getUri()->getPath() === '/robots.txt') {
            return $handler->handle($request);
        }

        // PSR-7: getHeaderLine() returns "" when absent, so no object check is needed.
        $ua = $request->getHeaderLine('user-agent');
        if (AiBotDetector::isAiBot($ua)) {
            // Returning here without calling $handler->handle() skips all
            // downstream middleware. withHeader() returns a new instance each
            // time, so the calls are chained and the final result is returned.
            return $this->responseFactory->createResponse(403)
                ->withHeader('X-Robots-Tag', 'noai, noimageai')
                ->withHeader('Content-Type', 'text/plain');
        }

        // Pass through: let downstream middleware and handlers run.
        $response = $handler->handle($request);
        // Inject X-Robots-Tag on all passing responses.
        return $response->withHeader('X-Robots-Tag', 'noai, noimageai');
    }
}
6. Mezzio: factory and pipeline registration
Register the middleware via a PSR-11 container factory and add it to the pipeline in config/pipeline.php. Position it early in the pipeline — before routing — so blocked requests never reach the router or downstream middleware.
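Why position matters can be shown with a toy pipeline. The sketch below models middleware as plain closures, which is a simplification: it does not use PSR-15 interfaces or Mezzio, and every name in it ($botGuard, $router, runPipeline) is hypothetical. When the blocking step sits first, the router step never runs for a bot request.

```php
<?php
declare(strict_types=1);

$log = [];

// Toy "middleware": receives the request and a $next callable, returns a status code.
$botGuard = function (array $request, callable $next) use (&$log): int {
    $log[] = 'botGuard';
    if (stripos($request['user-agent'], 'gptbot') !== false) {
        return 403; // short-circuit: $next is never called
    }
    return $next($request);
};

$router = function (array $request, callable $next) use (&$log): int {
    $log[] = 'router';
    return 200;
};

// Runs the closures in order; falls back to 404 when nothing handles the request.
function runPipeline(array $middleware, array $request): int
{
    $next = function (array $request) use (&$next, &$middleware): int {
        $current = array_shift($middleware);
        return $current === null ? 404 : $current($request, $next);
    };
    return $next($request);
}

$blockedStatus = runPipeline([$botGuard, $router], ['user-agent' => 'GPTBot/1.0']);
$blockedLog    = $log;

$log = [];
$allowedStatus = runPipeline([$botGuard, $router], ['user-agent' => 'Mozilla/5.0']);
$allowedLog    = $log;

echo $blockedStatus, ' ', implode(',', $blockedLog), PHP_EOL; // 403 botGuard
echo $allowedStatus, ' ', implode(',', $allowedLog), PHP_EOL; // 200 botGuard,router
```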
<?php
// src/App/Middleware/AiBotMiddlewareFactory.php
declare(strict_types=1);

namespace App\Middleware;

use Psr\Container\ContainerInterface;
use Psr\Http\Message\ResponseFactoryInterface;

class AiBotMiddlewareFactory
{
    public function __invoke(ContainerInterface $container): AiBotMiddleware
    {
        // Assumes a PSR-17 ResponseFactoryInterface service is registered in
        // the container, e.g. Laminas\Diactoros\ResponseFactory.
        return new AiBotMiddleware(
            $container->get(ResponseFactoryInterface::class)
        );
    }
}

<?php
// config/pipeline.php (Mezzio application pipeline)
declare(strict_types=1);

use App\Middleware\AiBotMiddleware;
use Mezzio\Application;
use Mezzio\MiddlewareFactory;
use Psr\Container\ContainerInterface;

return function (Application $app, MiddlewareFactory $factory, ContainerInterface $container): void {
    // Add AiBotMiddleware early in the pipeline, before routing.
    // The earlier a middleware runs, the less work is done for blocked requests.
    $app->pipe(AiBotMiddleware::class);

    // The rest of your existing pipeline (routing, dispatch, and so on)
    // stays here, ending with the not-found handler:
    $app->pipe(\Mezzio\Handler\NotFoundHandler::class);
};

<?php
// config/autoload/dependencies.global.php (register the factory)
return [
    'dependencies' => [
        'factories' => [
            \App\Middleware\AiBotMiddleware::class => \App\Middleware\AiBotMiddlewareFactory::class,
        ],
    ],
];
7. public/robots.txt
Place robots.txt in public/ — served directly by Apache or Nginx as a static file before PHP runs. Neither Laminas MVC event listeners nor Mezzio middleware fire for it when served this way. AI crawlers can always fetch the file and discover they are disallowed.
# public/robots.txt
# Served directly by Apache/Nginx from public/ — PHP never runs for this file.
# Laminas MVC and Mezzio listeners/middleware do not fire for static files
# served at the web server layer.
User-agent: *
Allow: /
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Google-Extended
Disallow: /
8. Nginx static file serving
Configure Nginx to serve robots.txt as a static file and add X-Robots-Tag at the server level. This handles the header for static files that never reach PHP. For dynamic responses, the listener or middleware sets the header.
# nginx.conf (relevant server block)
# Serve robots.txt as a static file; PHP never runs for it.
server {
    listen 80;
    root /var/www/html/public;
    index index.php;

    # robots.txt: static, served before PHP
    location = /robots.txt {
        try_files $uri =404;
        add_header X-Robots-Tag "noai, noimageai";
    }

    # All other requests: try static file, then PHP
    location / {
        try_files $uri $uri/ /index.php$is_args$args;
    }

    location ~ \.php$ {
        fastcgi_pass unix:/var/run/php/php8.2-fpm.sock;
        fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
        include fastcgi_params;
    }
}
Key points
- getHeader() returns an object, not a string (Laminas MVC): $request->getHeader('User-Agent') returns a Laminas\Http\Header\UserAgent object or false. Call ->getFieldValue() to extract the string. This is a common mistake when porting code from other PHP frameworks.
- stopPropagation order matters: In Laminas MVC, call $event->setResponse($response) before $event->stopPropagation(true), then return the response from the listener. Application::run() skips dispatch only when a route listener hands back a response object, so stopping propagation alone is not enough.
- PSR-7 immutability (Mezzio): withHeader(), withStatus(), and other PSR-7 methods return new response objects; they do not modify in place. Always assign the return value: $response = $response->withHeader(...).
- getHeaderLine() vs getHeader() (Mezzio / PSR-7): getHeaderLine() returns a comma-joined string of all header values, or an empty string if absent, so no null check is needed. getHeader() returns an array. For User-Agent, getHeaderLine() is simpler since a request normally carries a single value.
- Static files bypass PHP entirely: Apache and Nginx serve files from public/ without invoking PHP. Laminas MVC event listeners and Mezzio middleware do not fire for these requests. No path guard is needed for a static robots.txt in production, but keep it as a safety net.
- Event priority in Laminas MVC: Attach your bot listener with a negative priority (-1 or lower) so it runs after the built-in route listener has matched the request, but before dispatch. Use EVENT_ROUTE rather than EVENT_DISPATCH to short-circuit before controller instantiation.
Framework comparison — PHP enterprise frameworks
| Framework | Hook / middleware | Block call | UA header access |
|---|---|---|---|
| Laminas MVC | EVENT_ROUTE listener | setResponse() + stopPropagation(true) | getHeader()->getFieldValue() |
| Mezzio (PSR-15) | MiddlewareInterface::process() | return response without calling $handler->handle() | getHeaderLine('user-agent') |
| Laravel | handle() middleware | return response('Forbidden', 403) without calling $next($request) | $request->header('User-Agent', '') |
| Symfony | kernel.request event listener | $event->setResponse($response) | $request->headers->get('User-Agent', '') |
Laminas MVC's event system is flexible but has the most framework-specific API surface: the getHeader()->getFieldValue() chain comes from laminas-http rather than from a PSR standard. Mezzio aligns with PSR-7/PSR-15, making its middleware portable to any compliant framework. For new PHP projects, Mezzio's approach is recommended; for existing Laminas MVC applications, the event listener approach is idiomatic.
Dependencies
# Laminas MVC
composer require laminas/laminas-mvc
composer require laminas/laminas-http
# Mezzio (PSR-15 pipeline)
composer require mezzio/mezzio
composer require mezzio/mezzio-fastroute
composer require laminas/laminas-diactoros # PSR-7 implementation
composer require laminas/laminas-servicemanager # PSR-11 container
# Run Mezzio with built-in PHP server (dev)
php -S 0.0.0.0:8080 -t public/
# Production: use PHP-FPM + Nginx (see config above)
# or Swoole for async Mezzio:
# composer require mezzio/mezzio-swoole
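To check the block end to end against the dev server started with php -S, a small script can send one request with a bot user agent and read back the status. This is a sketch under assumptions: the URL is passed on the command line (for example http://127.0.0.1:8080/), and statusFromHeaders() and fetchStatus() are hypothetical helpers written for this example.

```php
<?php
declare(strict_types=1);

/** Extracts the status code from the last "HTTP/x.y NNN ..." line in a header list. */
function statusFromHeaders(array $headers): int
{
    $status = 0;
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m) === 1) {
            $status = (int) $m[1]; // keep the last one in case of redirects
        }
    }
    return $status;
}

/** Sends a GET with the given user agent; returns 0 if the server is unreachable. */
function fetchStatus(string $url, string $userAgent): int
{
    $context = stream_context_create([
        'http' => [
            'method'        => 'GET',
            'user_agent'    => $userAgent,
            'ignore_errors' => true, // return 4xx/5xx bodies instead of failing
            'timeout'       => 5,
        ],
    ]);
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        return 0;
    }
    // $http_response_header is populated by PHP's HTTP stream wrapper.
    return statusFromHeaders($http_response_header ?? []);
}

// Only runs when a URL is passed on the command line.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    echo fetchStatus($argv[1], 'GPTBot/1.0'), PHP_EOL;  // expect 403
    echo fetchStatus($argv[1], 'Mozilla/5.0'), PHP_EOL; // expect 200 for an existing route
}
```

Saved as, say, smoke.php, it would be run as php smoke.php http://127.0.0.1:8080/ while the dev server is up.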