How to Block AI Bots in PHP Phalcon

Phalcon is a PHP framework implemented as a C extension. Its core classes run as compiled code rather than interpreted PHP, which generally makes it faster than pure-PHP frameworks. Bot blocking uses Phalcon's before() handler (Micro apps) or the PSR-15 MiddlewareInterface (MVC apps). The critical Phalcon-specific detail: a before() handler that returns false short-circuits the request, so route handlers and after() do not run.

1. Bot pattern class

A static utility class keeps patterns reusable across Micro and MVC apps. strtolower() normalises the User-Agent; str_contains() (PHP 8+) does substring matching without regex overhead.

<?php
// src/AiBots.php
declare(strict_types=1);

class AiBots
{
    private const PATTERNS = [
        'gptbot',
        'chatgpt-user',
        'claudebot',
        'anthropic-ai',
        'ccbot',
        'google-extended',
        'cohere-ai',
        'meta-externalagent',
        'bytespider',
        'omgili',
        'diffbot',
        'imagesiftbot',
        'magpie-crawler',
        'amazonbot',
        'dataprovider',
        'netcraft',
    ];

    public static function isAiBot(string $userAgent): bool
    {
        $ua = strtolower($userAgent);
        foreach (self::PATTERNS as $pattern) {
            if (str_contains($ua, $pattern)) {
                return true;
            }
        }
        return false;
    }
}
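The matching rule can be checked in isolation. The sketch below is a minimal stand-in, assuming PHP 8+; `matchesAiPattern()` is a hypothetical helper that mirrors `AiBots::isAiBot()` with a shortened pattern list, and the User-Agent strings are illustrative rather than captured from real traffic.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for AiBots::isAiBot(), trimmed to three patterns
// so the example runs without the class file.
function matchesAiPattern(string $userAgent, array $patterns): bool
{
    // Patterns are stored lowercase, so only the input needs normalising.
    $ua = strtolower($userAgent);
    foreach ($patterns as $pattern) {
        if (str_contains($ua, $pattern)) {
            return true;
        }
    }
    return false;
}

$patterns = ['gptbot', 'claudebot', 'ccbot'];

// Illustrative User-Agent strings (assumed, not taken from logs).
$samples = [
    'Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)',
    'CCBot/2.0 (https://commoncrawl.org/faq/)',
    'Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0',
    '',
];

$results = array_map(
    fn (string $ua): bool => matchesAiPattern($ua, $patterns),
    $samples
);

var_export($results);
```

The first two samples match and the last two do not. An empty User-Agent is never treated as a bot, because no pattern is a substring of the empty string.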

2. Phalcon Micro — before()/after() handlers

For API and micro apps. The before() handler runs before every route. Return true to continue, or send a response and return false to short-circuit. The after() handler adds X-Robots-Tag to all passing responses — it only runs when before() returns true.

<?php
// public/index.php — Phalcon Micro application
declare(strict_types=1);

use Phalcon\Di\FactoryDefault;
use Phalcon\Http\Response;
use Phalcon\Mvc\Micro;

require __DIR__ . '/../vendor/autoload.php';
require __DIR__ . '/../src/AiBots.php';

$di  = new FactoryDefault();
$app = new Micro($di);

// ── before() handler — runs before every route ────────────────────────────
$app->before(function () use ($app): bool {
    $request = $app->request;
    $path    = $request->getURI(true);   // true = strip query string

    // Always allow robots.txt regardless of User-Agent
    if ($path === '/robots.txt') {
        return true;
    }

    // isAiBot() lowercases its input, so no normalisation is needed here.
    $ua = $request->getUserAgent();

    if (AiBots::isAiBot($ua)) {
        $response = new Response();
        $response->setStatusCode(403, 'Forbidden');
        $response->setHeader('X-Robots-Tag', 'noai, noimageai');
        $response->setContentType('text/plain; charset=utf-8');
        $response->setContent('Forbidden');
        $response->send();

        // stop() marks the pipeline as halted explicitly, so the block does
        // not rely on how a given Phalcon version treats the return value.
        $app->stop();

        // Returning false stops the Micro pipeline:
        // route handlers and after() do NOT run.
        return false;
    }

    return true;
});

// ── after() handler — inject X-Robots-Tag on passing responses ────────────
$app->after(function () use ($app): void {
    $app->response->setHeader('X-Robots-Tag', 'noai, noimageai');
});

// ── Routes ────────────────────────────────────────────────────────────────
$app->get('/', function () use ($app): Response {
    $app->response->setContentType('text/html');
    $app->response->setContent('<h1>Hello</h1>');
    // Return the response instead of calling send(): Micro sends a returned
    // response after after() has run, so the X-Robots-Tag header is included.
    return $app->response;
});

$app->get('/robots.txt', function () use ($app): Response {
    $app->response->setContentType('text/plain');
    $app->response->setContent(file_get_contents(__DIR__ . '/../public/robots.txt'));
    return $app->response;
});

$app->handle($_SERVER['REQUEST_URI']);
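The decision made inside before() can also be expressed as a pure function, which makes it testable without booting Phalcon. This is a sketch under stated assumptions: `shouldBlock()` and the `$isBot` callable are hypothetical names, and the query string is stripped by hand to mimic what `getURI(true)` does in the listing above.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: true when the request should receive a 403.
// $isBot stands in for AiBots::isAiBot() so the example is self-contained.
function shouldBlock(string $requestUri, string $userAgent, callable $isBot): bool
{
    // Compare on the path only, so /robots.txt?v=2 is still exempt.
    $path = explode('?', $requestUri, 2)[0];
    if ($path === '/robots.txt') {
        return false;
    }
    return $isBot($userAgent);
}

// Single-pattern matcher, used here only for illustration.
$isBot = fn (string $ua): bool => str_contains(strtolower($ua), 'gptbot');

$cases = [
    ['/robots.txt',     'GPTBot/1.0'],
    ['/robots.txt?v=2', 'GPTBot/1.0'],
    ['/articles/1',     'GPTBot/1.0'],
    ['/articles/1',     'Mozilla/5.0 Firefox/126.0'],
];

foreach ($cases as [$uri, $ua]) {
    $verdict = shouldBlock($uri, $ua, $isBot) ? '403' : 'pass';
    echo $uri, ' | ', $ua, ' | ', $verdict, PHP_EOL;
}
```

Only the third case is blocked: the bot may always fetch robots.txt, and ordinary browsers pass everywhere.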

3. PSR-15 middleware (full MVC apps)

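For full Phalcon MVC applications, implement the PSR-15 interface Psr\Http\Server\MiddlewareInterface. Return a response directly to block, or call $handler->handle($request) to continue and decorate the result with withHeader(). The 403 response below is built with nyholm/psr7; any PSR-7 implementation works.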

<?php
// src/AiBotBlocker.php — PSR-15 middleware for Phalcon MVC apps
declare(strict_types=1);

use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Server\RequestHandlerInterface;

class AiBotBlocker implements MiddlewareInterface
{
    public function process(
        ServerRequestInterface  $request,
        RequestHandlerInterface $handler,
    ): ResponseInterface {
        $path = $request->getUri()->getPath();

        if ($path !== '/robots.txt') {
            $ua = strtolower($request->getHeaderLine('User-Agent'));

            if (AiBots::isAiBot($ua)) {
                // Return a response without calling $handler->handle()
                return new \Nyholm\Psr7\Response(
                    403,
                    [
                        'X-Robots-Tag' => 'noai, noimageai',
                        'Content-Type' => 'text/plain; charset=utf-8',
                    ],
                    'Forbidden',
                );
            }
        }

        // Pass through and decorate the response
        $response = $handler->handle($request);
        return $response->withHeader('X-Robots-Tag', 'noai, noimageai');
    }
}

4. MVC middleware registration

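<?php
// config/services.php: register the bot check in a full MVC app
use Phalcon\Di\FactoryDefault;
use Phalcon\Events\Event;
use Phalcon\Events\Manager;
use Phalcon\Mvc\Dispatcher;

$di = new FactoryDefault();

// The events manager calls closures directly, or a method named after the
// event (beforeDispatch) on listener objects. AiBotBlocker only exposes
// process(), so it cannot be attached here; this closure runs the same check.
$di->setShared('dispatcher', function () use ($di) {
    $eventsManager = new Manager();
    $eventsManager->attach(
        'dispatch:beforeDispatch',
        function (Event $event, Dispatcher $dispatcher) use ($di): bool {
            $request = $di->getShared('request');
            $exempt  = $request->getURI(true) === '/robots.txt';
            if (!$exempt && AiBots::isAiBot($request->getUserAgent())) {
                $di->getShared('response')->setStatusCode(403, 'Forbidden')
                    ->setHeader('X-Robots-Tag', 'noai, noimageai')
                    ->setContent('Forbidden')->send();
                exit; // blunt but version-independent: no controller or view runs
            }
            return true;
        }
    );
    $dispatcher = new Dispatcher();
    $dispatcher->setEventsManager($eventsManager);
    return $dispatcher;
});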

5. public/robots.txt

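Phalcon does not special-case static files: every request the web server forwards to index.php goes through before(). If your rewrite rules serve existing files directly, public/robots.txt never reaches PHP at all. Otherwise, the $path === '/robots.txt' guard ensures AI crawlers can fetch the file and learn they are disallowed.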

# public/robots.txt
User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
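The file above lists four crawlers by hand. If you prefer to generate it from a list of crawler tokens, something like the following could work. It is only a sketch: `buildRobotsTxt()` is a hypothetical helper, and the token list is copied from the file above rather than from any authoritative registry.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: emits an "allow everything" default group followed
// by one "User-agent / Disallow" group per crawler token.
function buildRobotsTxt(array $agents): string
{
    $groups = ["User-agent: *\nAllow: /"];
    foreach ($agents as $agent) {
        $groups[] = "User-agent: {$agent}\nDisallow: /";
    }
    // Groups are separated by a blank line; the file ends with a newline.
    return implode("\n\n", $groups) . "\n";
}

$robots = buildRobotsTxt(['GPTBot', 'ClaudeBot', 'CCBot', 'Google-Extended']);

echo $robots;
```

Writing the result to public/robots.txt at deploy time keeps the file static, so the web server or the /robots.txt route can serve it unchanged.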

Key points

Framework comparison — PHP ecosystem

| Framework | Middleware hook | Short-circuit | UA header |
|---|---|---|---|
| Phalcon Micro | $app->before() | $response->send(); return false | $request->getHeader() |
| Laravel | handle(Request, Closure) | return response('Forbidden', 403) | $request->header() |
| Slim 4 | PSR-15 process() | return new Response(403) | getHeaderLine() |
| Symfony | kernel.request event | $event->setResponse(new Response('Forbidden', 403)) | $request->headers->get() |

Phalcon's before() return-value contract differs from the other frameworks in the table: the handler sends the response itself and signals the short-circuit with a boolean, rather than returning a response object as in the PSR-15 handler pattern. The PSR-15 MiddlewareInterface variant is portable to Slim, Laminas, or any other PSR-15 compatible framework, provided the AiBots class and a PSR-7 implementation are available.

Dependencies

Phalcon itself is a PHP extension; install it via PECL or the official package. The before() approach needs no additional Composer packages. The PSR-15 variant requires the PSR-15 interfaces and a PSR-7 implementation:

# Install Phalcon extension (Ubuntu/Debian)
curl -s https://packagecloud.io/install/repositories/phalcon/stable/script.deb.sh | bash
apt-get install php8.3-phalcon

# For PSR-15 middleware variant only:
composer require psr/http-server-middleware nyholm/psr7