
How to Block AI Bots in Supabase Edge Functions: Complete 2026 Guide

Supabase Edge Functions run on Deno at the network edge using the standard Fetch API. Blocking AI bots is a single header check before any database call — zero Supabase reads for blocked requests. The _shared/ directory lets you define the UA list once and import it across every function in your project.

Deno-native — no npm install needed

Supabase Edge Functions are TypeScript-first. Your bot-blocking helper is a plain .ts file — no build step, no npm package. Import with a relative path: import { isAiBot } from '../_shared/ai-bots.ts'.

Protection layers

1. robots.txt: dedicated robots function or Supabase Storage bucket — served before any UA check
2. noai meta tag: in HTML responses returned by SSR edge functions — crawlers that reach the HTML still see the directive
3. X-Robots-Tag header: added to the headers of every non-403 response — covers JSON APIs and non-HTML responses
4. Hard 403 (static list): isAiBot() check from _shared/ai-bots.ts — blocks before any Supabase DB query
5. Hard 403 (dynamic list): block-list from a Supabase table — update rules without a redeploy; cached per isolate instance

Step 1 — Shared AI bot helper (_shared/ai-bots.ts)

Create supabase/functions/_shared/ai-bots.ts. Supabase bundles the _shared/ directory at deploy time — it is not deployed as its own function. Import it with a relative path from any function.

// supabase/functions/_shared/ai-bots.ts
export const AI_BOTS = [
  // OpenAI
  'gptbot', 'chatgpt-user', 'oai-searchbot',
  // Anthropic
  'claudebot', 'claude-web',
  // Common Crawl / CCBot
  'ccbot',
  // Bytedance
  'bytespider',
  // Meta
  'meta-externalagent',
  // Perplexity
  'perplexitybot',
  // Google AI
  'google-extended', 'googleother',
  // Cohere
  'cohere-ai',
  // Amazon
  'amazonbot',
  // Diffbot
  'diffbot',
  // AI2
  'ai2bot',
  // DeepSeek
  'deepseekbot',
  // Mistral
  'mistralai-user',
  // xAI
  'xai-bot',
  // You.com
  'youbot',
  // DuckDuckGo AI
  'duckassistbot',
  // Webzio
  'webzio',
] as const;

export function isAiBot(userAgent: string | null): boolean {
  if (!userAgent) return false;
  const ua = userAgent.toLowerCase();
  return AI_BOTS.some((bot) => ua.includes(bot));
}
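
A quick sanity check of the matcher. The UA strings below are illustrative, not exact crawler values, which vary by version:

// Hypothetical usage; UA strings are illustrative
isAiBot('Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)'); // true
isAiBot('Mozilla/5.0 (compatible; ClaudeBot/1.0)');                          // true
isAiBot('Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0');           // false
isAiBot(null);                                                               // false (header absent)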

Step 2 — Block AI bots in your edge function

The check runs at the top of Deno.serve() before any database access. A 403 response costs zero Supabase reads.

// supabase/functions/my-api/index.ts
import { isAiBot } from '../_shared/ai-bots.ts';

const ROBOTS_TXT = `User-agent: *
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /`;

Deno.serve(async (req: Request) => {
  // 1. Serve robots.txt (if this function handles the /robots.txt path)
  if (new URL(req.url).pathname.endsWith('/robots.txt')) {
    return new Response(ROBOTS_TXT, {
      headers: { 'Content-Type': 'text/plain; charset=utf-8' },
    });
  }

  // 2. Block AI bots — before any DB call or business logic
  if (isAiBot(req.headers.get('user-agent'))) {
    return new Response('Forbidden', { status: 403 });
  }

  // 3. Handle CORS preflight
  if (req.method === 'OPTIONS') {
    return new Response(null, {
      headers: {
        'Access-Control-Allow-Origin': '*',
        'Access-Control-Allow-Headers': 'authorization, x-client-info, apikey, content-type',
      },
    });
  }

  // 4. Your normal business logic
  const data = { message: 'Hello, architect.' };

  return new Response(JSON.stringify(data), {
    headers: {
      'Content-Type': 'application/json',
      // 5. X-Robots-Tag on all successful responses
      'X-Robots-Tag': 'noai, noimageai',
    },
  });
});

Step 3 — Dedicated robots function

Supabase Edge Functions have no filesystem at runtime — you cannot read a robots.txt file. Embed it as a string constant and serve it from a dedicated robots function. Route GET /robots.txt to this function via your CDN, Vercel rewrite, or Nginx proxy rule.

// supabase/functions/robots/index.ts
// Deploy and route GET /robots.txt → this function via your proxy/CDN

const ROBOTS_TXT = `User-agent: *
Allow: /

# AI training bots — blocked
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: AmazonBot
Disallow: /

User-agent: Diffbot
Disallow: /`;

Deno.serve((_req: Request) => {
  return new Response(ROBOTS_TXT, {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Cache-Control': 'public, max-age=86400',
    },
  });
});

Deploy: supabase functions deploy robots --no-verify-jwt — robots.txt must be publicly readable without a Supabase JWT. Set --no-verify-jwt for this function only.
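
If your site fronts Supabase with Vercel, a rewrite can route /robots.txt to the function. A minimal sketch, assuming a vercel.json at the project root; <project-ref> is a placeholder for your Supabase project ref:

{
  "rewrites": [
    {
      "source": "/robots.txt",
      "destination": "https://<project-ref>.supabase.co/functions/v1/robots"
    }
  ]
}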

Step 4 — Dynamic block-list from a Supabase table

For rule updates without a redeploy, store UA patterns in a blocked_bots table and cache per isolate instance. The module-level cache avoids a DB hit on every request; a TTL forces a re-fetch after 5 minutes.

// supabase/functions/my-api/index.ts — dynamic block-list from Supabase table
import { createClient } from 'npm:@supabase/supabase-js@2';
import { AI_BOTS } from '../_shared/ai-bots.ts'; // fallback

const supabaseAdmin = createClient(
  Deno.env.get('SUPABASE_URL')!,
  Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!,
);

// Module-level cache — persists for the lifetime of this isolate instance
let cachedBotList: string[] | null = null;
let cacheExpiresAt = 0;
const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes

async function getBotList(): Promise<string[]> {
  const now = Date.now();
  if (cachedBotList && now < cacheExpiresAt) return cachedBotList;

  try {
    const { data, error } = await supabaseAdmin
      .from('blocked_bots')
      .select('ua_pattern');

    if (error || !data?.length) {
      // Fall back to the hardcoded list on DB error or empty table
      // (not cached, so the next request retries the query)
      return [...AI_BOTS];
    }

    cachedBotList = data.map((r: { ua_pattern: string }) => r.ua_pattern.toLowerCase());
    cacheExpiresAt = now + CACHE_TTL_MS;
    return cachedBotList;
  } catch {
    return [...AI_BOTS];
  }
}

Deno.serve(async (req: Request) => {
  const ua = req.headers.get('user-agent')?.toLowerCase() ?? '';
  const botList = await getBotList();

  if (botList.some((pattern) => ua.includes(pattern))) {
    return new Response('Forbidden', { status: 403 });
  }

  // ... rest of your handler
  return new Response(JSON.stringify({ ok: true }), {
    headers: { 'Content-Type': 'application/json', 'X-Robots-Tag': 'noai, noimageai' },
  });
});

Create the table:

create table blocked_bots (
  id serial primary key,
  ua_pattern text not null unique,
  created_at timestamptz default now()
);
insert into blocked_bots (ua_pattern) values
  ('gptbot'), ('claudebot'), ('ccbot'), ('bytespider'),
  ('google-extended'), ('perplexitybot'), ('amazonbot');
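
Because the edge function reads the table with the service-role key, which bypasses Row Level Security, you can lock the table down for everyone else. A minimal sketch: enabling RLS with no policies denies anon and authenticated clients while leaving the function unaffected.

-- No policies defined: anon/authenticated clients get nothing; service-role bypasses RLS
alter table blocked_bots enable row level security;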

Step 5 — noai meta tag + X-Robots-Tag for SSR functions

For edge functions that return HTML, add both the meta tag and the response header. Crawlers that bypass your UA check will still see the directive.

// supabase/functions/ssr-page/index.ts — X-Robots-Tag for every HTML response
import { isAiBot } from '../_shared/ai-bots.ts';

Deno.serve((req: Request) => {
  if (isAiBot(req.headers.get('user-agent'))) {
    return new Response('Forbidden', { status: 403 });
  }

  const html = `<!DOCTYPE html>
<html>
  <head>
    <!-- noai meta tag for crawlers that reach this far -->
    <meta name="robots" content="noai, noimageai">
  </head>
  <body>...</body>
</html>`;

  return new Response(html, {
    headers: {
      'Content-Type': 'text/html; charset=utf-8',
      // Belt-and-suspenders: header + meta tag
      'X-Robots-Tag': 'noai, noimageai',
    },
  });
});

Supabase Edge vs Cloudflare Workers vs Vercel Edge vs Lambda@Edge

Feature            | Supabase Edge                            | CF Workers                        | Vercel Edge                     | Lambda@Edge
Runtime            | Deno (TypeScript native)                 | V8 isolate (no Node/Deno)         | V8 isolate (Next.js middleware) | Node.js / Python (heavier)
UA block pattern   | req.headers.get("user-agent")            | request.headers.get("user-agent") | req.headers.get("user-agent")   | event.Records[0].cf.request.headers["user-agent"]
robots.txt         | Dedicated function or Storage bucket     | Workers Assets or explicit route  | public/ directory (static)      | S3 origin or Lambda@Edge route
Path interception  | Proxy/CDN rewrite needed for /robots.txt | Workers Routes — any path         | middleware.ts — any path        | CloudFront behaviours
Shared helpers     | _shared/ directory                       | utils/ + wrangler.toml            | lib/ imported from middleware   | Lambda Layers
Dynamic rules      | Supabase table (first-class)             | Workers KV or D1                  | Upstash Redis or external DB    | DynamoDB or SSM Parameter Store
Cold start latency | ~50–150 ms (Deno isolate)                | <5 ms (V8 isolate)                | <5 ms (V8 isolate)              | ~100–500 ms (Node.js)
Deploy command     | supabase functions deploy                | wrangler deploy                   | vercel deploy (automatic)       | aws lambda update-function-code

Quick reference

UA header: req.headers.get('user-agent')
Hard 403: new Response('Forbidden', { status: 403 })
X-Robots-Tag: headers.set('X-Robots-Tag', 'noai, noimageai')
Shared helper import: import { isAiBot } from '../_shared/ai-bots.ts'
Deploy function: supabase functions deploy <name>
No-JWT deploy: supabase functions deploy robots --no-verify-jwt
Local test: supabase functions serve
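
To verify the 403 locally, a sketch using the CLI defaults (port 54321; --no-verify-jwt so the local gateway does not reject the unauthenticated request before your handler runs):

supabase functions serve --no-verify-jwt
curl -i -H "User-Agent: GPTBot/1.0" http://localhost:54321/functions/v1/my-api
# Expect: HTTP/1.1 403 Forbidden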

FAQ

How do I block AI bots in a Supabase Edge Function?

Check the User-Agent at the start of Deno.serve(): const ua = req.headers.get('user-agent')?.toLowerCase() ?? ''. Match against an AI_BOTS array with AI_BOTS.some(b => ua.includes(b)). Return new Response('Forbidden', { status: 403 }) before any database call. Define AI_BOTS at module scope — initialised once per cold start, not per request.

How do I serve robots.txt from Supabase Edge Functions?

Supabase Edge Functions have no filesystem at runtime. Embed robots.txt as a string constant and serve from a dedicated robots function. Route GET /robots.txt to it via your CDN or reverse proxy. Deploy with --no-verify-jwt so crawlers can access it without authentication. Alternatively, upload robots.txt to a public Supabase Storage bucket and serve from the CDN URL directly.
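
A sketch of the Storage alternative, assuming a public bucket named public-assets (a hypothetical name) and the service-role key for a one-off upload script:

// Hypothetical one-off upload script; bucket name 'public-assets' is an assumption
import { createClient } from 'npm:@supabase/supabase-js@2';

const supabase = createClient(
  Deno.env.get('SUPABASE_URL')!,
  Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!,
);
const ROBOTS_TXT = 'User-agent: GPTBot\nDisallow: /\n';

await supabase.storage
  .from('public-assets')
  .upload('robots.txt', ROBOTS_TXT, { contentType: 'text/plain', upsert: true });
// Served at: https://<project-ref>.supabase.co/storage/v1/object/public/public-assets/robots.txt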

How do I share the AI bot list across multiple edge functions?

Create supabase/functions/_shared/ai-bots.ts with the UA array and an isAiBot() helper. Import it with import { isAiBot } from '../_shared/ai-bots.ts'. Supabase bundles the _shared/ directory at deploy time — it is not a deployable function itself. Keep the shared file lean: only the UA list and the check function, no Supabase client imports.

How is Supabase Edge different from Cloudflare Workers for bot blocking?

Both use V8-based edge runtimes with the Fetch API. Key differences: Supabase Edge runs on Deno (TypeScript native, JSR imports, Deno.serve()). Cloudflare Workers use a V8-only environment with export default { fetch }. Workers can intercept any URL path via Workers Routes; Supabase functions live at /functions/v1/<name> and require a proxy rewrite to intercept /robots.txt. Supabase functions have first-class Postgres access; Workers use KV or D1.
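
For comparison, a minimal sketch of the same UA check as a Cloudflare Worker, inlining a shortened bot list for brevity:

// Cloudflare Worker equivalent: export default { fetch } replaces Deno.serve()
const AI_BOTS = ['gptbot', 'claudebot', 'ccbot', 'bytespider'];

export default {
  async fetch(request: Request): Promise<Response> {
    const ua = (request.headers.get('user-agent') ?? '').toLowerCase();
    if (AI_BOTS.some((bot) => ua.includes(bot))) {
      return new Response('Forbidden', { status: 403 });
    }
    return new Response('ok');
  },
};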

Can I update block rules without redeploying my edge function?

Yes — store UA patterns in a blocked_bots Supabase table and query it with the service-role client at the start of the handler. Cache the result in a module-level variable with a TTL (e.g. 5 minutes) so subsequent requests in the same isolate instance skip the DB query. Cold starts re-fetch. New isolate instances (after a re-deploy or after the instance is evicted) will pick up the latest rules on their first request.

Is your site protected from AI bots?

Run a free scan to check your robots.txt, meta tags, and overall AI readiness score.