
How to Block PerplexityBot: Stop Perplexity AI from Scraping Your Site

PerplexityBot was at the centre of a 2024 crawler controversy after publishers documented it scraping paywalled content. Here's how to block it — and the real tradeoff for publishers who want AI search visibility.

Updated March 2026

The robots.txt Controversy (2024)

In mid-2024, Wired, Forbes, and other publications documented cases where Perplexity appeared to summarise content from paywalled and robots.txt-disallowed pages. Perplexity attributed some incidents to a third-party crawler it was using, not PerplexityBot itself. Perplexity has since said its compliance has improved, but for publishers who want certainty, server-level blocking remains the safest option.

Perplexity Runs Two Separate Agents

Blocking one does not block the other. Full protection requires blocking both:

PerplexityBot

Background indexing crawler. Systematically crawls the web to build Perplexity's knowledge base and search index. Runs continuously, not triggered by users.

perplexity-user

Real-time search crawler. Fetches pages live when a user's query triggers a fresh page read. Similar to ChatGPT-User in that it's request-triggered, not autonomous.

How to Block PerplexityBot in robots.txt

Add both agents to your robots.txt for full coverage:

robots.txt: Block both Perplexity agents
User-agent: PerplexityBot
Disallow: /

User-agent: perplexity-user
Disallow: /
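One way to sanity-check these rules before deploying them is to parse them with Python's standard urllib.robotparser module. The sketch below is illustrative only: the example.com URL, the ROBOTS_TXT constant, and the is_allowed helper are assumptions made for the example. It shows how a compliant parser reads the file, not how Perplexity's crawlers actually behave.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content, copied from the rules above.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: perplexity-user
Disallow: /
"""


def is_allowed(agent_token: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant parser would let agent_token fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    page = "https://example.com/articles/some-story"  # placeholder URL
    for token in ("PerplexityBot", "perplexity-user", "Googlebot"):
        status = "allowed" if is_allowed(token, page) else "blocked"
        print(f"{token}: {status}")
```

Pass the short product token (for example "PerplexityBot") rather than the full user agent header, since this parser only compares the part of the string before the first slash.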

For server-level blocking, which is enforced by your server rather than left to the crawler's cooperation, add this to the server block in your nginx config:

nginx.conf: Server-level block
if ($http_user_agent ~* "PerplexityBot|perplexity-user") {
    return 403;
}
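
To see which requests a rule like this would catch, the same pattern can be tried against sample user agent strings. This is a rough Python approximation, not nginx itself: the would_block helper is hypothetical, and re.IGNORECASE stands in for nginx's case-insensitive "~*" operator.

```python
import re

# Same pattern as the nginx rule; re.IGNORECASE mirrors the "~*" operator.
BLOCKED_AGENTS = re.compile(r"PerplexityBot|perplexity-user", re.IGNORECASE)


def would_block(user_agent: str) -> bool:
    """Return True if the pattern matches anywhere in the User-Agent header."""
    # The nginx match is unanchored, so search() is the closest equivalent.
    return BLOCKED_AGENTS.search(user_agent) is not None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
        "PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)",
        "perplexity-user",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36",
    ]
    for ua in samples:
        print("403" if would_block(ua) else "pass", "-", ua)
```

The rule only sees the User-Agent header, so it catches these agents only when they identify themselves with one of these tokens.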

Cloudflare WAF option

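In Cloudflare: Security → WAF → Custom Rules → Create Rule. Match expression: (lower(http.user_agent) contains "perplexitybot") or (lower(http.user_agent) contains "perplexity-user"). The lower() call is there because contains is case-sensitive. Action: Block. This fires before the request reaches your origin.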

The Visibility Tradeoff

Unlike traditional search engines, Perplexity often answers questions directly — summarising your content rather than linking to it. This creates an unusual tradeoff:

Why block
  • Protects paywalled content from AI summarisation
  • Prevents traffic cannibalisation (Perplexity answers the question, so users don't click through)
  • Addresses compliance concerns around data use
  • Past robots.txt violations have lowered trust
Why allow
  • Citation links drive referral traffic to your site
  • Presence in AI search results grows your brand reach
  • Perplexity's Publisher Program offers revenue share
  • Blocking may hurt discoverability if AI search becomes a primary discovery channel

Publisher Program: Perplexity offers an opt-in publisher arrangement where verified publishers get attributed citations and a share of Pro subscription revenue. This is an alternative to blocking: you allow crawling in exchange for traffic and compensation.

Frequently Asked Questions

Does Perplexity now reliably respect robots.txt?

Perplexity states that PerplexityBot respects robots.txt Disallow directives. The 2024 incidents were primarily attributed to a third-party crawler it was using, not PerplexityBot itself, and Perplexity says compliance has improved since then. That said, if you want certainty, server-level blocking (nginx, Cloudflare) is more reliable than robots.txt alone.

What user agent strings does PerplexityBot use?

The primary user agent is: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot). The real-time agent is perplexity-user. In robots.txt, use "PerplexityBot" and "perplexity-user" as the tokens.

Will blocking PerplexityBot remove my site from Perplexity search entirely?

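Largely, yes. Blocking PerplexityBot means Perplexity cannot crawl your pages for its index, so they should stop appearing in Perplexity's answers and search results. Existing indexed content may remain briefly but will eventually expire as the crawl cycle refreshes. Because perplexity-user is a separate agent, it can still fetch a page when a user's query triggers a live read unless you block that agent too.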

Can I block PerplexityBot for paywalled content only?

Yes. Use path-specific Disallow directives under the PerplexityBot group, such as Disallow: /premium/ or Disallow: /members/. This lets Perplexity index your free pages while blocking access to subscriber-only content.
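
As a minimal sketch of how such path-specific rules are read by a compliant parser, the example below uses Python's standard urllib.robotparser module. The /premium/ and /members/ paths come from the answer above; the example.com URLs, the PAYWALL_RULES constant, and the can_crawl helper are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: only the subscriber-only paths are disallowed.
PAYWALL_RULES = """\
User-agent: PerplexityBot
Disallow: /premium/
Disallow: /members/
"""


def can_crawl(url: str, agent_token: str = "PerplexityBot") -> bool:
    """Return True if the rules above permit agent_token to fetch url."""
    parser = RobotFileParser()
    parser.parse(PAYWALL_RULES.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    for path in ("/blog/free-post", "/premium/report", "/members/profile"):
        url = "https://example.com" + path  # placeholder domain
        print(path, "allowed" if can_crawl(url) else "blocked")
```

Matching is by path prefix, so a rule such as Disallow: /premium/ covers everything beneath that directory but not a sibling path like /premium-preview.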
