About
Diffbot is an AI-powered web scraper that uses computer vision and NLP to automatically extract structured data from web pages. It powers Diffbot's knowledge graph and data extraction APIs.
Purpose
Structured data extraction and knowledge graph building
User Agent String
Mozilla/5.0 (compatible; Diffbot/0.1; +https://www.diffbot.com)
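To spot this crawler in server logs or request handlers, one option is a case-insensitive check for the "Diffbot" token in the User-Agent header. The sketch below assumes the token always appears in the header, which may not hold for every request. The function name is_diffbot is illustrative. User-Agent headers can also be spoofed, so treat a match as a hint rather than proof.

```python
def is_diffbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the Diffbot token.

    Assumption: the crawler identifies itself with "Diffbot" somewhere
    in the header, as in the string listed above.
    """
    # Guard against a missing header (None or empty string).
    return "diffbot" in (user_agent or "").lower()


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; Diffbot/0.1; +https://www.diffbot.com)"
    print(is_diffbot(ua))
    print(is_diffbot("Mozilla/5.0 (X11; Linux x86_64)"))
```

Matching on the token instead of the full string keeps the check working if the version number changes.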
How to Control in robots.txt
🚫 Block Diffbot
User-agent: Diffbot
Disallow: /
✅ Allow Diffbot
User-agent: Diffbot
Allow: /
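Before deploying rules like these, it can help to check how a standards-following parser reads them. The sketch below uses urllib.robotparser from the Python standard library. The helper name can_fetch, the example.com URL, and the Googlebot comparison agent are illustrative assumptions. It only shows how a compliant parser interprets the rules, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Each directive must sit on its own line in robots.txt.
BLOCK_RULES = """\
User-agent: Diffbot
Disallow: /
"""

ALLOW_RULES = """\
User-agent: Diffbot
Allow: /
"""


def can_fetch(rules: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/page"  # placeholder URL
    print(can_fetch(BLOCK_RULES, "Diffbot", url))
    print(can_fetch(BLOCK_RULES, "Googlebot", url))
    print(can_fetch(ALLOW_RULES, "Diffbot", url))
```

Because the rule group names only Diffbot, other user agents are unaffected by it.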