How is this different from Google Analytics?

Google Analytics shows you traffic. Shadow shows you traffic, AI bot activity, what AI platforms say about your brand, AND tells you what to do about all of it. It's analytics + AI intelligence + action steps in one tool.

Do I need to install anything?

For basic monitoring (bot detection, AI perception, readiness score) — nope, just enter your URL. For full visitor analytics (clicks, behavior, sessions), add one script tag. One-click integrations for Vercel, Shopify, WordPress, and more.

Will it slow down my site?

No. The script is under 5KB and loads asynchronously, so it doesn't block rendering or affect page speed and Core Web Vitals. External monitoring has no impact at all, because it watches from the outside.

What AI bots does Shadow detect?

All of them. GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, Amazonbot, and dozens more. The Shadow Network means new bots get identified across all users instantly.
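As a rough illustration of how user-agent based detection can work, the Python sketch below matches a request's User-Agent header against a few of the bot names listed above. It is not Shadow's actual detection logic; the function name, the token list, and the sample header are assumptions made for the example. User-Agent headers can be spoofed, so a match is a hint rather than proof.

```python
from typing import Optional

# Illustrative token list taken from the bots named above (an assumption,
# not Shadow's real list). Google-Extended is left out because it is a
# robots.txt control token, not a string sent in request headers.
KNOWN_AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Bytespider", "Amazonbot")


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the first known bot token found in the header, or None."""
    lowered = user_agent.lower()
    for token in KNOWN_AI_BOT_TOKENS:
        if token.lower() in lowered:
            return token
    return None


# Sample header strings, made up for the example.
print(classify_user_agent("Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"))
print(classify_user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/126.0"))
```

Substring matching keeps the sketch short; a production classifier would usually also verify the source IP range published by each bot operator.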

What do you mean by "actionable steps"?

Shadow doesn't just show you graphs. It says things like: "ChatGPT has your pricing wrong — add structured data to /pricing to fix it" or "Your bounce rate on /features is 68% — here's why and what to change." Specific, do-it-today recommendations.

Can Shadow block bots?

Shadow is a telescope, not a shield. It shows you who's visiting and what AI says about you. It generates block rules and robots.txt configs you can apply — but it doesn't intercept traffic.
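To make "generates block rules" concrete, here is a minimal Python sketch that builds a robots.txt snippet from a list of bot names. The function name `build_robots_txt` is hypothetical and the output format is an assumption; Shadow's generated configs may look different.

```python
from typing import Iterable


def build_robots_txt(blocked_bots: Iterable[str]) -> str:
    """Build one 'Disallow: /' group per bot, separated by blank lines."""
    groups = [f"User-agent: {bot}\nDisallow: /" for bot in blocked_bots]
    return "\n\n".join(groups) + "\n"


print(build_robots_txt(["GPTBot", "archive.org_bot"]))
```

The result is plain text you would paste into, or merge with, the robots.txt at your site root. It only has an effect on crawlers that choose to honor robots.txt.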

Is Shadow privacy-friendly?

Yes. Shadow never collects PII. IP addresses are hashed after classification. No cookies are set on your visitors. All Shadow Network data is anonymized. GDPR compliant by design.
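For readers wondering what "hashed after classification" could look like in practice, the Python sketch below replaces an IP address with a keyed SHA-256 digest. This is an assumption about one reasonable approach, not a description of Shadow's implementation; the function name and the secret key are hypothetical. A keyed hash is used because the IPv4 address space is small enough that plain, unkeyed hashes can be reversed by brute force.

```python
import hashlib
import hmac


def hash_ip(ip_address: str, secret_key: bytes) -> str:
    """Return a hex HMAC-SHA256 digest that stands in for the raw IP."""
    return hmac.new(secret_key, ip_address.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in a real system it would come from a secrets store.
key = b"example-secret-key"

# 203.0.113.7 is from a range reserved for documentation.
digest = hash_ip("203.0.113.7", key)
print(len(digest))                            # 64 hex characters
print(digest == hash_ip("203.0.113.7", key))  # same input and key give the same digest
```

Because the digest is stable for a given key, repeat visits can still be grouped without storing the address itself.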

How do I block archive.org_bot?

Add this to your robots.txt file:

User-agent: archive.org_bot
Disallow: /

Does archive.org_bot respect robots.txt?

Yes, archive.org_bot by Internet Archive respects robots.txt directives.

Other · Respects robots.txt

archive.org_bot

by Internet Archive · First seen: 1996-10

About

The Internet Archive's Heritrix-based crawler, responsible for the Wayback Machine's web preservation project. It is not an AI training crawler itself, but archived web data of this kind is widely used as a source for AI training datasets. (Common Crawl, the corpus used by many LLMs, is a separate project that runs its own crawler, CCBot.) The 'ia_archiver' user-agent string is an older alias for the same crawler. Blocking this bot prevents your content from being permanently archived.

Purpose

Web preservation and Wayback Machine archiving

User Agent String

Mozilla/5.0 (compatible; archive.org_bot +http://www.archive.org/details/archive.org_bot)

How to Control in robots.txt

🚫 Block archive.org_bot

User-agent: archive.org_bot
Disallow: /

✅ Allow archive.org_bot

User-agent: archive.org_bot
Allow: /
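Before deploying either rule, you can check how a standards-following parser reads it. The Python sketch below uses the standard library's `urllib.robotparser` on the two snippets above; the helper name `may_fetch` and the example.com URL are made up for the example. It shows what a compliant crawler should conclude, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

BLOCK_RULES = "User-agent: archive.org_bot\nDisallow: /\n"
ALLOW_RULES = "User-agent: archive.org_bot\nAllow: /\n"


def may_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


url = "https://example.com/some/page"
print(may_fetch(BLOCK_RULES, "archive.org_bot", url))  # False
print(may_fetch(ALLOW_RULES, "archive.org_bot", url))  # True
print(may_fetch(BLOCK_RULES, "SomeOtherBot", url))     # True: the rule names only archive.org_bot
```

Pass the short token (`archive.org_bot`) rather than the full User-Agent header, since the parser matches on the product name before the first slash.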
