Cloudflare Accuses Perplexity of Using AI Bots to Crawl Blocked Websites in Secret

AI Startup Accused of Circumventing Website Restrictions
Cloudflare alleges, based on a recent investigation, that AI search company Perplexity has been bypassing crawling restrictions set by website owners. The internet infrastructure provider says it observed systematic attempts to disguise Perplexity's web crawlers once they ran into access barriers.
The Circumvention Tactics
According to Cloudflare's findings, Perplexity's crawlers initially identify themselves with their declared user agents ("PerplexityBot" or "Perplexity-User"). The alleged behavior changes when those crawlers are blocked through:
- robots.txt directives
- Web Application Firewall rules
- Other access restrictions
In those cases, the crawler allegedly disguises itself as a regular Chrome browser on macOS, using:
- Rotating IP addresses outside Perplexity's officially listed ranges
- Changing autonomous system numbers (ASNs)
- Undocumented user agent patterns
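To see why a changed user agent matters, it helps to recall that robots.txt rules are keyed on the name a crawler declares. The following is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the example.com URL, and the browser-style user agent string are illustrative assumptions, not values taken from Cloudflare's report or from Perplexity's documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that opts out of the two declared crawler names
# mentioned in the article while allowing everything else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative browser-style user agent (an assumption, not the string
# Cloudflare says it observed).
GENERIC_BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    for ua in ("PerplexityBot", "Perplexity-User", GENERIC_BROWSER_UA):
        print(f"{ua[:20]!r}: allowed={is_allowed(ua, url)}")
```

In this sketch the two declared crawler names are disallowed, while the browser-style string falls through to the wildcard rule and is allowed. robots.txt is advisory and depends on honest self-identification, which is why the alleged switch to a generic browser identity would sidestep it.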
Scale of Activity
Cloudflare says it observed this behavior across:
- Tens of thousands of domains
- Millions of requests per day
- Various network configurations
Company Responses
Perplexity's official statement contests Cloudflare's characterization. The company:
- Calls the report a "publicity stunt"
- Says it contains "many misunderstandings"
- Suggests it may confuse legitimate user traffic with scraping activity
The startup attributes some detected activity to:
- Actual users making specific requests
- Third-party service BrowserBase
- Occasional technical necessities
Industry Context
This incident follows:
- Previous reports of Perplexity bypassing paywalls
- The company's past attribution of similar activity to third-party crawlers
- Growing industry concerns about AI content scraping
Cloudflare has taken action by:
- Removing Perplexity's verified bot status
- Implementing new blocking measures
- Expanding default AI crawler restrictions
The situation reflects broader tensions among:
- AI companies' data needs
- Publisher rights and protections
- Evolving internet infrastructure responses
Comments (3)
This news really makes me think about the line between innovation and ethics in AI. Perplexity doing this on the sly? If it's true, it's not a good look. It feels a bit like a game of cat and mouse in which startups push against the rules. 😬 Then again, Cloudflare isn't perfect either; they have their own agenda. A little transparency wouldn't hurt anyone! It's this lack of clarity that undermines public trust in the whole industry.
Is Perplexity really scraping blocked websites in secret? 🤔 If it's true, that's quite worrying. Many AI companies promise to be 'ethical', but sometimes their actions seem to contradict their words. I hope there will be more transparency in the industry and that companies respect sites' robots.txt files. This case could set an important precedent.
Is this what they call 'innovation'? First they sell us AI as a magic tool, and then we find out they cheat to steal data. If Perplexity really is evading blocks on purpose, that's a serious ethical and legal problem. What hypocrisy! 🙄 How far will some startups go to win the AI race?












