Cloudflare is accusing AI startup Perplexity of using stealth crawlers to bypass website restrictions and access content in violation of internet crawling norms.
In a recent blog post, Cloudflare alleges that Perplexity deployed undeclared bots designed to slip past traditional defenses and avoid detection. The company says the activity amounts to millions of automated requests per day and has prompted it to update its countermeasures.
Unauthorized access to private domains
Cloudflare says it began investigating Perplexity after website operators reported that they had blocked Perplexity’s declared crawlers but continued to see their content appear in Perplexity’s results. To test the claims, Cloudflare registered new domains that were not publicly discoverable and configured them to deny access to all bots.
Despite those protections, Cloudflare says Perplexity was still able to retrieve and surface specific content from the restricted test sites. The company alleges that Perplexity bypassed both robots.txt directives and web application firewall (WAF) rules in doing so.
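For readers unfamiliar with robots.txt, the following is a minimal sketch of how a compliant crawler would interpret a deny-all policy like the one described for the test domains. It uses Python’s standard-library parser; the helper name `is_allowed` and the example URL are assumptions made for illustration, not anything taken from Cloudflare’s post.

```python
from urllib.robotparser import RobotFileParser

# A deny-all policy: every user agent is asked to stay away from every path.
DENY_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper: it only evaluates the policy text and
    performs no network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A browser-like user agent gets the same answer as a named bot,
# because the wildcard rule applies to everyone.
print(is_allowed(DENY_ALL, "Mozilla/5.0 (Macintosh) Chrome", "https://test.example/page"))
```

Note that robots.txt is a voluntary convention. Nothing in the file technically prevents a fetch, so a crawler that never runs a check like this is not stopped by it.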
Bots disguised as browsers
Cloudflare says the content was accessed using undisclosed bots that didn’t identify themselves as belonging to Perplexity. These crawlers reportedly posed as ordinary browsers by mimicking common user agents such as Chrome on macOS.
The traffic also came from IP addresses outside of Perplexity’s documented range. Cloudflare states the bots rotated through different IPs and even changed Autonomous System Numbers (ASNs) to avoid detection and blocking.
Cloudflare attributes millions of these stealth requests to Perplexity each day, spread across tens of thousands of domains. The company claims it was able to fingerprint the activity using network signals and machine learning.
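The difference between a declared and an undeclared crawler can be illustrated with a simple check of the user agent and source IP against what an operator publishes. This is only a sketch under stated assumptions: the bot token `ExampleBot` and the IP ranges below are placeholders (reserved documentation ranges), not Perplexity’s real values, and this is not Cloudflare’s detection method.

```python
import ipaddress

# Placeholder values for illustration only; not any real operator's data.
DECLARED_AGENT_TOKEN = "ExampleBot"
PUBLISHED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def classify_request(user_agent: str, source_ip: str) -> str:
    """Label a request by comparing it with the operator's published details."""
    ip = ipaddress.ip_address(source_ip)
    in_published_range = any(ip in network for network in PUBLISHED_RANGES)
    declares_bot = DECLARED_AGENT_TOKEN.lower() in user_agent.lower()

    if declares_bot and in_published_range:
        return "declared"
    if declares_bot:
        # Claims to be the bot but comes from an unpublished address.
        return "spoofed-or-misconfigured"
    # No bot token at all: looks like any other browser.
    return "undeclared"


print(classify_request("Mozilla/5.0 (compatible; ExampleBot/1.0)", "192.0.2.10"))
print(classify_request("Mozilla/5.0 (Macintosh) Chrome/120.0", "203.0.113.5"))
```

A check like this can confirm a declared bot, but it cannot attribute an undeclared request to any particular operator, since a request with a browser user agent from an unlisted IP looks like ordinary traffic. That limitation is why Cloudflare says it relied on network signals and machine learning instead.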
Perplexity’s web crawlers
Perplexity states that it uses two bots: one for search indexing and another to fetch content in response to user questions. The company says both operate under declared user agents, stay within published IP ranges, and are not used to train AI models.
These crawlers are documented on Perplexity’s website. Cloudflare’s allegations, however, center on traffic from undeclared sources that fall outside what Perplexity publicly documents.
Concerns about how Perplexity accesses content aren’t new. In 2024, multiple reports claimed the company was scraping websites that had blocked bots, relying on unlisted IPs and external crawling tools. Amazon later confirmed it was reviewing whether this breached the AWS terms of service.
More recently, the BBC sent a legal letter accusing Perplexity of reproducing its content without permission and bypassing robots.txt restrictions it had placed on the company’s declared bots.
Just a sales pitch?
Perplexity disputed Cloudflare’s allegations in an email to TechCrunch. Spokesperson Jesse Dwyer called Cloudflare’s blog post a “sales pitch” and said the screenshots cited showed no content was accessed. He added that the bot named in the report is not operated by the company.