- Cloudflare accuses Perplexity of bypassing robots.txt and masking its crawling with undeclared user agents and IP addresses.
- The company claims to have observed ASN changes and millions of requests daily across tens of thousands of domains.
- Perplexity denies covert practices, questions the methodology, and argues that its AI works differently than a traditional crawler.
- Cloudflare delists Perplexity as a verified bot and enables rules that block this crawling by default.
Cloudflare has raised the alarm by publishing a report in which it accuses AI-powered answer engine Perplexity of continuing to crawl websites despite barriers put in place by their owners. According to the infrastructure provider, the service allegedly ignores robots.txt and bypasses network blocks to access restricted content.
In a landscape where AI devours data to train models and respond in real time, the balance between innovation and respect for the rules of the web ecosystem is growing tense. The controversy rekindles the debate over unauthorized scraping and the technical and ethical limits that those who build products on large amounts of online information should observe.
What Cloudflare is reporting and why it matters

The network security and performance company says it received complaints from customers whose sites continued to receive traffic attributed to Perplexity despite banning it in robots.txt and applying WAF rules to block its declared crawlers. After investigating, Cloudflare claims to have detected a pattern of covert crawling incompatible with the preferences of website owners.
The provider claims to have observed this behavior across tens of thousands of domains, with millions of requests daily, a volume that, in its view, points to systematic rather than incidental practice. As a result, it has removed Perplexity from its list of verified bots and activated heuristics and managed rules that block this crawling by default.
How Perplexity allegedly bypassed the barriers

According to Cloudflare, when its declared crawlers (identified by Perplexity's published user agent strings) were blocked, the system would impersonate a common browser, presenting itself as Chrome on macOS to camouflage its identity and evade detection.
In addition, the requests came from IP ranges not published by Perplexity and rotated frequently, which made filtering difficult. Cloudflare also claims to have seen changes in the originating ASNs (autonomous system numbers) of the requests, another sign of network-level block evasion.
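The verification logic Cloudflare describes can be illustrated with a small server-side sketch: a request whose User-Agent claims to be a declared crawler but which does not originate from that crawler's published IP ranges fails verification. The agent name and IP ranges below are placeholders for illustration, not Perplexity's real published values.

```python
import ipaddress

# Hypothetical published ranges for the crawler (RFC 5737 documentation
# addresses used here as placeholders, not real Perplexity ranges).
PUBLISHED_RANGES = [ipaddress.ip_network("203.0.113.0/24")]

def is_verified_crawler(user_agent: str, client_ip: str) -> bool:
    """A UA that names the crawler but arrives from an unlisted IP is suspect."""
    if "PerplexityBot" not in user_agent:
        return False  # not claiming to be the declared crawler at all
    ip = ipaddress.ip_address(client_ip)
    return any(ip in net for net in PUBLISHED_RANGES)

# Declared crawler arriving from a published range: passes verification.
print(is_verified_crawler("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "203.0.113.7"))
# Same UA arriving from an unlisted address: fails verification.
print(is_verified_crawler("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "198.51.100.9"))
```

This is only the IP-matching half of the problem; the rotating, unpublished ranges Cloudflare reports are precisely what defeats a static list like this, which is why it pairs such checks with behavioral signals.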
The research notes that the observed behavior does not match the pattern of good crawlers described in RFC 9309 and in Cloudflare's "verified bots" policy: a transparent identity (user agent, IPs, and contact information), moderated traffic, a clear purpose, and respect for robots.txt and the limits set by site owners.
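The robots.txt side of that well-behaved pattern can be sketched with Python's standard-library parser: a compliant crawler fetches and parses robots.txt, then checks permission for its declared user agent before requesting any URL. The agent name and policy below are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: the site bans one named crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_fetch(user_agent: str, url: str) -> bool:
    """Return True only if robots.txt permits this agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)

print(may_fetch("PerplexityBot", "https://example.com/article"))  # False: banned
print(may_fetch("SomeOtherBot", "https://example.com/article"))   # True: allowed
```

The dispute is precisely about what happens after `may_fetch` returns `False`: per RFC 9309, a compliant crawler stops there rather than retrying under a different identity.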
Cloudflare says it has been able to fingerprint this traffic through a combination of network signals and machine learning, adding signatures to its managed rules that identify and block the activity, even for customers on the free plan.
Testing with decoy domains and results
To confirm its suspicions, the team created brand-new, unpublished domains (neither indexed nor publicly linked) and applied a total ban in robots.txt, along with specific rules blocking Perplexity's bots. After querying the AI about those sites, Cloudflare claims it received answers with details about the hosted content, something that, if accurate, would indicate access despite the barriers.
When the block was effective, Cloudflare observed that Perplexity's AI resorted to alternative sources to construct a response, one that was less precise and lacked the particulars of the original material, a sign that the restriction had worked.
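The decoy methodology described above can be reduced to a toy check: plant unique marker strings on the unpublished page, then test whether an AI answer reproduces content it could only have obtained by crawling it. Everything here, including the marker tokens, is hypothetical.

```python
# Unique tokens planted on the decoy page; they appear nowhere else on the web,
# so their presence in an answer implies the page was actually fetched.
DECOY_MARKERS = {"zx-plum-4417", "qt-harbor-9920"}

def answer_leaks_decoy(answer_text: str) -> bool:
    """True if the answer contains content only obtainable by crawling the decoy."""
    return any(marker in answer_text for marker in DECOY_MARKERS)

print(answer_leaks_decoy("The page mentions zx-plum-4417 in its header."))  # True
print(answer_leaks_decoy("No relevant information could be found."))        # False
```

The second case corresponds to Cloudflare's observation of an effective block: the answer falls back to other sources and loses the decoy's particulars.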
Perplexity's official response

Perplexity, for its part, rejects the accusations of covert crawling and claims that Cloudflare has misinterpreted part of the activity analyzed. Company spokespeople have described the report as a "commercial piece" and claim that some of the evidence does not prove real accesses, or even corresponds to third-party bots.
The startup has also shared its stance in posts on X, where it questions the ability of detection systems to differentiate between legitimate AI assistants, third-party crawlers, and malicious traffic. It further argues that an agent fetching timely information to answer a specific query does not behave the same way as a traditional crawler that scrapes the web en masse.
Measures, good practices and the role of other actors
As part of its response, Cloudflare has delisted Perplexity from its registry of verified bots and added rules blocking its alleged covert crawling. The company recommends that administrators activate anti-bot policies, apply challenges when a full block is not desired, and use specific managed rules against AI scraping.
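For administrators who prefer a simple declared-agent block on top of the managed rules, a custom WAF rule along these lines is one starting point. This is a minimal sketch in Cloudflare's rules-expression syntax; the agent names are Perplexity's documented crawler identifiers, but a string match like this only catches traffic that identifies itself, which is exactly what the report says the covert traffic did not do.

```
(http.user_agent contains "PerplexityBot") or (http.user_agent contains "Perplexity-User")
```

Paired with action "Block" (or a managed challenge when a full block is not desired), this handles the declared crawlers, while the behavioral signatures Cloudflare describes are needed for the disguised ones.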
In its argument, Cloudflare contrasts the case with examples of compliance with best practices, citing actors that respect robots.txt, document their agents, and adopt emerging standards such as Web Bot Auth. In comparative tests, it claims that other bots stopped when they encountered a robots.txt ban or a network block, without camouflaged retries.
A conflict that marks the course of the ecosystem

The provider anticipates constant evolution in both bot operators' tactics and the defenses used to contain them. In parallel, it is working with experts and organizations such as the IETF to promote robots.txt extensions and measurable principles that well-intentioned crawlers should adhere to.
Beyond this specific standoff, the case puts on the table the crisis of trust between content creators, platforms, and AI companies: who can access what, under what conditions, and how to make it transparent without breaking business models or slowing innovation. Everything suggests this conversation will remain open as AI agents gain prominence and the web adjusts its rules of coexistence.
This episode leaves a clear message: AI crawling is under scrutiny, with Cloudflare denouncing camouflage tactics attributed to Perplexity and the startup firmly denying them. In the middle, site owners now have new tools to control access and a set of best practices, still under construction, that will shape the playing field in the coming months.