Breaking Internet Rules with AI: Perplexity

2024-06-21

Perplexity wants to change the way we use the internet, but the AI search startup, which is backed by Jeff Bezos, may be breaking the rules to get there. According to a report published this week by developer Robb Knight and confirmed by Wired magazine, the company appears to be ignoring a widely accepted web standard, the Robots Exclusion Protocol, in order to crawl parts of websites that site operators do not want robots to access.


Perplexity's service summarizes articles from the web, claiming to provide "reliable answers" without the need to click on different links, as described in a blog post. To do this, according to Wired's and Knight's findings, Perplexity ignores robots.txt files, the instructions that site operators deliberately write to block web crawlers. Both reports found that Perplexity uses undisclosed IP addresses to bypass these robots.txt files and crawl websites regardless. Wired says its own website blocked Perplexity's web crawler earlier this year, yet the AI search engine can still summarize its articles in detail.
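For readers unfamiliar with the Robots Exclusion Protocol, here is a minimal sketch of the check a compliant crawler is expected to run before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt contents, the `PerplexityBot` and `SomeOtherBot` user-agent strings, the `example.com` URL, and the `may_fetch` helper are all illustrative assumptions, not taken from either report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked from the whole
# site, every other crawler is allowed. The user-agent name is assumed.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    print(may_fetch(ROBOTS_TXT, "PerplexityBot", url))
    print(may_fetch(ROBOTS_TXT, "SomeOtherBot", url))
```

The protocol is voluntary: nothing in robots.txt technically stops a request, so the file only works if the crawler runs a check like this and honors the result. A crawler that identifies itself differently, or that connects from addresses the site has not associated with it, is not stopped by these rules.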


Nevertheless, Perplexity's website documentation claims that it respects the Robots Exclusion Protocol. Perplexity's CEO, Aravind Srinivas, said that those behind the reports have a "profound and fundamental misunderstanding" of how Perplexity and the internet work, but he did not directly refute the findings.


In addition, Perplexity is facing legal threats over another kind of rule: copyright. This week Forbes threatened legal action against Perplexity, accusing the AI startup of plagiarizing its reporting without proper attribution. Forbes had originally reported on former Google CEO Eric Schmidt's AI drone project, and Perplexity used Forbes' text and images to create AI-generated articles, podcasts, and videos. Forbes' executive editor publicly criticized Perplexity earlier this month.


Useful as it is, Perplexity diverts traffic away from the websites it draws on. Google also indexes web pages and provides short AI summaries, but it still sends traffic directly to the source of the information. Perplexity, by contrast, writes detailed AI articles that let users get the information without clicking through to a website, which undermines the business model of digital media.