Encyclopaedia Britannica and Merriam-Webster have dragged Perplexity AI into court, accusing the startup of copying their work without permission and misusing their brands. They filed a lawsuit on September 11, 2025, in the U.S. District Court for the Southern District of New York.
What the lawsuit claims
- The complaint alleges that Perplexity’s “answer engine” scrapes content from Britannica and Merriam-Webster, copies and summarizes it, then serves that material to users without proper authorization. That “free riding,” as the plaintiffs call it, is said to steal web traffic and undercut their revenue.
- It also accuses Perplexity of violating trademark law by attributing false or misleading content, the AI's so-called "hallucinations," to Britannica or Merriam-Webster. As a result, users might believe the publishers produced or endorsed content they never touched.
- The complaint lays out a three-stage allegation of infringement: (1) when Perplexity's crawlers collect the material (the "curation stage"), (2) when that material is used as raw input to its answer model (the "input stage"), and (3) when Perplexity outputs responses that are verbatim, near-verbatim, or close paraphrases of Britannica/Merriam-Webster content. The toy pipeline sketch below shows where each stage would sit.
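To make the three stages concrete, here is a deliberately toy sketch of a generic scrape-retrieve-summarize pipeline of the kind the complaint describes. Nothing here reflects Perplexity's actual architecture, which is not public; the function names, the keyword retrieval, and the truncation "summary" are all illustrative assumptions.

```python
# Toy sketch of a generic scrape-retrieve-summarize pipeline, mapped to
# the complaint's three alleged stages. Illustrative only: Perplexity's
# real architecture is not public, and every name here is hypothetical.

def curation_stage(pages: dict[str, str]) -> dict[str, str]:
    """Stage 1 (alleged copy #1): crawlers fetch and store page text."""
    # A real system would crawl over HTTP; pages are passed in directly
    # here so the sketch stays self-contained and runnable.
    return dict(pages)

def input_stage(corpus: dict[str, str], query: str) -> str:
    """Stage 2 (alleged copy #2): stored copies are fed to the model."""
    # Naive keyword matching stands in for a real retrieval index.
    hits = [text for text in corpus.values() if query.lower() in text.lower()]
    return "\n\n".join(hits)

def output_stage(context: str) -> str:
    """Stage 3 (alleged copy #3): the answer may track sources closely."""
    # Truncation stands in for generation; the legal point is that the
    # output can end up verbatim or near-verbatim to the stored source.
    return context[:200]

if __name__ == "__main__":
    corpus = curation_stage({
        "https://www.merriam-webster.com/dictionary/example":
            "example: one that serves as a pattern to be imitated...",
    })
    print(output_stage(input_stage(corpus, "example")))
```

The mapping matters legally because a copy is allegedly made when the text is stored, again when it is fed to the model, and a third time when the output hews too closely to the source.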
Why it’s a big deal
This isn’t just about copyright. It touches on trust, brand reputation, and where the line gets drawn with generative AI. For decades, Britannica and Merriam-Webster have built credibility through careful editorial standards.
If an AI tool uses their names to lend authority to content they didn’t create, that could mislead users.
Revenue models are at stake too. When users get answers directly from Perplexity (or similar tools), they may never click through to the source site. That means fewer ad views, fewer subscriptions, and less income for publishers.
I've dug a little deeper; here are some things the lawsuit implies or raises that weren't spelled out:
- Robots.txt and crawl-permission issues: The complaint alleges that Perplexity may use crawlers that ignore or evade the rules websites set to prevent automated scraping. That gives the dispute a technical dimension: not only what content is being used, but how it is being collected (a sketch of the standard robots.txt check appears after this list).
- Fair use vs. monetization tension: Perplexity may argue that summarizing publicly visible content is fair use, especially when the output is transformative. But fair use analysis weighs market harm, and when summaries replace clicks and revenue, courts may see the damage differently. This case could set an important precedent for how generative AI can use web content without destroying the underlying business models.
- Risks for Perplexity if the court rules against it: Beyond damages, the company could be forced to redesign parts of its answer engine, changing how it cites sources, how it limits "hallucinated" content, and how it attributes material. A loss could also push it into more licensing agreements with publishers or stricter content-usage policies.
- What this means for users: If results have to carry better attribution and disclaimers, or avoid invoking brands like Britannica unless the sourcing is accurate, that could change how "answer engines" work day to day. Trust might suffer if people keep seeing errors attributed to reputable brands, and legal risk could add latency or shrink coverage if engines start excluding sources they cannot safely cite.
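On the robots.txt point above: the protocol is purely advisory, which is exactly what gives the evasion allegation its bite. A minimal sketch of the standard compliance check, using only Python's standard library, might look like the following; the target URL and user-agent strings are illustrative assumptions, and Perplexity's actual crawler behavior is the very thing in dispute.

```python
# Minimal sketch of the voluntary robots.txt check that well-behaved
# crawlers perform before fetching a page. Standard library only; the
# target URL and user-agent strings below are illustrative assumptions.
from urllib import robotparser
from urllib.parse import urlsplit

def may_fetch(url: str, user_agent: str) -> bool:
    """Return True if the site's robots.txt permits this agent to fetch url."""
    parts = urlsplit(url)
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()  # downloads and parses the site's robots.txt
    # Honoring this answer is voluntary: nothing technically stops a
    # crawler from fetching anyway, which is the crux of the allegation.
    return rp.can_fetch(user_agent, url)

if __name__ == "__main__":
    url = "https://www.britannica.com/topic/encyclopaedia"
    for agent in ("Googlebot", "PerplexityBot"):
        print(f"{agent} allowed: {may_fetch(url, agent)}")
```

The complaint's claim is not that such a check is hard to write, but that it is allegedly being skipped or dodged, for example by disguising the crawler's identity.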
My take
Feels like a showdown we’ve been heading toward. For too long, many AI companies have assumed that scraping what’s visible online, summarizing it, and attributing loosely (or sometimes incorrectly) is “good enough.”
But this case shines a spotlight on how that approach undermines the economics of producing high-quality, verified content.
I side with the publishers here: credibility and accuracy matter. If AI systems piggyback on trusted sources to gain legitimacy, then fabricate or misattribute content, that cheapens both the information ecosystem and users' trust in what they read.
Perplexity (and similar tools) may need to build more ethical, transparent systems, and sooner rather than later.