Plagiarism Predicament: AI Startup Perplexity Under Fire for Allegedly Mimicking Content from Forbes, CNBC, and Others Without Acknowledgment

  • Tuesday, 11 June 2024 18:00

AI startup Perplexity has come under intense scrutiny for allegedly appropriating content from reputable news outlets such as CNBC and Forbes without proper credit or acknowledgment, as detailed in a damning Forbes report. The controversy centers on a feature dubbed "Perplexity Pages," a platform showcasing articles purportedly "curated" by the company from various third-party news outlets. These articles, however, closely mirror the originals' wording while failing to credit the source outlets by name within the curated content.

According to Forbes, instead of proper attribution, Perplexity opted for inconspicuous logos linking back to the original stories. One egregious example highlighted by Forbes involved Perplexity's chatbot regurgitating a version of a Forbes report on ex-Google CEO Eric Schmidt's military drone project. The "curated" version lifted substantial passages and even an in-house graphic from the original Forbes story without explicit acknowledgment.

Forbes identified two additional instances where Perplexity Pages allegedly scraped articles without crediting the original sources: an exclusive CNBC report on Elon Musk's strategic chip shipments and a Bloomberg piece on Apple's foray into home robotics. In both cases, Perplexity purportedly utilized near-verbatim passages without naming the sources within the copy.

Despite repeated requests for comment, Forbes, CNBC, and Bloomberg remained silent on the matter. Perplexity AI, which boasts a valuation surpassing $1 billion and backing from prominent investors like Jeff Bezos and Nvidia, has faced mounting criticism. CEO Aravind Srinivas acknowledged the issue but argued that the company's chatbot offers more prominent citations of third-party outlets than competitors like Google Gemini, OpenAI's ChatGPT, and Microsoft's Copilot.

The unfolding saga underscores broader questions about intellectual property, ethical AI use, and the responsibility of tech companies to uphold journalistic standards in content aggregation.

Aravind Srinivas, CEO of Perplexity, attempted to address the controversy by sharing a screenshot of Perplexity's post on Eric Schmidt's AI-powered drones, in which a small hyperlink to the original Forbes article was faintly visible near the top of the page. Srinivas emphasized a distinction between Perplexity Pages and the company's primary product, an AI-powered chatbot. He acknowledged the chatbot's imperfections and asserted that the company is continually improving it based on user feedback.

Srinivas contended that Perplexity's core product consistently provides proper source attribution, contrasting it with competitors like ChatGPT, Gemini, and Copilot, which he claimed lack prominent attribution features. He pledged to improve the Pages and Discover features, conceding that feedback calling for clearer source identification was warranted.

Forbes' John Paczkowski rejected Srinivas' defense, condemning Perplexity's actions as tantamount to plagiarism and calling the inadequate attribution theft rather than mere oversight. When approached for comment, a Perplexity AI spokesperson said the company had adjusted how sources are presented on Pages in response to Forbes' exposé: all sources now appear prominently at the top of each page and in footnotes for each section, with mobile support forthcoming.

The spokesperson reaffirmed Perplexity's commitment to attribution, underscoring their core product's design to cite sources clearly, a feat purportedly lacking in most contemporary chatbots.

This incident further underscores the contentious relationship between AI firms and journalism outlets, with accusations of content exploitation by chatbots without due credit or compensation. Critics warn of the potential detrimental impact on news publishers unless regulatory intervention occurs. Last November, the News Media Alliance cautioned against the proliferation of "plagiarism stew" created by chatbots lifting text, potentially infringing copyright laws.

Google, for its part, faced criticism for introducing auto-generated text summaries, branded "AI Overviews," in search results, which seemingly prioritized its own content while relegating links to other outlets. The move sparked debate and concern among users and experts alike.

Google's AI-powered search also began yielding peculiar responses, such as advising users to eat rocks or add glue to their pizza. Users traced the "pizza glue" suggestion to a humorous Reddit post from roughly a decade ago, from which it had been lifted verbatim. The incident raised questions about the reliability and accuracy of AI-generated content and highlighted the importance of integrity and credibility in information dissemination.

In conclusion, Google's introduction of "AI Overviews" in search results, coupled with the bizarre and inaccurate responses from its AI-powered search, has ignited significant scrutiny and debate. Prioritizing auto-generated content over links to external sources has drawn criticism over fairness, transparency, and the reliability of information delivered to users, and the discovery that some AI-generated responses were lifted from unrelated sources, like a decade-old Reddit post, underscores the need for rigorous quality control and ethical safeguards in AI content generation. As AI continues to reshape how information is distributed, platforms like Google and startups like Perplexity face growing pressure to uphold standards of accuracy, credibility, and respect for original sources in order to maintain user trust.