Privacy Scandal: Tumblr and WordPress Plan to Sell User Data to Train AI Models (404 Media)

On 19 February 2024, a bombshell report revealed that Tumblr and WordPress plan to sell user data, including private posts and messages, to companies like OpenAI and Midjourney to train AI models. Internal documents exposed negotiations to supply this information to an undisclosed AI firm and tech giants like Google, igniting immediate privacy concerns. While neither platform confirmed the deals, the leak sparked outrage, with users accusing the two platforms of trading trust for profit.

The backlash was swift. Critics lambasted the plans for exploiting personal content without consent, turning blogs and creative posts into commodified datasets. Reddit’s similar AI monetization efforts faced scrutiny, but the intimate nature of Tumblr and WordPress content made this case feel uniquely invasive. Speculation swirled around partnerships with AI firms hungry for expansive datasets, though the company in question remains unnamed.

Behind the chaos lies a harsh truth: as platforms hurry to monetize the AI gold rush, the plans by Tumblr and WordPress exemplify how user trust becomes collateral. While companies like Google defend such moves as “innovation fuel,” the lack of transparency leaves users feeling exploited. For millions, their shared words, photos, and ideas, once posted in confidence, are now bargaining chips in a race they never consented to join.

The Report: Tumblr and WordPress Plan to Sell User Data to Train AI Models

The report by 404 Media sent shockwaves through the social media landscape this week. According to the outlet, insider sources shared confidential documents showing that Automattic, the parent company of Tumblr and WordPress, is actively pursuing high-stakes deals with OpenAI and Midjourney. The goal? To use user data (posts, comments, and even draft content) to train text generation models like ChatGPT and image generation models, respectively. While specifics remain unclear, the partnerships allegedly aim to feed the platforms’ archives into AI systems, raising questions about how real examples of human creativity could be repurposed without consent.

Critics argue this move mirrors Reddit’s controversial plans to monetize user contributions, but with higher stakes due to the personal nature of blog content. Automattic hasn’t officially announced the agreements, but confidential documents suggest negotiations are advanced. Users fear their words and art—shared in trust—might become fodder for artificial intelligence models, blurring lines between community spaces and corporate AI labs. As debates over privacy implications intensify, one thing is certain: the scramble to cash in on AI’s boom is turning user data into a currency few agreed to spend.

The Agreements

Automattic, the parent company behind Tumblr and WordPress, is reportedly finalizing agreements with prominent AI companies like OpenAI and Midjourney, though no official announcement or launch dates have been shared. Leaked details suggest these deals involve feeding vast amounts of user data, including posts, drafts, and media, into AI model training. While framed as a positive step for advancing AI capabilities, the move has ignited privacy concerns, particularly among bloggers who rely on these platforms for creative expression. A hastily added setting that lets users opt out of sharing personal content has done little to calm tensions, with critics calling the feature opaque and buried deep in menus.

The lack of transparency around how data will be used—or which prominent AI companies beyond OpenAI and Midjourney might access it—has left users feeling like unwitting participants in a corporate experiment. Even as Automattic insists the partnerships prioritize ethical AI growth, many question whether integrating sensitive content into AI models truly respects user autonomy. For now, the scramble to monetize user data continues, blurring the line between community platforms and AI training grounds.

The Implications

The fallout from the plans to sell Tumblr and WordPress user data to train AI models has forced the platforms’ parent company, Automattic, to publicly address growing concerns about transparency. Critics argue that merely adding opt-out settings is not enough to ensure user consent was properly obtained before the plans advanced, especially as customer data could be transferred to outside parties for commercial AI projects. This incident is not isolated. It is part of a continuing trend in which platforms treat personal content as raw material for profit, a practice that warrants close monitoring by regulators and privacy advocates.

As debates rage, questions linger: Who truly owns the creative work shared on these platforms? Can users trust that their words and images won’t be repurposed without explicit permission? While Automattic claims compliance with data laws, the ethical gray areas around the plans remain vast. For now, privacy advocates urge stricter safeguards, emphasizing that the line between innovation and exploitation must be drawn clearly, before more user trust erodes.
