Reddit Requires Microsoft, Other Tech Companies to Pay for Scraping Site

Reddit CEO Steve Huffman announced that Microsoft and other tech companies must pay if they seek to continue scraping the website's data for training their AI models.

The social media platform has already inked deals with Google and OpenAI, allowing them to legally scrape the site's content for AI training.

Reddit Seeks Mutual Agreements With Tech Companies for Data Scraping

Last June, the company announced that it would update its Robots Exclusion Protocol file, robots.txt, the following month. The file tells bots from other tech companies not to scrape data from the site to train AI models.
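For readers unfamiliar with how a robots.txt file works, the sketch below shows how a well-behaved crawler might check such rules before fetching a page. It uses Python's standard urllib.robotparser module. The rules, the bot names (ExamplePartnerBot, SomeOtherBot), and the example.com URL are all hypothetical and chosen for illustration; they are not Reddit's actual file. Note that robots.txt is a request that crawlers are expected to honor, not a technical barrier in itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's actual robots.txt.
# One named bot is allowed everywhere; every other bot is disallowed everywhere.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExamplePartnerBot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)
page = "https://example.com/r/some_forum/"  # assumed URL for the demo

partner_allowed = may_crawl(parser, "ExamplePartnerBot", page)
other_allowed = may_crawl(parser, "SomeOtherBot", page)

print("ExamplePartnerBot allowed:", partner_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Under these assumed rules, the named partner bot is permitted and any other bot is refused. A crawler that ignores the file can still request pages, so a site that wants to enforce the rules has to block such traffic by other means.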

Huffman is keen on securing agreements with companies before allowing them to continue scraping content from Reddit. He disclosed that Microsoft, Anthropic, and Perplexity have been refusing to negotiate.

"Without these agreements, we don't have any say or knowledge of how our data is displayed and what it's used for, which has put us in a position now of blocking folks who haven't been willing to come to terms with how we'd like our data to be used or not used," he said in an interview.

Reddit Actively Blocks Data Scrapers Without Agreements

Last week, users of search engines other than Google noticed fewer recent Reddit discussions in their results. The blocking is part of the company's strategy to prohibit crawling of the website without its consent.

Search engines like Bing, Mojeek, and DuckDuckGo no longer show results from the site's recent discussions. Other search engines besides Google have also started to return limited Reddit results.

The company is actively blocking commercial entities from scraping Reddit. Its Robots Exclusion Protocol rules are meant to stop others from using the site's content for whatever purpose they choose.

Reddit sealed an agreement with Google worth $60 million a year, allowing the company to use its content.