Microsoft claims Internet content is "free software" for AI training.

2024-07-02

Microsoft's AI CEO, Mustafa Suleyman, recently stated that all internet content can be used for free in AI model training. This statement has sparked strong opposition. In an interview with CNBC, Suleyman downplayed concerns about AI companies using intellectual property, stating that the practice of using freely accessible online content has been established for many years.

He commented, "I believe that for those contents that already exist on the open web, the social contract for these contents has been fair use since the 1990s. Anyone can copy, recreate, or reproduce these contents." "If you will, this is 'free software,' this is the understanding people have had all along."

This tech leader added that unless publishers or news organizations explicitly request not to "scrape or crawl" their content (excluding indexing), AI companies can use this content to train AI models.

He said, "There is a separate category where websites, publishers, or news organizations explicitly state, 'Do not scrape or crawl my content for any reason other than indexing so that others can find this content.' This is a gray area, and I believe it will be resolved in court."

Suleyman's comments on the unclear legal boundaries of AI model training have been reflected in recent legal actions. Following his remarks, the Center for Investigative Reporting filed a lawsuit against OpenAI and its major investor, Microsoft, accusing them of unauthorized use of the nonprofit organization's content without permission or compensation.

According to the lawsuit, AI detection tool Copyleaks found that nearly 60% of the responses provided by ChatGPT-3.5 contained some form of plagiarized content, with over 45% of the responses containing text identical to existing content.

This lawsuit aligns with similar legal challenges brought by The New York Times and approximately eight other media organizations.

User Reactions to Microsoft's AI CEO's Comments

Some users have posted their reactions on X, disagreeing with the Microsoft CEO's view that available content is part of a "social contract" and can be freely used for training AI models.

One user said, "It should be free software, but now your company openly reduces all human expressions to 'content' to steal." Another user likened it to a "plagiarism machine."