- Zuckerberg allegedly pressed ahead with AI implementation despite employees’ objections
- Employees reportedly discussed ways to hide how the company acquired its AI training data
- Court filings suggest that Meta tried, unsuccessfully, to mask its AI training activities
Meta faces a lawsuit alleging copyright infringement and unfair competition over the training of its AI model, Llama.
According to court documents released by VX-Underground, Meta allegedly downloaded nearly 82 TB of pirated books from shadow libraries such as Anna’s Archive, Z-Library and LibGen to train its AI systems.
Internal discussions reveal that some employees raised ethical concerns as early as 2022, with one researcher explicitly saying, “I don’t think we should use pirated material,” while another said, “Using pirated material should be outside our ethical threshold.”
Despite these concerns, Meta appears not only to have plowed ahead but also to have taken steps to avoid detection. In April 2023, an employee warned against using corporate IP addresses to access pirated content, while another said “torrenting from a company’s laptop doesn’t feel right,” adding a laughing emoji.
Meta employees also reportedly discussed ways to prevent the company’s infrastructure from being directly linked to the downloads, which raises questions about whether Meta knowingly disregarded copyright.
In January 2023, Meta CEO Mark Zuckerberg reportedly attended a meeting where he pushed for AI implementation at the company despite internal objections.
Meta is not alone in facing legal challenges over AI training. OpenAI has been sued several times for allegedly using copyrighted works without permission, including a case filed by The New York Times in December 2023.
Nvidia is also under legal scrutiny for training its NeMo model on nearly 200,000 books, and a former employee revealed that the company was scraping over 426,000 hours of video daily for AI development.
And in case you missed it, OpenAI recently claimed that DeepSeek illegally obtained data from its models, highlighting the ongoing ethical and legal dilemmas around AI training practices.
Via Tom’s Hardware