Meta has argued that downloading books through torrents for AI training can still qualify as fair use, according to a recent court filing in an ongoing copyright dispute with authors over its Llama models. The statement appears in discovery responses in Kadrey v. Meta, a 2023 lawsuit brought by 13 authors, including Richard Kadrey and Sarah Silverman, who allege the company used pirated copies of their books to train its AI systems without permission.
In June 2025, a US federal judge granted Meta summary judgment and dismissed the claims of copyright infringement through AI training, finding that the authors had not shown sufficient market harm. The case nevertheless continues over the alleged distribution of books obtained through torrent downloads.
Meta Says BitTorrent Uploads Are Fair Use
- BitTorrent uploads were inherent to the download process: Meta argues that any uploading of books to other users happened automatically during torrent downloads and was not a separate or deliberate act. According to the company, uploading is an inherent feature of the BitTorrent protocol, meaning users share pieces of files with others while downloading them.
- Uploading during torrent downloads should still qualify as fair use: Meta claims that any copies made or shared during the torrent process were part of obtaining data used to train its AI models. It argues that because the ultimate purpose—training large language models (LLMs)—is transformative, the copying that occurs during downloading should also fall under fair use.
- Making files “available” on BitTorrent does not prove distribution: The company argues that copyright infringement requires actual dissemination of copies, not merely making them available on a peer-to-peer network. Therefore, it says the temporary availability of books during torrent downloads does not demonstrate that any third party actually obtained copies.
- BitTorrent was the most practical way to obtain the datasets: Meta says it used the protocol because it was a more efficient and reliable method of downloading very large datasets. In particular, it claims datasets hosted on Anna’s Archive were only available in bulk through torrent downloads.
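The first argument above turns on a general property of BitTorrent: a peer can serve pieces it already holds while it is still fetching the rest. The toy simulation below is only a sketch of that behaviour. The `Peer` class, its methods, and the three peer names are hypothetical, there is no networking involved, and nothing here describes Meta's actual setup or any real client.

```python
class Peer:
    """Toy peer holding some pieces of a file (hypothetical model, no networking)."""

    def __init__(self, name, num_pieces, have=()):
        self.name = name
        self.num_pieces = num_pieces
        self.pieces = set(have)  # indices of the pieces this peer holds
        self.uploaded = 0        # number of pieces served to other peers

    def missing(self):
        """Return the indices of pieces this peer still needs."""
        return set(range(self.num_pieces)) - self.pieces

    def request(self, other, index):
        """Ask `other` for one piece; `other` uploads it if it holds it."""
        if index in other.pieces:
            other.uploaded += 1
            self.pieces.add(index)
            return True
        return False


seed = Peer("seed", 4, have=range(4))  # holds the complete file
downloader = Peer("downloader", 4)     # starts with nothing
latecomer = Peer("latecomer", 4)       # joins the swarm afterwards

# The downloader fetches the first two pieces from the seed.
for i in (0, 1):
    downloader.request(seed, i)

# While still incomplete, the downloader serves piece 0 to the latecomer.
latecomer.request(downloader, 0)

print(sorted(downloader.missing()), downloader.uploaded)
```

In this sketch the downloader has uploaded one piece while two of its own are still missing, which is the overlap between downloading and uploading that the filing refers to.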
Other Arguments Made
- Training LLMs transforms books into linguistic data: The company argues it used books to analyse patterns in language rather than to reproduce or distribute the works themselves.
- The training process converts text into numerical representations: Meta says engineers tokenise books into small units and transform them into vectors used to adjust model parameters. Consequently, it argues, the resulting model representations bear no recognisable form of the original works.
- Models generate new text rather than copy books: Meta says the Llama models predict words and produce fresh responses to prompts instead of reproducing training data. Plaintiffs have also said they…
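The tokenisation step mentioned in the list above can be illustrated with a minimal sketch. It assumes a naive whitespace tokeniser and a randomly initialised embedding table, where production systems typically use subword tokenisers and learned embeddings. The function names `tokenise`, `build_vocab`, and `embed` are hypothetical and are not drawn from Meta's code or filing.

```python
import random


def tokenise(text):
    """Split text into lowercase word tokens (a simplification of subword tokenisation)."""
    return text.lower().split()


def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def embed(token_ids, vocab_size, dim=4, seed=0):
    """Look up each token id in a randomly initialised embedding table."""
    rng = random.Random(seed)
    table = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]
    return [table[i] for i in token_ids]


text = "the cat sat on the mat"
tokens = tokenise(text)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
vectors = embed(ids, len(vocab))

print(ids)  # [0, 1, 2, 3, 0, 4]
```

Each token becomes an integer id and then a vector of floats. Repeated words such as "the" map to the same id and therefore the same vector.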