Friday, October 04, 2024

Authors' concern over publishers selling their research to AI developers






Publishers are gradually moving towards deals with tech companies including Microsoft, Google, OpenAI, Apple and Meta. Under these deals, publishers are compensated for content that is used to train large language models (LLMs) or other generative AI models. Academic papers, with their high information content, are valuable for these LLMs. Lucy Lu Wang, who co-created S2ORC, a data set based on 81.1 million academic papers, says "Training models on a large body of scientific information also give them a much better ability to reason about scientific topics."[1] In July, Taylor & Francis made a deal with Microsoft to sell access to its authors' work, and Routledge, Wiley, Sage, Cambridge University Press and Oxford University Press are moving in the same direction.

Publishers probably view these deals as the better alternative, since their data is already being harvested by these firms without any agreement. According to a Sage spokesperson, “We believe that a preferable route is to offer clear licensing routes to our content that protect rights and include payment for the use of content by the LLM that we can pass on to authors and societies.” [2] Informa, the parent company of Taylor & Francis and Routledge, recently revealed that it is set to make $75m from AI deals, and Wiley is set to make $44m from two AI partnerships. The Independent Publishers Guild’s Autumn Conference, one of the biggest events of the UK publishing year, was held on September 17, 2024 in London, UK, and its most discussed topic was AI and licensing.

 
 Possible use of content by AI companies         

In an article published on Leiter Reports, a philosophy blog, titled "Cambridge University Press now asking authors whether they want to license their publications for LLMs", [3] Cambridge University Press (CUP) discusses the possible uses of authors' content by AI developers.

If your work is part of a generative AI licensing agreement, it could be used for: 
 
  • Training and testing the foundational models that are then used to create, for example, personal assistant and chatbot tools or discoverability summaries.
  • As part of banks of authoritative content that are used, on a perpetual basis, to check and verify the accuracy of information provided by AI tools. 

  

 Benefits of this licensing to authors            


  1. Publishers can monetize their archives and content, as AI companies pay to use it to train their LLMs.
  2. It may improve the quality and accuracy of tools that are increasingly going to be used in everyday life.
  3. There may also be opportunities for your content to have greater visibility and impact if it is properly cited and attributed by AI tools.


 Authors' concerns          


In these deals, authors' rights have been ignored, and authors are concerned that their work is being fed to LLMs without their being informed or paid. The Society of Authors (SoA), which has more than 12,000 members, has written a letter to tech companies stating that its members do not consent to the use of their work in the development of artificial intelligence (AI) systems.

The letter from the SoA Policy Team (August 2024) states:

“Our members have instructed us to put you on express notice that they do not authorise or otherwise grant permission for the use of any of their copyright-protected works in relation to, without limitation, the training; development; or operation of AI models (including the generation of Infringing Works), by large language models or other generative AI models, unless they have first specifically agreed licensing arrangements for the use of their work.” It warns that this “continues to cause great harm to creators’ livelihoods and jeopardizes the future of the profession, which in turn threatens our creative industries and our cultural capital”. [4]

The Creative Rights Alliance, which represents over 500,000 creators, sent a similar letter to tech companies in August 2024. [5] A few other cases support the view that tech firms should not use copyright-protected works without permission or compensation, and that these firms should seek licenses and create transparency for rights holders. Authors have also been angered by publishers such as Taylor & Francis, which struck a $10m deal with Microsoft to sell access to authors' research. [6] Sage confirmed that it will pay royalties on any licensing income to authors, editors and societies according to their contracts. [7] Authors who publish with Cambridge University Press (CUP) are more relaxed, as CUP says it is carefully considering how best to license content to generative AI providers and has created a set of principles to guide its decision-making. [8]

These focus on:
  • author attribution
  • the creation of formal licensing arrangements to govern content
  • obtaining permissions from rights holders
  • obtaining fair remuneration for the use of content.

CUP’s "opt-in" approach [9] involves asking all authors and rights holders for their consent to be part of a generative AI licensing agreement before their content is licensed to providers of generative AI technologies.
 

Overall, the evolving landscape of publishing in relation to generative AI presents both opportunities and challenges for authors and publishers alike. With the licensing of academic content between publishers and tech companies such as Microsoft, Google and OpenAI, there is clear potential for monetization and enhanced visibility for authors' works. However, the concerns surrounding authors' rights and compensation cannot be ignored. Many authors express anger at the existing practices, feeling that their contributions are exploited without proper recognition or remuneration.

Organizations such as the Society of Authors and the Creative Rights Alliance are advocating for transparent licensing agreements that respect authors’ rights and ensure fair compensation. Meanwhile, publishers like Cambridge University Press are adopting an "opt-in" approach, prioritizing author consent and establishing principles for ethical licensing.

As the discussions around AI and copyright continue to evolve, it is very important for all stakeholders (authors, publishers, and tech companies) to collaborate in creating a framework that protects the rights and prestige of authors. Finding a balance between the advantages of AI in making research more accessible and the need to respect authors' work is essential for the future of publishing in the age of artificial intelligence.


 References


  1. Has your paper been used to train an AI model? Almost certainly (nature.com)
  2. https://www.thebookseller.com/news/sage-confirms-it-is-in-talks-to-license-content-to-ai-firms
  3. https://leiterreports.typepad.com/blog/2024/05/cambridge-university-press-now-asking-authors-whether-they-want-to-license-their-publications-for-ll.html
  4. The Society of Authors writes to tech companies asserting members’ rights around uses of their works by generative AI - The Society of Authors
  5. https://www.thebookseller.com/news/creators-demand-immediate-change-from-companies-developing-ai-after-unlawful-use-of-content
  6. https://www.thebookseller.com/news/academic-authors-shocked-after-taylor--francis-sells-access-to-their-research-to-microsoft-ai
  7. https://www.thebookseller.com/news/sage-confirms-it-is-in-talks-to-license-content-to-ai-firms
  8. The Bookseller - News - IPG 2024 Autumn Conference dominated by AI and licensing discussions
  9. https://infogram.com/1p9g1kvndzqkrkt7523yd02wk3b3grmm9mw?live&utm_campaign=LLM+Comms&utm_medium=bitly&utm_source=Email
  10. https://www.thebookseller.com/news/sage-confirms-it-is-in-talks-to-license-content-to-ai-firms
  11. Open-access expansion threatens academic publishing industry (insidehighered.com)
  12. https://www.thebookseller.com/news/wiley-set-to-earn-44m-from-ai-rights-deals-confirms-no-opt-out-for-authors
  13. https://www.thebookseller.com/news/society-of-authors-writes-to-ai-firms-demanding-appropriate-remuneration-and-consent-for-authors
  14. https://www.thebookseller.com/news/anthropic-sued-by-us-authors-over-use-of-pirated-books-to-train-ai-chatbot
  15. https://www.thebookseller.com/news/taylor-francis-set-to-make-58m-from-ai-in-2024-as-it-reveals-second-partnership
  16. https://www.nature.com/articles/d41586-024-02599-9


