Saving Digital Libraries and the Internet Archive

A battle over “truth and who has access to it in the digital age.”

In a case that calls to mind David vs. Goliath, the Internet Archive is appealing a decision by a New York district court judge in a suit brought by Hachette Book Group, which accused the Internet Archive of infringing copyright by scanning books and distributing copies online.

The suit, filed in June 2020 during the COVID-19 pandemic and joined by the publishers HarperCollins, John Wiley & Sons, and Penguin Random House, alleged that the Internet Archive’s controlled digital lending (CDL) system was illegal. CDL is a model used by libraries to digitize materials in their collection and make them available for lending.

CDL is the same as traditional library book lending, in that libraries lend the books they own to one user at a time based on how many physical copies they own. Libraries pay for the books, so the publisher and author have been compensated, just as they always have been for library lending, observes Chris Freeland, director of Open Libraries at the Internet Archive.

“Libraries in the United States have never needed permission to lend the books they own, and copyright law does not stand in the way of libraries using technology to serve their communities,” Freeland says.

The non-profit Internet Archive was founded in 1996 and is well known for its Wayback Machine, a collection of over 150 billion Web pages including free books, movies, software, music, and audio items such as live concerts. The Internet Archive uses software to ensure its users are unable to copy or view books after the loan period is up.

However, during the pandemic lockdown, the Internet Archive temporarily implemented a “National Emergency Library” from March to June 2020, which gave many readers the ability to simultaneously borrow the same book.

Proponents of CDL maintain that the practice is legal under U.S. copyright principles of fair use because it operates under digital rights management (DRM), which manages legal access to digital content to ensure that any copyrighted digitized work a library owns is loaned for a limited period of time. It also ensures that a one-to-one ratio of owned copies to borrowers is maintained.
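To make the two constraints concrete, here is a minimal Python sketch of an owned-to-loaned ledger: a title can have no more active loans than the library owns physical copies, and a loan stops granting access once its period ends. The class names, method names, and the 14-day loan period are all assumptions chosen for illustration; this is not the Internet Archive's software or any real DRM system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Title:
    """One title in a hypothetical CDL catalog."""
    owned_copies: int                          # physical copies the library owns
    loans: dict = field(default_factory=dict)  # patron -> loan expiry time


class LendingDesk:
    """Hypothetical ledger enforcing a one-to-one owned-to-loaned ratio."""

    def __init__(self, loan_period=timedelta(days=14)):  # 14 days is an assumption
        self.loan_period = loan_period
        self.titles = {}

    def add_title(self, name, owned_copies):
        self.titles[name] = Title(owned_copies)

    def _drop_expired(self, title, now):
        # Loans past their expiry time free up a copy for the next borrower.
        title.loans = {p: t for p, t in title.loans.items() if t > now}

    def borrow(self, name, patron, now):
        """Return True if the loan is granted, False if no copy is free."""
        title = self.titles[name]
        self._drop_expired(title, now)
        if patron in title.loans:
            return False  # patron already holds an active loan of this title
        if len(title.loans) >= title.owned_copies:
            return False  # every owned copy is already on loan
        title.loans[patron] = now + self.loan_period
        return True

    def can_read(self, name, patron, now):
        """Access is allowed only while the patron's loan is still active."""
        expiry = self.titles[name].loans.get(patron)
        return expiry is not None and now < expiry


desk = LendingDesk()
desk.add_title("Example Book", owned_copies=1)
start = datetime(2020, 1, 1)
first = desk.borrow("Example Book", "alice", start)   # granted
second = desk.borrow("Example Book", "bob", start)    # refused: one copy owned
later = start + timedelta(days=15)
alice_after = desk.can_read("Example Book", "alice", later)  # loan has expired
third = desk.borrow("Example Book", "bob", later)     # granted: copy is free
```

In this sketch, a second simultaneous borrower is refused when only one copy is owned, which is the constraint proponents say makes CDL equivalent to lending a physical book.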

“Fair use is a flexible legal standard that applies not only to the non-exclusive list of activities in the statute, like news reporting or teaching, but also to building search engines, making documentary films, and remixing art,” says Freeland. “Our patrons use our collections for their reading and research in both formal and informal settings.”

However, opponents do not agree with this interpretation and argue that CDL involves copying, not lending. They argue that a library’s purchase of a physical book does not entitle it to produce and lend an ebook version of the book, or to distribute digital copies.

Judge John Koeltl’s decision said that the Internet Archive has created “derivative works by ‘recasting’ the publishers’ print books into ebooks.” Those “derivative works” are the exclusive property of the copyright holder (the publishers), the judge said, and thus, the Internet Archive would have needed permission before lending them out through its National Emergency Library program.

Libraries typically purchase physical copies of books or pay for ebook licenses through aggregators such as OverDrive. Publishers use different profit models for licensing, all of which generate lucrative returns. For example, the suit noted that Penguin generates around $59 million per year from library ebook licenses, and between 2015 and 2020, HarperCollins earned $46.91 million from the American library ebook market.

So what does all of this mean in terms of the potential impact on copyright and information sales and sharing? The defendant and its supporters foresee a dire future where publishers hold all the cards when it comes to accessing digital content.

“The lawsuit is about truth and who has access to it in the digital age,” Freeland says. “For more than a decade, we’ve been flooded by misinformation across social media. Now we have a wave of AI content coming at us, and it’s getting harder for the average person to know what is true. Libraries serve an essential function in our democratic society by ensuring the public’s access to information, knowledge, and wisdom.”

The Internet Archive does this by collecting, preserving, and lending humanity’s published works in digital form, he notes. “If the lower court’s decision is upheld, then corporate publishers and their technology vendors will have complete control over our digital heritage–dictating who gets to read what, when, and for how long–all while guzzling up our personal data.”

As for the timing of the appeal, Freeland says briefs were due around four months after the lower court’s final order. The appeal, he says, “will be based on the errors of fact and law made by the lower court.”

Hachette Book Group did not respond to requests for comment as of press time.

In the meantime, the Internet Archive will continue to act as a library. “This case does not challenge many of the services we provide with digitized books including interlibrary loan, citation linking, access for the print-disabled, text and data mining, purchasing ebooks, and ongoing donation and preservation of books,” according to a blog post Freeland published on the site after the ruling.

Electronic Frontier Foundation legal director Corynne McSherry, one of the attorneys representing the Internet Archive, says the decision came down to economic impact. “I think the plaintiff did a good job, or perhaps the defendant didn’t do as good a job, in terms of stressing the economic impact argument.”

McSherry also says the judge took a particular view of the facts in the case and ignored other things on the record. For example, “The court paid far too little attention [to the fact] that there was no evidence of harm to the market,” McSherry says.

If the Internet Archive were to lose on appeal, she says, “I think we could erode the ability to have these types of digital libraries in the future … not just the Internet Archive, but libraries as a whole.” People expect to be able to read everything online, but that will not happen if books are only available “at the whims of publishers until copyright terms expire,” McSherry says.

This gives publishers “an enormous amount of power over what we read and how, and it is very concerning for libraries and authors and readers,” she adds.

McSherry did not rule out taking the appeal all the way to the Supreme Court, but adds that “ultimately, that’s a decision for the client.”

William Scott Goldman, an intellectual property attorney who is not affiliated with the case, says the Internet Archive may win on appeal based on the 2015 ruling in favor of the defendant in Authors Guild, Inc. v. Google, Inc., a case that determined Google’s digitizing of books for publication online was considered valid fair use.

Jonathan Band, a Washington, D.C.-based copyright lawyer who represents libraries, calls the decision “very narrow” and “limited to its facts,” and believes that it “will not affect most library programs providing digital access to their collections.”

That is because “with respect to impact, fair use cases are highly fact-dependent,” says Band, author of Interfaces on Trial 3.0: Google v. Oracle America and Beyond. “The court here fixated on the fact that Open Library lent books that were available from Overdrive and other platforms, so its fair use analysis was rooted in its perception of direct competition between Open Library and commercial services.”

Dave Hansen, executive director of the Authors Alliance, a nonprofit that aims to advance the interests of authors who want to share their works broadly, expressed disappointment with the court’s ruling.

“We support authors who write to be read. Those authors care a great deal about equitable access to their writing through institutions such as libraries,” Hansen says. “CDL is a critical tool for libraries to extend access to readers online and has no discernible negative impact on income for most authors or publishers.”

The Authors Alliance believes the court got many points wrong, but in a case like this one, “The market harm analysis prong of the fair use analysis is important,” Hansen says. “We find it remarkable, then, that the court found that this factor weighed against fair use in the face of a total lack of evidence.”

Many authors are now struggling in the digital publishing market, but that is not because of librarians loaning books on the Internet, he says. “In our opinion, most of the major financial challenges authors face today can be traced back to the business practices of the very publishers who brought this suit and who dominate the publishing market.”

For McSherry, the case is cut and dried. If the Internet Archive doesn’t succeed in overturning the judge’s ruling, it will have a profound impact on libraries—and their patrons by extension, she says.

“The fundamental issue is, in the future, book publishers will have extraordinary and unprecedented control over what is available to the reading public if it’s in copyright. They will be able to control your digital access,” McSherry says.

If that’s not an option when it comes to digital materials, “We’ve handed over full control to publishers in the digital environment on what books you’re going to get to read and won’t be able to read,” she says. “That is the future we’re looking at in the digital environment.”


Communications of the ACM (CACM) is now a fully Open Access publication.
