Google Books: “Think of the Orphans”

The much-debated Google Book Settlement (GBS) is almost a great idea. Without a doubt, it opens up a vast pool of previously inaccessible information.

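To understand how great an idea the GBS almost is, I put together the following back-of-the-spreadsheet graphs:

[Figure: cumulative book titles over time by category (left) and the distribution across categories in 1998 (right)]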

The green represents orphan or neglected books, the blue represents public-domain texts, and the orange represents in-print copyrighted texts. The left-hand graph shows the accumulation of titles over time, separating books in print, books out of copyright, and “neglected” or orphan works (“orphan” being a term coined by Brewster Kahle for works whose rights owners cannot be found). The right-hand pie chart shows the distribution across these categories in 1998, where my low-hanging data end.

The graph is based on US census data about new book titles (from 1881 to 1998, conveniently abstracted by Swivel) and two assumptions: that effective US copyright is 95 years (in fact, it varies in peculiar ways) and that (on average) books stay in print for five years (probably an overestimate). I’d be delighted to get better data or assumptions, but these seem to be in the right ballpark.
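For anyone who wants to redo the spreadsheet with better numbers, the calculation can be sketched in a few lines of Python. This is only a rough sketch of the two assumptions above, not the original spreadsheet: the function names, the exact boundary handling (a title counts as in print for its first five years and as public domain once it is 95 or more years old), and the toy input figures are all hypothetical, and the census series itself is not reproduced here.

```python
def classify_titles(new_titles_by_year, as_of_year,
                    copyright_years=95, in_print_years=5):
    """Split cumulative titles into three buckets as of a given year.

    new_titles_by_year: dict mapping publication year -> number of new titles.
    The 95-year and 5-year defaults are the post's simplifying assumptions.
    """
    totals = {"public_domain": 0, "in_print": 0, "neglected": 0}
    for year, count in new_titles_by_year.items():
        if year > as_of_year:
            continue  # not yet published as of the chosen year
        age = as_of_year - year
        if age >= copyright_years:
            totals["public_domain"] += count  # copyright assumed expired
        elif age < in_print_years:
            totals["in_print"] += count       # still assumed to be in print
        else:
            totals["neglected"] += count      # in copyright but out of print
    return totals


def shares(totals):
    """Convert bucket counts into fractions of the grand total."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: count / grand_total for name, count in totals.items()}


if __name__ == "__main__":
    # Made-up counts purely for illustration; substitute real title data.
    toy_counts = {1900: 10, 1950: 20, 1990: 30, 1996: 40}
    print(shares(classify_titles(toy_counts, as_of_year=1998)))
```

Calling `classify_titles` once per year and stacking the results gives the left-hand graph; calling it for the last year of data gives the pie chart.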

The Google Books project is such a great idea because it takes the mostly inaccessible 82% of content represented by the green and blue regions and makes it accessible on the web. Almost everyone agrees that the Google Books project is a great idea. The Google Book Settlement …errr… not so much.

The GBS gives Google the right to effectively publish most out-of-print content (the green region) at limited cost and with no risk of liability. Everyone else, on the other hand, runs the risk of a $150,000-per-infringement liability for putting exactly the same stuff out there. Critics reasonably argue that this gives Google a de facto monopoly over much of that huge swath of content. Google argues that it’s not a monopoly because anyone else could risk the multi-millions in liability, get sued, and make their own deal for further hundreds of millions of dollars (okay, they don’t put it that way). This limits Google’s competition to bold and deep-pocketed players, which is a pretty small circle. For small tech companies (like my own, to be honest), Google’s de facto monopoly is especially harmful because it cannot be “innovated around” like other kinds of monopolies (for instance, patent-based or infrastructure-based monopolies).

To be fair, Google’s monopoly does not extend to the public domain (the blue sliver), and the GBS will enlarge the public domain (turning green into blue) by identifying works whose copyrights were not renewed back when copyrights had to be renewed. It will also identify the actual rights owners in those cases where they can be found (though it’s unlikely that a competitor could get terms equivalent to Google’s). However, these will likely be small advantages, and the GBS still puts Google into a unique safe harbor while also making it the “publisher of default” for out-of-print books not contested by verifiable owners.

There is an alternative: orphan copyright legislation (HR5889 and S2913), which was making solid progress in the previous (110th) Congress and would reduce the potential liability for everyone, not just Google. It was based in part on a report from the Register of Copyrights. The passage of this legislation would lessen the negative impact of the GBS and unblock numerous efforts, public and private, to open up the “green field” represented in the graphs above. Unfortunately, the GBS has taken the wind out of the sails of that legislation, which has not been re-introduced in this Congress. Brewster Kahle of the Internet Archive (whose massive book-scanning efforts preceded Google’s), a vocal critic of the GBS, recently urged Google to withdraw from the GBS and support this legislation.

There are also a variety of “authors’ rights” concerns about the GBS, legal concerns about the class-action suit that led to the settlement, and international copyright concerns. There are further concerns about the representativeness of the class and about whether the settlement is more a commercial transaction than a legal action.

The plaintiffs (represented by the Authors Guild and the Association of American Publishers), together with Google, are currently trying to modify the agreement to address those various concerns while (presumably) maintaining the central role of the plaintiffs (as class representatives and in administering the settlement) and preserving most of Google’s (closely held) sheaf of business models. We can only hope (and lobby) that the changes are radical enough to dismantle the potential monopolies the settlement creates and satisfy the constitutional purpose of copyright, which is to serve the public interest.

