The Page Isn’t Dead

October 20, 2011

I once thought that the rise of electronic reading would mean the death of the “page” as part of the reading experience. I fully expected that the chop-chop-chop of paged media would be replaced with the smooth flow of endless scrolling. It seemed so much simpler and more elegant.

I was wrong, but the reasons are interesting. The chop-chop of pages is here to stay, but the way documents are designed is due for transformation.

What we think of as paged layout was introduced in the first century. It’s called a ‘codex’ and it succeeded the scrolled media where a very long piece of written material was rolled up and then selectively unrolled (typically between two spools) to expose a desired passage or fragment. The technology of the codex, which had significant advantages by itself, also attached itself to a quite prolific meme: early Christian texts were nearly all published as codices.

The codex solved two problems. First, it presents a reduced aspect of a long document which fits the visual field so that the text is readable. Second, it simplifies random access to the text, so that it’s very easy to go to a particular part of the text. (It would be interesting to speculate on the impact of random access on how people thought and read, but that’s not my current point.)

Advance several thousand years, and technology has made the random access problem go away: you can jump within a scrolled document in less time than it takes to say “Heraclitus.” However, the first problem, the reality of the human visual field and acuity, is still with us. Clark Kent might be able to read an entire book fit onto a single exposed surface (either a rampart-sized display or something with very tiny letters), but we can’t. The scrolling computer display works by presenting us with a slice of the text and letting us move that slice within the document (just like the old unrolling scrolls which predated the codex).

Unfortunately, it’s the way we move that slice which makes things complicated. While it works fine for short documents (e-mails, short news articles, advertising), it doesn’t really scale to longer documents. To see this, think about what your hand and eye are doing when you scroll forward to the next chunk of reading in a long document.

Your eye is on a line somewhere near the bottom of the screen and you’re caught up in the argument, or the thrill, or the emotional tension, but you’re near the bottom, so you have to scroll up. You then use some physical device — mouse in your hand, finger on a trackpad, thumb on a trackball — to move the content within the display.

Your eyes follow the line you’re reading and, when it nears or passes the top, you abruptly stop whatever physical motion is driving the display change. To pick a metaphor from physical sports, the line is the ball, the top of the display is the basket, and you’re trying to score. Every time you turn the page. Now what were you reading about?

All of the human pieces of this little game are organic and subject to fatigue with time and repetition, which is why scrolling is less objectionable for shorter texts than longer documents or books.

Traditional scrolling of longer texts is both tiring and disruptive of narrative flow. What’s amazing is that brilliant writing and compelling stories can survive these interruptions so easily. We are a lucky race of media consumers. But what’s tragic is that it’s not necessary.

Just press PageDown! Many of my more astute readers have probably just said “why don’t you simply push page down (or space) and move your eyes to the top!”

That is a huge improvement and the remark is the perfect segue to my second point. One button/key scrolling is what I usually do when I need to scroll, but many people don’t even think of it. Even when we manage to remember the possibility, it reveals the obvious fact that scrolled information is already broken into pages by the size of the display, but it’s just broken into pages BADLY.

In the worst case, a line of text is split across the bottom of the display or window, so that we see only the head or feet of a line at the bottom or top of the scrolled page. It is a testament to the miracle of visual perception that we can often read a line that’s lost its head or feet, but it slows us up even when it doesn’t stop us in our virtual tracks. Better software always scrolls by whole lines (when it can identify them), which is an improvement, but the problem runs deeper.

Texts aren’t just series of lines. They include blocks and heads and hierarchies, and the whole purpose of their design is to guide our attention and understanding. When a display arbitrarily imposes stops and breaks, it doesn’t help the flow, the attention, or the understanding. Think of it as cogitus interruptus. Not fun.

Almost any printed book has had attention paid to where the page breaks fall, doing so in such a way as to minimize the disruption to the reader’s experience. Don Knuth, famous computer scientist and digital typesetter extraordinaire, used to rewrite sentences in his books to avoid awkward line and page breaks. Unfortunately, we can’t make those adjustments by hand in this modern web world where anyone can adjust their page dimensions or font size long after the author or publisher has left the building.

To its credit, the web has a partial solution, in that CSS has properties, like page-break-before, that let authors or designers provide constraints which layout engines can try to honor. Though intended originally for printed layout, they’re equally helpful (when honored) for codex-style electronic display.

Unfortunately, most eBooks ignore this information, leading to cases where headings appear at the bottom of pages and thought-sized chunks of text are split across boundaries. It’s distracting, confusing, ugly, and unnecessary.

The page isn’t about to die. In fact, it’s becoming more alive, as its dimensions and attributes change across devices and purposes. The page is fundamentally about providing an eye-sized window into a text; what’s changing is that the character of that window is now a fluid and lively aspect rather than a fixed and frozen window. This is an extraordinary opportunity, where the reader’s experience has more dimensions and publishers will be able to add more kinds of value than they had in the past. The page isn’t dead. It’s about to be reborn.

How To Point To A Paragraph

October 20, 2011

If commentary or discussion about online content is going to get beyond ‘Like’ and ‘Hate,’ there needs to be a way to make comments or annotations fine-grained below the level of the document itself: at least to the paragraph or even to the sentence or other excerpt. However, this isn’t as easy as it might appear.

The problem is that references to paragraphs (for example) need to be stable even as the rendering of the paragraphs changes across devices or displays. Page and line numbers are the centuries-old solution to this problem, but they don’t work when readers (human and digital) regularly change page dimensions or type sizes. So what’s the solution?

The easiest solution is to simply attach some kind of unique identifier to paragraphs or other passages. This is what the Bible does, for instance, with chapter and verse numbers that are stable across editions and translations. This scheme is a pretty old one and now close to universal. The trick is to have production and reproduction processes which create and sustain these identifiers, and it helps (as with the Bible) to have the content itself be relatively stable without many edits. This is what we do with sBooks (www.sbooks.net) where we assign unique XML element IDs and have various ways to sustain these identifiers across edits and versions. Not rocket science, but not always easy either.

Unfortunately, we don’t always have the luxury of unique identifiers, so sometimes we need to be a little more clever. One approach is to use the structure of the document to define identifiers, so that a paragraph might be identified as “the third paragraph in section 5 of chapter 3”. This is the approach used by the most recent version of EPUB3, though it has gotten some flak. The problem is that it uses the document hierarchy for encoding identifiers and consequently assumes that the narrative hierarchy and flow of the document is consistently transformed into the XML containment structure. Change the XML structure, and all your references break. Whoops.

Another approach is to use the actual content of the paragraph to identify it semi-uniquely. In the simplest version of this, an annotation just quotes the entire paragraph. This works pretty well in general but fails for paragraphs (such as ‘This is left as an exercise to the reader’) which may occur repeatedly in a document. However, in these cases, we can provide some context or other information to help disambiguate the reference.

Unfortunately, the use of the text itself has a few other problems of its own:

  • it’s wordy and takes up space;
  • it can vary between renderings (think whitespace, line breaks, hyphenation) or edits;
  • it exposes the content which may offend or frighten some rights-holders.

There are two things we can do to address these problems: normalize the content and hash the normalized version.

With sBooks and our open annotation standard (Knotes), we’ve been using a simple normalization scheme called word segment normalization (WSN) which basically breaks the content at whitespace, strips off non-embedded punctuation, and then glues the words back together into a string where words are separated by single spaces. This works pretty well because most rendering and minor edits affect punctuation and whitespace but not the words themselves. To be even more normal, we also decompose the content’s Unicode characters and then strip out all of the modifiers. And (in case you were wondering), if the content contains embedded markup, we strip out the markup tags (but not their embedded content) before applying the WSN processing.
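The description above can be sketched in a few lines of Python. This is only an illustration of the steps as described here, not the actual sBooks or Knotes code: the function names (`wsn`, `strip_edge_punctuation`) are made up for this example, the tag-stripping regular expression is a crude assumption that would not survive real-world HTML, and "modifiers" is interpreted as Unicode combining marks.

```python
import re
import unicodedata

# Crude assumption: anything between "<" and ">" is a markup tag.
TAG_RE = re.compile(r"<[^>]+>")


def strip_edge_punctuation(word):
    """Trim non-alphanumeric characters from both ends of a word.

    Punctuation embedded inside the word (as in "isn’t") is left alone.
    """
    start, end = 0, len(word)
    while start < end and not word[start].isalnum():
        start += 1
    while end > start and not word[end - 1].isalnum():
        end -= 1
    return word[start:end]


def wsn(text):
    """Hypothetical word segment normalization, following the prose above."""
    # 1. Drop markup tags but keep the content they enclose.
    text = TAG_RE.sub("", text)
    # 2. Decompose Unicode characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # 3. Break at whitespace, trim edge punctuation, drop empty leftovers.
    words = (strip_edge_punctuation(w) for w in text.split())
    # 4. Glue the words back together with single spaces.
    return " ".join(w for w in words if w)


print(wsn("It’s <em>naïve</em>\n to ask."))  # prints: It’s naive to ask
```

Two renderings of the same paragraph that differ only in line breaks, spacing, or trailing punctuation should come out identical after this step, which is the property the rest of the scheme relies on.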

(It turns out that the WSN representation can also be helpful for identifying stable text ranges, but that’s the topic for another post.)

Once we have a normalized string, we can hash it with MD5 or SHA1 or however you like to hash your strings. Unless you have more disruptive edits (fixing typos, for instance) or have to deal with genuine duplicate content (‘exercise for the reader’), this works pretty well. It is fairly compact and hides the actual content from anyone who has the reference but hasn’t paid for the content.
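As a rough sketch of the hashing step, here is what it might look like in Python using the standard `hashlib` module. The `normalize` function is a simplified stand-in for WSN written just for this example (it trims edge punctuation with a regular expression and drops combining marks), and `wsn_md5` is a hypothetical name, not the sBooks API.

```python
import hashlib
import re
import unicodedata

# Matches runs of non-word characters at the start or end of a token.
EDGE_PUNCT = re.compile(r"^\W+|\W+$")


def normalize(text):
    """Simplified stand-in for WSN (an assumption for this sketch)."""
    decomposed = unicodedata.normalize("NFD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = (EDGE_PUNCT.sub("", w) for w in no_marks.split())
    return " ".join(w for w in words if w)


def wsn_md5(text):
    """Hex MD5 digest of the normalized paragraph text."""
    return hashlib.md5(normalize(text).encode("utf-8")).hexdigest()


# Two renderings of the same paragraph: different line breaks and punctuation.
a = "The page isn’t dead.  It’s about to be reborn."
b = "The page isn’t dead.\nIt’s about to be\n  reborn!"

print(wsn_md5(a) == wsn_md5(b))                    # prints: True
print(wsn_md5(a) == wsn_md5("The page is dead."))  # prints: False
```

The reference is a fixed 32 hex characters regardless of paragraph length. Swapping in SHA1 would just mean calling `hashlib.sha1` instead.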

There is an interesting variation which hides the content while allowing smart fuzzy matching of references. The idea is to start with the WSN vector and then sort the words using some context-free criteria like alphabetical order, word length, or frequency in some external corpus. It’s then possible to use a little vector algebra to match similar passages automatically, but the exposed word vectors are relatively useless for reconstructing the original paragraph.
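One way to picture this, as a sketch only: sort the normalized words by length (ties broken alphabetically) so the original order is lost, and then compare two passages by the cosine similarity of their word counts. Cosine similarity is my choice for the "little vector algebra" here and is an assumption, as are the function names; other measures would work along the same lines.

```python
import math
from collections import Counter


def sorted_word_vector(normalized):
    """Sort words by length, then alphabetically, discarding original order."""
    return sorted(normalized.split(), key=lambda w: (len(w), w))


def cosine_similarity(words_a, words_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


original = sorted_word_vector("the page is not dead it is about to be reborn")
edited = sorted_word_vector("the page is not dead it is going to be reborn")
unrelated = sorted_word_vector("battery life matters for long trips")

print(sorted_word_vector("the page is dead"))  # prints: ['is', 'the', 'dead', 'page']
print(cosine_similarity(original, edited) > 0.9)   # prints: True
print(cosine_similarity(original, unrelated))      # prints: 0.0
```

A one-word edit leaves the similarity high, so a reference can still find its paragraph after a small fix that would have changed the hash completely.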

The real message here is that there is no silver bullet (besides unique initial IDs) to point to paragraphs. This leads to the idea of using multiple identifiers for any given passage and expecting that applications will use and choose among those identifiers to do the right thing in a particular context.

For example, an sBooks annotation stores the unique paragraph identifier (if there is one), along with the WSN vector sorted by word length, and the vector’s MD5 hash (WSNMD5), both by itself and together with WSNMD5 hashes of other paragraphs in the document. Finally, if the document was converted from a scanned document, we also use the page+lineno for where the original paragraph started in the printed book. The philosophy is that the more pointers you have, the more likely you are to be able to connect an annotation with the content.
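To make the "more pointers" philosophy concrete, here is a minimal sketch of a reference that carries several identifiers and a resolver that tries them from most to least precise. Everything in it is hypothetical: the `PassagePointer` fields, the `resolve` function, and the 0.8 similarity threshold are illustrations, not the actual sBooks annotation format, and it leaves out the combined hashes and the page+lineno pointer mentioned above. It assumes paragraph text has already been normalized.

```python
import hashlib
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


def md5_of(normalized: str) -> str:
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def similarity(words_a: List[str], words_b: List[str]) -> float:
    """Cosine similarity of word counts (an assumed matching measure)."""
    ca, cb = Counter(words_a), Counter(words_b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    na = math.sqrt(sum(c * c for c in ca.values()))
    nb = math.sqrt(sum(c * c for c in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class PassagePointer:
    element_id: Optional[str]  # unique paragraph ID, if the document has one
    wsn_md5: str               # hash of the normalized text
    sorted_words: List[str]    # words sorted by length, order discarded


def make_pointer(element_id: Optional[str], normalized: str) -> PassagePointer:
    words = sorted(normalized.split(), key=lambda w: (len(w), w))
    return PassagePointer(element_id, md5_of(normalized), words)


def resolve(pointer: PassagePointer, paragraphs: Dict[str, str],
            threshold: float = 0.8) -> Optional[str]:
    """Return the ID of the best-matching paragraph, or None.

    `paragraphs` maps current element IDs to normalized paragraph text.
    """
    # 1. The unique ID, when it still exists, is taken at face value.
    if pointer.element_id in paragraphs:
        return pointer.element_id
    # 2. Otherwise look for an exact match on the content hash.
    for pid, text in paragraphs.items():
        if md5_of(text) == pointer.wsn_md5:
            return pid
    # 3. Finally fall back to fuzzy matching on the word vector.
    best_id, best_score = None, 0.0
    for pid, text in paragraphs.items():
        score = similarity(pointer.sorted_words, text.split())
        if score > best_score:
            best_id, best_score = pid, score
    return best_id if best_score >= threshold else None


pointer = make_pointer("p3", "the page is not dead it is about to be reborn")
renumbered = {"x9": "the page is not dead it is about to be reborn"}
edited = {"x9": "the page is not dead it is going to be reborn",
          "x10": "battery life matters for long trips"}

print(resolve(pointer, renumbered))  # prints: x9
print(resolve(pointer, edited))      # prints: x9
```

The ordering of the fallbacks is a design choice: a real application might prefer to verify the hash even when the ID matches, or to report how confident the match is rather than returning a bare ID.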

One of the systemic consequences of cheap computing and bandwidth is that the rendered forms of ideas have gotten more diverse in vacuous but annoyingly idiosyncratic ways. These methods are ways to reduce this problem so that the important stuff, conversations about ideas and real issues, can proceed. I hope it helps.

Should an iPad be my eReader?

March 1, 2010

Over the past few weeks, I’ve had many friends and colleagues ask about using Apple’s new iPad as an e-reader.  Many of these folks had been thinking about a Kindle or a Nook but the iPad (and rumors of the iPad even earlier) offers an intriguing alternative.  Of course, the iPad is far more than an e-reader: it’s a portable media player, a mobile browser, and a possible netbook replacement.  But here’s how it stacks up as an e-reader, especially in contrast to current e-readers like the Kindle or Nook.

In a nutshell, current e-readers (like the Kindle or Nook) offer longer battery life and easy reading in natural light but are limited by a static grayscale display and an awkward interface.  The iPad offers a dynamic color display which supports video and animations and a touch-based interface but is limited by a relatively short battery life and screen readability issues, especially in natural light.   Apple is generally brilliant when it comes to user experience, so I expect that interface will be simple and natural in surprising and pleasant ways.

On the content side, it looks like Apple is signing up with all the big publishers, just as Amazon and B&N have.  Apple has also said that its e-books will be priced the same as the Kindle’s, though that might happen by publishers demanding (as they already have) that Kindle prices be raised.

As I’ve said before, the big content issue with e-books (on all the major platforms) is that they are more like software applications than books: they can’t be moved among vendors, due mostly to proprietary copy protection schemes (Amazon’s, Apple’s, or Adobe’s).  This won’t be an issue for the iPad, however, if the Kindle (etc) iPhone/iTouch apps work on the iPad.  This “big if” depends on Amazon (etc) being willing to support the iPad and Apple being willing to accept the applications.  It also isn’t known how readers will add their own e-books to the iPad, though it might be as simple as adding your own MP3s to an iPod using iTunes (and leaving Linux users out in the cold).

Outside of the content questions, then, the big downsides of the iPad are (relatively) short battery life and a display which (a) can’t be read in sunlight and (b) may have eyestrain issues for extended reading.

Battery life. In his introduction of the iPad, Steve Jobs cited an iPad battery life of 10 hours, which is pretty impressive and largely due to the fact that Apple built their own CPU for the iPad.  In an exchange with Walt Mossberg after Jobs’ presentation, Jobs commented that “nobody reads for 10 hours straight” which might be true now that the Harry Potter series is completed.  But the 10-hour battery life means that you will probably need to recharge your iPad every night.  Depending on the individual, that might be easy to forget at home or hard to sustain when traveling.

However, ten hours is pretty miserable compared to the days and weeks of battery life provided by e-Ink based readers like the Kindle. For example, I once forgot my Kindle’s charger over a week’s vacation but finished all the novels I’d brought without a glitch (though I did turn off the Kindle’s wireless for the duration).

Reading outside.  On that same vacation, I often read outside, which would have likely been impossible on the iPad.  The iPad has a striking LCD display based on IPS technology (in-plane switching) which allows a broad viewing angle together with vibrant and accurate colors.  IPS also requires a strong backlight, which means the display is unlikely to be readable in sunlight.  I was personally disappointed that Apple didn’t go with Pixel Qi‘s low-power sunlight readable display technology.  This technology, which was originally developed for OLPC’s so-called $100 laptop, is featured in (for example) Notion Ink‘s Android-powered Adam tablet.

Eye strain? Computer displays have gotten much easier on the eyes over time, especially as their “refresh rates” have become imperceptible and as anti-aliased fonts have become more common.  But it’s not clear how the iPad display will stack up for e-reading.  Eye strain comes from how the eyes move, which can vary tremendously with task and context.  In a fascinating New York Times piece, I learned that our eyes make 10,000 movements/hour while reading.  Many of these motions are involuntary and can depend on “optical ergonomics” such as contrast or page width or typeface or spacing.

In this regard, the Kindle’s e-ink display is close to ink-on-paper (though the paper is a little grayish) which I find very easy on the eyes. I can read it for hours with no particular problems, providing the ambient light is adequate (as with ink on paper).

On the other hand, reading books on my iPhone, as I often do while waiting in lines, gets tiring after twenty or thirty minutes (as does the waiting in line!).  However, this might be due to the page width, or the screen glare, or any number of other factors.  It remains to be seen how the iPad will function in this regard.  (Stay tuned to this station).

In conclusion, the iPad will probably be an adequate e-reader and has the potential for much richer content and interaction than “traditional” e-readers.  For example, the sBooks experience will be much more satisfying on the iPad than it is on the Kindle or Nook.  The iPad may be especially appropriate for kinds of reading (magazines, colorful textbooks, graphic novels, etc) which current readers handle poorly if at all.

At the same time, the relatively short battery life (10 hours), the difficulty of outdoor reading, and the possibility of eyestrain issues will probably make it inadequate as a “read anywhere” device to replace your physical books.

What is an sBook?

February 26, 2010

An sBook is an enhanced and personalized ebook which has been subtly enriched with context and conversation from multiple sources.  sBooks can change the way that readers and communities engage with extended content (books, essays, stories) and the knowledge and inspiration which they contain.

Concretely, an sBook differs from a conventional ebook in two ways:

  • readers can share rich notes with their friends, their communities, and other readers;
  • an extensible embedded knowledge base helps readers search and explore the book’s content.

Sharing. Most e-books have ways to add and save simple notes for personal use, but sBooks makes it simple to share those notes with friends, colleagues, or communities, even engaging in conversations “in the margin” of books we care about.   sBook notes, called glosses, can also include references, rich media, and tags which can make the book easier to search and navigate.  Third parties can also use glosses to metapublish against an sBook, creating sets of notes, tags, or references that enhance the book’s value and evolve with time.

Searching. Most e-books also have some kind of full text search and (sometimes) a version of the printed index, but an sBook uses a compact embedded knowledge base — a knowlet — to combine those functions.  When searching an sBook, the reader doesn’t need to know the exact words used by the author and can even search for abstract concepts or themes.  The book’s built-in knowledge base can be further extended by the tags and knowledge created by the reader’s friends, communities, or interested third parties.

These distinguishing features of sBooks are based on three key ideas: that content and meaning matter more than form and flash, that technology can change (for the better) how we engage with complex information, and that publishers (large, small, and tiny) need to focus on editorial added value to remain relevant and sustainable in the digital age.

To explore more, you can visit sbooks.net, read some sBooks, or experiment with converting your own documents into sBooks.  Let us know what you think!

Beautiful e-Books

February 25, 2010

“All aesthetics is functional, even if simply for our souls.”

Printed books are beautiful things.  Even the cheapest paperback, if you look closely, has the hallmarks of a process where individuals made thoughtful choices to enhance the reading experience.  From the chosen typeface to the page breaks to the spacing and separators, books are made to be read.  To see the importance of those choices, just send a large raw text file to your printer and try to read it!

In contrast, many e-Books are especially ugly even given the limitations of their form.  Early in my life as a Kindle owner, I bought the e-book of Poul Anderson’s “Brain Wave,” one of my favorite science fiction novels.  I was disturbed to find that the e-book rendering lost many structural cues, such as the appearance of breaks between sections within a chapter.

There are some great slides from Liza Daly which illustrate this problem (the names have been changed!) and make some points which I hadn’t originally realized.

One of the reasons for the ugliness is that e-book production is typically an afterthought to the print production process and is often out-sourced and based on unproofed versions of the text.  Furthermore, e-book renditions are often not proofed significantly by editors or publishers.

A deeper cause is that many print designers (and editors and publishers) have a prejudice about online design based on the ubiquity of bad design and the early days of the web.  But it doesn’t have to be this way.

Design is the process of thoughtfully making the choices you have around the choices you don’t.  Book designers have a huge number of design dimensions and  some significant constraints.  The constraints (beyond largely fixed content) are mostly financial in origin, such as the use of color, the size of pages and the linearity of their arrangement.  In the early days of online reading, the design space was nightmarish, because designers had very few choices and the choices they didn’t have were also variable, reflected in the diversity and adjustability of early browsers or display applications.

One of my friends and heroes at the MIT Media Lab was a graphic designer and visionary named Muriel Cooper.  Muriel was a character (in the best of senses) who had started the Visible Language Workshop at MIT after working as a book designer for MIT Press.  The work of Muriel and her students, starting in the 1980s, was about making new media (long before it was thus named) beautiful.

They started by fixing some obvious sources of ugliness: aliased fonts and crayola color models.  They went further to use the bleeding edge of technology to create new choices for designers, leveraging effects like three-dimensionality, translucency, or animated fonts.  Eventually, they developed models for how designs could automatically adapt to the devices or choices of readers.  These ideas found their way into online design by inspirational example, engagement with vendors, and passionate disciples.

Today it is possible to make beautiful online documents, even for the Web, using declared standards that are quickly converging towards ubiquity.  There are some lovely examples at the CSS Zen Garden and even some examples of e-books at the ePub Zen Garden.

There is no longer a good excuse for ugly e-books.

Let’s call them A-books

February 18, 2010

The iPad will apparently be using a proprietary Apple copy protection for its electronic books, according to an LA Times report which cites “anonymous publishing sources”.  This means that you won’t be able to read your Kindle or Barnes & Noble e-books on the iPad or your iPad books on other devices.

We should really refer to these products as a-books rather than e-books because they’re more like applications than books: they’re tied to a particular platform and vendor.  “a-books” could also coincidentally stand for Apple-Books, Amazon-Books, or Adobe-Books, covering the three major flavors of DRM in use with digital books.  DRM stands for “digital rights management” and is (in all these cases) a kind of encryption used to prevent unauthorized copying and distribution.  The Sony and Nook e-readers both use DRM from Adobe Systems.  It isn’t yet known what kind of DRM will be used by Google Editions (though it might be interesting).

This isn’t really new, since most e-books already come with DRM which ties them to a particular platform, but there seemed to be some hopeful convergence on Adobe’s DRM with its adoption by multiple vendors, leaving Amazon as the odd man out with the proprietary DRM.  (Of course, the reliance on Adobe isn’t ideal either).  However, Apple’s decision changes all of this.

The iPad’s book reading application will reportedly use Apple’s own FairPlay DRM, which they use for movies and TV shows and (until recently) music.  This is distinct from the Adobe DRM used on Barnes & Noble’s Nook, the Sony readers, and other products.  It is also distinct from the DRM used on Amazon’s Kindle, though the Kindle also has an entirely different e-book format altogether, rather than the ePub format which everyone else seems to be using.

It will be interesting to see what happens to the existing iPhone reader applications, especially the offerings from Kindle and Barnes & Noble (and more interesting ones from other sources).  If Apple lets these applications work on the iPad, you will be able to read your Kindle and Barnes & Noble books on your iPad, just not with the native reading application.  It would also let creative developers come up with even better ways of using the iPad as an e-reader.

However, Apple has an explicit policy of barring applications which replace core functionality.  That was one explanation of Apple’s rejection of the “Google Voice” application which got some anti-trust attention last year.  It is also the reason that alternative web browsers are not generally available in the App Store.  Will this policy bar those kinds of applications now that there is new native functionality?  It would be a questionable move on Apple’s part, but certainly conceivable.  One interesting move would be to provide a way that those applications could connect to the native iPad “bookshelf” and to offer/require that purchasing on the iPad happen through Apple’s in-app purchase infrastructure (where Apple gets a cut).

In the long run, there’s a slightly more than philosophical argument that a given product is not “really” a book (with its connotations of durability and portability) if it’s tied to a device or vendor.  If that’s the case (and books continue to exist), these a-books are a transitional form awaiting either some kind of open and portable DRM (not an oxymoron) or the evolution of publishing beyond dependence on DRM (as has happened in music).  In this scenario, there’s a chance that early adopters (like myself) will end up as either victims (losing purchases to vendor evolution) or criminals (breaking the DRM on books we’ve purchased so we can read them on different devices).   But I’m hoping for a better outcome!

The End Of Ownership?

January 27, 2010

e-Books aren’t owned, they’re licensed.  There are various reasons for this policy (on the part of publishers) which has some serious consequences for both consumers and culture.  On the consumer side, it enables device lock-in and makes e-books more ephemeral (your device could die or your vendor could stop supporting it).  On the culture side, it largely erases books from physical history and diminishes the creation of “hybrid value” brought to books by readers and communities.

In most cases, when you buy an e-book, you are actually getting a license to read it, not genuine ownership.  This is the case with nearly all e-book distributors, including Amazon and Barnes & Noble.  Google, the coy elephant debutante of the e-book world, has been careful in their formal language to talk about “providing access” to books, rather than selling them, suggesting that their cross-platform cloud-based solution will take this even further.

There are many reasons for the trend away from ownership:

  • publishers think digital copyrights are barely worth the paper they aren’t written on, believing licenses (together with DRM) give them more protection from revenue loss through piracy and fair use;
  • special restrictions are more easily included in licenses (e.g. no resale, no robo-read-aloud, device limitations, etc);
  • those restrictions can give more flexibility in pricing models, which is how publishers recover investment in a largely speculative business;
  • licensing opens the way for other revenue models, like subscriptions, for publishers and distributors to explore;
  • licensing is the norm in the software world and the rest of the digital media world, partially to restrict rights for consumers and liability for vendors.

It would be nice to believe that ownership and licensing are equivalent for consumers, since both come down to “you pay and you read”.  Unfortunately, it’s not the case.  Some of the consumer consequences include:

  • vendor lock-in: I can’t read my Kindle book on my Nook (or my Linux netbook);
  • 1984 redux: Amazon could (ironically) delete copies of 1984 from buyers’ digital shelves because those buyers (probably) lived in nations where they had fewer rights than people in other places (1984 is out of copyright in some parts of the world but not in the US, due to the Mickey Mouse Protection Act);
  • no sharing: I can’t lend my e-book to a friend or family member; even the Nook’s much-touted sharing features are extremely limited (you can share a book exactly once and only if the publisher allows);
  • ephemerality: if a vendor goes away (either entirely or in part) and your e-book is under DRM, you will have “access” to a very random string of bytes (of course, Google and Adobe and Amazon are too big to fail, right?)

Given these issues for the consumer, I’m surprised that I buy e-books in the volume that I do.  But the convenience is compelling (a roomful of books in my briefcase) and I’m gambling that the market will drive some of the deficits away while grandfathering in my digital investment to date.  But the end of ownership has consequences for intellectual culture that are more serious, especially the diminishing of intellectual history and hybrid value.

Book ownership is one foundation of intellectual history for both individuals and the culture.  The books on our shelves constitute a history of our learning and thinking; our own shelves can yield reflection, insight, and inspiration; browsing someone else’s shelves can offer us a glimpse of their minds.  Though browsing a digital shelf might give the same opportunities (or potentially better ones!) for reflection and disclosure, their reliance on cloud-based data could make that impossible.  Repressive regimes make a big deal out of burning books as a way to cement their domination, and the end of ownership will make that even easier (with a reduced carbon footprint, of course).

In his 2009 op-ed piece defending the controversial Google Books’ Settlement, Sergey Brin compared their effort to saving the Library of Alexandria.  Ironically, the swathes and swatches of knowledge which survived that ancient calamity did so because there were many copies (some translations) of the works in many places.  Ownership, physically at least, saved the day and gave us what survived.

A second casualty of the end of ownership is to the creation of hybrid value by readers.  When I buy a book, my rights to the book (to mark up, to lend, to hang on to, to be found by future generations, to resell), allow me to add value to that book for myself and my community.  A book that I can talk about with friends is more interesting than a book just for my personal enjoyment.  A book that I (or a student, child, or friend) can return to in 10 years has more value than is contained in the moment of reading.  The five musty volumes of “Tom Swift” I found in my grandparents’ attic touched me more than the complete set that I now have on my Kindle.  It’s not just about accessibility, but about personal relevance.

Hybrid value is what enables serious intellectual discourse.  Academic publication, with its lists and chains of references, was the original “remix culture” where the value of a contribution included the vitality of its intellectual offspring.  In principle, digital technologies should make this discourse even richer and, in the short term, technologies like “Google Scholar” have been boons to conversation and scholarship.  But the end of ownership puts the traditions of distributed intellectual stewardship at risk.

What to do?  Here are a few ideas and I’d love to hear more:

  • embrace open and common standards for packaging (e.g. EPUB vs. mobi/azw);
  • insist on vendor-independent “survivable” DRM (often social DRM) or no DRM at all;
  • create frameworks for capturing and sharing user added value (that’s what we’re trying to do with sBooks);
  • personally back up your cloud-stored e-books in anticipation of either distributor disaster or improved openness.

What should an e-Book cost?

January 14, 2010

Pricing in the middle of a technological revolution is always tricky. Opportunities, expectations, tradition, fears, idealism, and greed all collide, trying to strike a new balance on a moving train.  The current situation with e-books is no exception.

Currently, the main e-book distributors/manufacturers (primarily Amazon) are pressuring and “bribing” publishers to accept prices which will spur and spread adoption of the new technology.  In actively trying to make e-books more attractive to consumers, the e-book vendors have ended up striking a balance between consumers who say “copying is nearly free, so e-books should be really cheap” and publishers who say “production costs haven’t changed a bit, so e-books should be priced close to p-books (physical books).”  But the story isn’t so simple.

P-books (the artifacts) are also cheap. The cost of producing and distributing the physical book itself is generally a fraction of the sale price, with notable exceptions for art books and some textbooks.

Publishing is a hits-based business where a few titles do well and carry the rest of the titles which break even or lose money. Prices reflect the cost of producing books which didn’t end up doing well. The uncertainty of “what will do well” is compounded by some publishers’ laudable desire to contribute to the public good by disseminating important ideas and works even when the profit is marginal.

Publishers are (part of) society’s solution to Sturgeon’s Law that “90% of everything is crud.”  The cost of this social function is built into the cost of books.

Marketing costs can be significant, as with most consumer goods, where part of the price pays for convincing people to pay the price.

Book production costs are largely up front, and by the time the book is out for sale, the publisher has already put down its money for editing, production, graphics, rights clearance, etc. This is part of the reason that books start out expensive and get cheaper as costs are recovered. Part of publishers’ beef with Amazon’s $9.99 best-sellers is that the pricing disrupts this traditional method of cost recovery.

Production processes remain print-oriented and do not always take advantage of technological improvements; the e-book price thus reflects the costs of p-book production. This isn’t because authors and publishers are technophobes (most aren’t), but because their key differentiator is the addition of editorial value (design, layout, organization) for which print is still the dominant paradigm.  It’s not yet clear how to make an e-book better, and publishers differentiate themselves by improving their product, which for now means focusing on ink on paper.

Most e-books are licensed rather than sold, conferring substantially fewer rights than the sale of a p-book. This is based on both the e-book vendors’ attempt to lock in market share and the publishers’ desire to protect their IP.  On the other hand, licensing gives consumers good reason to pay less for e-books than p-books, and they will become more sensitive to the difference as vendor lock-in, DRM woes, and the other perils of licensing become more prominent.

Consumers don’t like paying for the same book twice, whether the second copy is an e-book, an audio-book, or a print book.  Author contracts often separate out these items, but consumers don’t generally see them this way.  (Adaptation rights are another story.)  Paul Graham does a great job of discussing this issue (among others) in his essay on post-medium publishing.

What should an e-book cost? Given this complicated mix of factors, the better question is how the useful enterprise of publishing can be sustained as information goes digital.

Despite the claims of the more radical digerati, publishers play an important role in the ecosystem of ideas and editorial added value is a very real thing.  “Crowd-sourced” editorial value doesn’t generally improve content or address Sturgeon’s Law because the law applies even more to crowd-sourced metadata, an effect amplified by (what might be called) the “RTFA law”: for everyone who has read a published item, 10 people have opinions about it.

The challenge is that editorial value needs to be sustained while the underlying exchange mechanism (physical books for money) is disappearing.  Consequently, the question of e-book pricing is not just about setting a price for another format; it is about reimagining how the added value created by publishers is monetized in the digital domain.

I’m not going to answer that question here, but here are some starting places:

  • develop e-centric production processes that inject editorial value into e-books directly, making them more valuable than p-books; think about how to make e-books beautiful;
  • emphasize e-book “ownership” rather than “licensing” as people will pay more for something they own; find ways to add more benefits to ownership (access to ancillary materials, etc.);
  • consider bundling e-books (and even audio-books) with p-books, at no or nominal markup, as a way to spur adoption (and an alternative to $9.99 pricing);
  • look at alternative models for timely recovery of production costs; for example, selling “book futures” or transparently dynamic pricing (so a hot item will go up (a little) in price and people know that ahead of time);
  • explore alternative marketing strategies, including viral approaches or counter-intuitive ideas; e.g. suppose that every hardback included 4 discount coupons for the same title to pass on to friends (with barriers to abuse);
  • commoditize the routine aspects of e-books (formats, distribution, DRM (if you have to)) so that competition is based on real editorial added value, not ephemeral accidents of technology.

The point of these ideas is not that they’re sure-fire prescriptions, but that the shift to digital will require a radical reinvention of publishing.  As a self-serving note, the technologies and ideas that we’re developing at sbooks.net are designed to create an infrastructure for just these kinds of innovations.  If you’d like to try something, drop us a line!

The Google Warehouse?

November 2, 2009

Last month, Google provided more details on its foray into the e-book world, first announced over a year ago.  In its announcement, Google said that it would operate mostly as a warehouse, essentially as a bridge between publishers and retailers.  It also made clear that it was not launching its own e-reader but that it intended to provide “cross-platform books” mostly served over the network to e-reader devices, cell phones, and computers.  The plan is to launch in early 2010 with roughly 600,000 books, which is close to double Amazon’s current offerings (that doesn’t include approximately 1,000,000 free books for various platforms, which Google and others already make available in various forms).

But the Google announcement implies some other significant consequences.  First, they will be creating some kind of e-book software platform; second, they are trying to move e-publishing towards a service-based model; third, they are creating a framework that could be used to deliver the books that may be available under the disputed Google Books Settlement (however it is transformed).

The Software Platform. A digital warehouse implies a software platform and it’s worth thinking about what that platform will look like.  Google has formally embraced the EPUB platform, which is a flexible framework from the IDPF based on open standards.  This is good news for e-books generally but may be bad news for Amazon, whose Kindle (unlike Sony’s e-readers and B&N’s Nook) doesn’t currently handle the EPUB format.  In addition, EPUB is a format for delivering standalone “book bundles” and doesn’t specify how it would work with the kind of service/access-based model that Google is discussing.  Finally, EPUB has “plug-in” DRM and Google hasn’t specified what scheme it will use.

Sony and Barnes and Noble seem to have settled on Adobe’s “Digital Editions” DRM, but Google hasn’t discussed DRM for its e-books.  The name “Google Editions” suggests that Google might implement its own flavor of DRM, possibly based on some kind of server access control.  However, since Google will support offline reading, it will probably have some kind of familiar offline DRM.

Books In The Cloud.  The service-based aspect of Google’s plans is not unlike the approach of the other big players (like Amazon), which keep books on a centralized server and provide for downloads to digital devices.  This gives publishers a kind of centralized control through their contracts with the service providers (Amazon, Google, etc.) who deliver the actual books.  It’s clear why publishers would like this model, but it also permits striking violations of trust and expectation such as Amazon’s funny-if-it-weren’t-tragic deletion of Orwell’s 1984 from readers’ Kindles.  In many ways, because it diminishes future rights, it is profoundly anti-consumer.

There is also an argument that a server-based model for e-books undermines the ecology of ideas which has characterized post-Enlightenment thought.  Books are among the key carriers of perspectives and ideas across generations.  Their ability to do this is based, in part, on their independent durability, which is lost in a server-based scheme.  Sergey Brin’s NYT op-ed piece “A Library to Last Forever” argues that the value of Google’s book project (and, by extension, their disputed settlement) is in avoiding the ultimate or practical loss of knowledge associated with physical books.  This is true, but it’s also important to be careful not to introduce new risks as our books go digital.  Part of the reason that the loss of the Library at Alexandria (for example) was so tragic was that books were difficult to share and copy (legitimately or illegitimately).  We don’t want to make the same mistake!

Think of the Orphans.  The Google Warehouse would presumably also host the vast number of orphan works which Google would be allowed to license under the terms of the Google Book Settlement.  Depending on how the settlement is reworked (and we should hear soon, though it will probably be further challenged), this could mean that the warehouse could shortly have many millions of out-of-print but copyrighted titles (leaving all the other big players in the dust) that would be unavailable to possible competitors without significant investment (that may be fair) and tremendous risk (not so much).

It is ironic that Google, the disintermediator par excellence who helped blow up the economics of “old media”, might end up becoming an uber-mediator for book publishing.  And it is disturbing that its centrality and necessity won’t be based on technology (like the victims of disintermediation past), but on law and copyright, which are (alas) more durable and less susceptible to innovation.

What If Amazon Owned the Alphabet?

October 15, 2009

One of the biggest issues in e-book publishing today is the incompatibility of formats across vendors.  Right now, without invoking digital arcana and probably violating a EULA or two, I can only read my Kindle books on a Kindle or the Kindle app for my iPhone.  I can’t read them on my computer or a Sony Reader or any other device I might purchase (from anyone but Amazon).  The same is true for e-books purchased from Barnes and Noble or Sony’s online store.  The difference (see below) isn’t just in technical standards (written in malleable code) but in DRM implementation and policy (partially written in obdurate contracts and agreements).

The diversity and incompatibility of e-book formats is a pain for consumers and also threatens the renaissance of engaged reading as a basis for social and intellectual discourse (part of what sbooks.net is all about).  However, the dominance of a single format and vendor (with Amazon the front-runner) would be even worse.

The market power of a dominant non-portable e-reader is especially insidious because books are a long-term investment and the more books one buys, the more one is locked into a particular vendor and their e-reader.  This makes it hard for consumers to change and essentially crystallizes early market dynamics (where we are right now).

Having a dominant vendor and platform is bad for both readers and publishers.  As readers, we’ll find that a technology which should have expanded innovation and increased choice does the exact opposite.  Instead of the innovation and growth which accompanied the open-by-design Internet, we will have progress which is slow and limited by a single organization’s strategy.  Some publishers might actually be attracted to this scenario, but they pursue it at their peril.

Publishers lose in this scenario because the market power of a single vendor/format is tremendous.   Big book chains (like Borders) and e-tailers (like Amazon) already use their scale and scope (in terms of outlets or registered accounts/credit cards) to wield pressure over publishers and limit competition (though sometimes in the publishers’ interest).  Having a dominant vendor with a non-portable format will result in shrinking margins for publishers and will hit smaller publishers (with less negotiating power) earliest, leading to reduction in choice.

As the title suggests, in the e-book world, a proprietary non-portable format is like owning an alphabet, enabling the vendor to be a gatekeeper for publishers’ access to the market.  While there will certainly be alternate channels (like ink-on-paper), the absence of an easy way to “rip” purchased books (as you can CDs) makes the advantage even stronger than for music publishing.  And because there is no portable open format (like MP3, which iTunes supported), Jeff Bezos will end up having much more power to dictate to publishers than Steve Jobs ever did, to nearly everyone’s detriment.

– Ken Haase

Geektails: One basic incompatibility is in container formats, with Amazon supporting only Mobipocket-derived formats (they now own Mobipocket, but run it as a separate business) and everyone else apparently converging on the (technically better) EPUB format.  But the deeper incompatibility is the DRM (digital rights management) which is essentially a plug-in for EPUB, so different readers may all support EPUB but still be incompatible for DRM’d books.  And with most of the publishing industry convinced that they can’t live without DRM, this means we have many flavors of incompatibility.
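
For the extra-geeky, here is a minimal sketch of the container/DRM split described above. It assumes the standard EPUB packaging (a ZIP archive with a "mimetype" entry, plus META-INF/container.xml and an optional META-INF/encryption.xml). The function names inspect_epub and make_sample are made up for illustration, and the sample file is synthetic rather than a real book. Note that an encryption manifest is only a hint: it can also appear for things like font obfuscation, so it doesn't prove a book is DRM'd.

```python
import io
import zipfile

# Media type string that the EPUB container format stores in its "mimetype" entry.
EPUB_MIMETYPE = b"application/epub+zip"


def inspect_epub(data: bytes) -> dict:
    """Report container-level facts about a candidate EPUB file (hypothetical helper)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
        is_epub = (
            "mimetype" in names
            and zf.read("mimetype").strip() == EPUB_MIMETYPE
        )
        return {
            "is_epub": is_epub,
            # container.xml points the reader at the book's package document.
            "has_container": "META-INF/container.xml" in names,
            # encryption.xml lists encrypted resources; its presence is only a
            # hint that some protection scheme is plugged in.
            "has_encryption_manifest": "META-INF/encryption.xml" in names,
        }


def make_sample(with_encryption: bool) -> bytes:
    """Build a tiny synthetic archive that mimics EPUB packaging (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", EPUB_MIMETYPE)
        zf.writestr("META-INF/container.xml", "<container/>")
        if with_encryption:
            zf.writestr("META-INF/encryption.xml", "<encryption/>")
    return buf.getvalue()


if __name__ == "__main__":
    print(inspect_epub(make_sample(with_encryption=False)))
    print(inspect_epub(make_sample(with_encryption=True)))
```

Both samples report the same container format; only the encryption manifest differs, which is the sense in which two readers can both "support EPUB" and still disagree about whether a given book will open.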

