It’s long been an assumption that tried-and-true publishing business models might not retain their viability in an increasingly collaborative and tech-centric information environment. But as these new layers emerge, can the central aspect on which they depend — namely, content — survive?

In a recent blog piece about Aus-e-Lit, Roger Osbourne identifies apparent contradictions between individual reward models and an increasing emphasis in utilities like Aus-e-Lit on collaborative annotation:

The ANNOTATION TOOL will allow users to enhance AustLit records by adding information about a work or an author that would not normally be indexed by AustLit. For instance, information about the role of a literary agent in the publication of a particular edition could be added, providing a foundation for generating networks of influence in the publishing industry. If we can imagine scores of independent researchers contributing such information, cultural fields could emerge from the contributed data that tell us more about the publishing industry than we currently know.

The detriment, he says, is that all of this innovation has pushed the boundaries of existing publishing, copyright, and compensation models. This may, in fact, only be the tip of the iceberg — the point at which traditional publishing business assumptions are being Zamboni’d by the blade and squeegee of technological change.

Certainly, we’re aware that dynamic posting, annotation, discussion and elaboration, revision, and user ranking are utilized in a rapidly growing number of information environments. And it’s no surprise that commercial business models and individual compensation structures have struggled to keep pace with the rapid adoption of these collaborative tools for content creation.

As publishers migrate towards less-defined content sales units, they are also encountering new challenges relative to contributor compensation.

For example, a book author receives a negotiated, ongoing royalty based on sales of the book unit, whereas if the same content is mashed-up in a database, the author may be paid a licensing fee based on usage or representation of bytes of content in the larger digital repository.

Compensation structures are unlikely to deliver the same uniform and consistent royalty payments to authors and contributors in the new paradigms. The worth of an individual contribution will be diluted by the addition of new materials, unless complex algorithms are developed to assign relative value, based on user perception of the relative import of one piece of contributed content versus another.
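To make the dilution point concrete, here is a minimal sketch of a pro rata split of a licensing pool by usage. The function name, the contributor names, the usage counts, and the pool amount are all hypothetical, and this is not a description of any publisher's actual payment model.

```python
from typing import Dict


def royalty_shares(usage: Dict[str, int], pool: float) -> Dict[str, float]:
    """Split a licensing pool pro rata by usage counts (hypothetical model)."""
    total = sum(usage.values())
    if total == 0:
        # No recorded usage: nobody earns anything from the pool.
        return {name: 0.0 for name in usage}
    return {name: pool * count / total for name, count in usage.items()}


# Two contributors share a fixed pool of 1,000 (illustrative figures).
before = royalty_shares({"author_a": 300, "author_b": 100}, pool=1000.0)

# A third contributor's material is added; the pool stays the same size.
after = royalty_shares(
    {"author_a": 300, "author_b": 100, "author_c": 600}, pool=1000.0
)

print(before["author_a"], after["author_a"])
```

In this sketch author_a's usage is unchanged, yet the payment falls because the same pool is spread across more material. Any weighting scheme based on perceived importance would have to replace the raw usage counts used here.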

And there is a question of motivation. Will individuals be equally motivated to be part of these larger projects with less direct revenue and without their name visible on a book cover? Explaining how payments have been calculated will be challenging enough (what’s a byte?).

There’s also a shift in what is being commercialized. Is it content or eyeballs? Osbourne notes that the user data and folksonomic tags compiled on the back-end of socially networked information platforms will support further analysis, with great potential learning about systemic relationships, user behavior, and user preferences. We’ve already seen innumerable examples of this on the Web, where companies make content and tools freely available in order to leverage the subscriber access and usage information obtained on the back-end.

John Wilbanks spoke about this at last year’s SSP IN meeting:

When a layer gets commoditized, value is created through proprietary services in adjacent layers. (Clay Christensen)

But adjacency is relative: there has to be a central source to support the adjacent elements. Continuing to motivate contributors to this central activity while compensation and motivation move to the adjacent layer will pose many challenges for publishers.

There’s a delicate balance to strike that calls for a clear understanding of what is most valuable to others and what is needed to make that relationship work. Content is no longer king — it’s being demoted to a functional role, its value shifting to the bicameral rule of access and utility. Collaboration is becoming a must-have rather than a nice-to-have. This requires a very different dialogue with stakeholders about what is needed to sustain the necessary contributions.

Sunlight Foundation Transparency Brainstorming

If they build it, will we go? That’s a question being posed by two open data exercises, one underway and another planned for later this year. Both are attempts to use information transparency to make governments more participatory and accountable.

ViewChange.org, a release from LinkTV planned for summer 2010, is described as a game-changing utility that will provide free information tools to everyone, from organizations to individuals, on global development topics. The project is funded by the Bill and Melinda Gates Foundation, with design services from Method and development by Definition. A design prototype is available on the Method case studies page.

The interface encourages discovery through a fluid model that highlights the relationships among topics. Concentric circles and related information combine to surface related issues, regions, and media. “Take action” links at every step ensure a personal relationship with the data, connecting global issues to individual action.

Besides the sheer visual coolness of the platform, which embeds a video experience with interactive Web search and research buttons, a standout characteristic of the project is its promise of democratized access to data and information for everyone: individuals, researchers, and organizations. If successful, ViewChange.org could be level-setting in that it will not require the user to have a PhD to interpret how disparate bits of data, news, performance, and research begin to fit together.

Sunlight Foundation’s Sunlight Labs is also aiming to make complex data and information more usable for the greater good. Sunlight’s mission is to open government and “make it more transparent, accountable, and responsible.” To accomplish this, the Sunlight Labs site is a community space where staff and community programmers can share open-source code, APIs, publicly available data sets, and ideas. The result is co-created utilities that help organizations and the public interpret public data, often aided by mobile apps or Flash visualization technologies.

Currently, Sunlight Labs reports that community members have contributed more than two-thirds of the apps and APIs found on their Projects page. Sunlight Labs also hosts a wiki that includes a list of yet-unbuilt project ideas, for example:

  • Build a greasemonkey extension on top of the OpenSecrets.org (or the FEC database itself) that tells you the campaign contributions of the person you’ve received an email from.
  • Ahead of the 2010 Census, create a website that shows maps of districts, how they’ve changed over subsequent redistricting/gerrymandering, some way to discern “fair” and “unfair” redistricting, and finally, what a fair 2010 redistricting might look like.
  • Build blog plugins in WordPress, MovableType, etc., that allow bloggers to pull in data about lawmakers in blog posts.

What remains to be seen is whether the public will latch on to these utilities, and whether there will be sufficient maintenance and data consistency to support long-term research adoption. As has been noted here before, open data initiatives can be expensive to create, and they need a clear purpose. Whether Sunlight Labs or ViewChange.org will be useful is something only time will tell.

Whatever the outcome, it seems safe to assume that, in the future, researchers — and everyone else — will have far greater access to obscure data than they had ever dreamed.

CrossRef recently published new guidelines for assigning DOIs to books — including reference works — and a revised fee structure for publisher deposits.

The new parameters advise publishers to deposit DOIs at the chapter/entry level. The 2010 pricing structure, which has significantly reduced fees for backfile and intra-ebook content deposits, supports their recommendation. Assuming publishers adopt this direction (Springer already has — see their new SpringerLink platform), these moves may have far-reaching, long-term implications for e-book functioning and interoperability.

CrossRef identifies the following aims:

  • Maximize reference linking among books, journals, and conference proceedings
  • Enhance the discovery, visibility, and usage of book content
  • Enhance the user’s experience through improved functionality
  • Enable the creation of a book citation reporting mechanism which would give book content the visibility, credibility, and metrics that journal content has

There have been many champions of entry-level metadata, some of the more prominent in connection with Reference Universe. Advocates have been acutely aware of the factors limiting e-book functionality, stemming from the absence of coherent e-book tagging and linking standards. Reference e-books have suffered particularly in environments external to publishers’ own platforms, because these rely on deep-level tagging to enable discovery and use of the content within.

Assuming that publishers quickly embrace the new book DOI recommendations, multi-disciplinary reference may yet regain its “royal status” (see David Tyckoson’s presentation for Booklist on “The Rise and Fall of Reference”) in the digital information environment or, at least, get back to the table as a relevant, high-use player.

Publishers, particularly those who publish journals, have been cognizant of the potential for DOIs in e-book linking. However, with hundreds of thousands of backfile DOIs at the chapter and entry level to deposit, CrossRef’s pricing has been a gating factor until now.

It’s easy to envision that, in an environment in which patrons have access to dashboards that help them create and manipulate personalized information folios — e.g., ebrary’s new DASH, which stands for “data sharing, fast” — more granular linking will provide a significant boost.  No one wants to add an entire encyclopedia to their folio, but individual articles make a lot of sense.

Are there unanswered questions? Yes, particularly pertaining to links for titles hosted in non-primary aggregations. E-books may be hosted in 10 or more different locations and formats. With collections of hundreds of thousands of hosted titles in their repositories, e-book aggregators may lack the incentive to embed granular DOIs that link out to publisher sites.

However, this is an assertive move by CrossRef to help make e-book content — including reference — more parsable, interoperable, and linked.

There are some who believe that if major media outlets can reinstate a paid model for online content, there will be a reversal in the market for skilled journalists—in essence, returning to a subscription or subsidized model for news will provide revenue to publishers, who will hire quality writers and editors once again, and pay them.

But that seems unlikely. In 2009, Gary Kamiya contemplated this exact problem in Salon:

If reporting vanishes, the world will get darker and uglier. Subsidizing newspapers may be the only answer . . .

But the story is more complicated than that. At the same time that newspapers are dying, blogging and “unofficial” types of journalism continue to expand, grow more sophisticated and take over some (but not all) of the reportorial functions once performed by newspapers. New technologies provide an infinitely more robust feed of raw data to the public, along with the accompanying range of filtering, interpreting and commenting mechanisms that the Internet excels in generating.

Journalists and newspapers have never lacked for opportunities to embrace new media technologies. So, what happened to their model? As early as 1999, Scott Rosenberg, also writing for Salon, prophesied in a piece entitled “Fear of Links”:

While professional journalists turn up their noses, weblog pioneers invent a new, personal way to organize the Web’s chaos.

Were the newspapers not listening? Or were they so blinded by their own traditions and cultures that they could not take up the best emerging technologies, and instead struggled ineffectively to retrofit innovation into traditional containers?

Are we suffering from the same myopia?

Publishing institutions face challenges in overcoming infrastructure obstacles and established thought patterns to embrace new ways of thinking about content creation and scholarship. Even a progressive publishing house will find re-tooling to be expensive, painful, and slow.

Companies that don’t have traditional models and structures have the benefit of having less infrastructure drag and can be agile enough to create niche businesses to zero in on market problems—sometimes solving them by irreverent means.  They have the added “advantage” of not embracing romanticized traditions or struggling with crises of conscience when contemplating radical change.

If you’ve watched the progress of Demand Media, you’re aware that Demand has set their sights on building relationships with major media outlets to supplement the content that the newspapers—under extreme financial strain—are commissioning less and less themselves.

Demand’s process for creating and vetting content and for compensating writers is nightmarish for those who have made their livings as journalists, editors, and contributors.  This month’s feature in Vanity Fair notes:

Demand pays roughly $15 for an original, well-written and researched 500-word article. That’s three cents per word, about one-tenth of what a writer would get from a frugal magazine or newspaper. Nevertheless, media professionals are signing up in droves . . .

According to an October 2009 article in Wired entitled, “The Answer Factory: Demand Media and the Fast, Disposable, and Profitable as Hell Media Model,” the process driving content creation is almost entirely technologically based:

Pieces are not dreamed up by trained editors nor commissioned based on submitted questions. Instead they are assigned by an algorithm, which mines nearly a terabyte of search data, Internet traffic patterns, and keyword rates to determine what users want to know and how much advertisers will pay to appear next to the answers.

Are there radically innovative start-ups already moving in our industry? Take a look at SEED Media Group. SEED’s Research Blogging is an aggregated blog platform that provides free access to, and cross-searching of, posts authored exclusively by scientists and scholars who are also scholarly journal contributors. In an open-access environment, making pre-publication research discussion freely available is not necessarily problematic, provided that we can deal with the fact that the material presented has not been peer-reviewed. What we lose in the review process, we make up for in terms of speed, availability, and our ability to customize results to our own interests.

The concept that free blogging by academics—not intermediated by a society or publisher—is suitable for our readership is a new one.  And, it’s potentially disruptive.  If embraced, will this content be additive or will it supplant the demand for something else?

If we allow that we are already seeing new models and new technologies from start-up companies in our own industry, is there room for a company like Demand Media to move in?

I think there is potential for commercial or non-profit companies with models like Demand Media’s in academic publishing.  There are aggregated academic content platforms being developed (Google Scholar and others) that will, in future, be able to leverage enough student usage data to interpret content demand on the basis of clicks.  From there, building a meta-community of academic authors and contributors is not a stretch, and assigning work dynamically on the basis of clicks is also within reach.

Is this a bad omen for our industry?  Not necessarily.  There are positive ways that scholarly publishers, libraries, and vendors can collaborate to leverage these sorts of opportunities.  This may be a signal to us to watch closely, disencumber, and stake out new directions — from within our community.

When my company added six million Google Book Search and Google News Archive links to two of its databases last month, I learned a few things about some puzzling disparities in Google’s treatment of scanned public-domain works.

The two databases — 19th Century Masterfile (NCM) and Public Documents Masterfile (PDM) — are discovery aids that link to many millions of documents, nearly all of which are in the public domain. By adding links to the locations of several millions of these on Google sites, we were able to discern, very clearly, differences in Google’s treatment of 19th century historical and literary materials versus scanned government documents.

Click through a bibliographic record to a 19th century item, and you will be able to download the complete document in PDF or Mobipocket format about 90% of the time. By contrast, click on a bibliographic record linking to a government document, and you will be met in most cases with either no preview, a snippet view, or a limited view. The full text seems to be available only 10-20% of the time.

I have been asking around about this and have been the beneficiary of any number of theories, but have not been able to get any real clarity as to why this discrepancy exists.

The same thing happens when the documents are viewed through a university’s system. When downloading a 19th century document, a scanned version of ‘An Introduction to Greek and Latin Etymology’ from the University of Michigan library, I was struck by the wording of the introduction that Google includes at the front of each of their scans, as follows:

Usage Guidelines

Google is proud to partner with libraries to digitize public domain materials and make them widely accessible. Public domain books belong to the public and we are merely their custodians. Nevertheless, this work is expensive, so in order to keep providing this resource, we have taken steps to prevent abuse by commercial parties, including placing technical restrictions on automated querying.

We also ask that you:

  • Make non-commercial use of the files. We designed Google Book Search for use by individuals, and we request that you use these files for personal, non-commercial purposes.
  • Refrain from automated querying. Do not send automated queries of any sort to Google’s system: If you are conducting research on machine translation, optical character recognition or other areas where access to a large amount of text is helpful, please contact us. We encourage the use of public domain materials for these purposes and may be able to help.
  • Maintain attribution. The Google “watermark” you see on each file is essential for informing people about this project and helping them find additional materials through Google Book Search. Please do not remove it.
  • Keep it legal. Whatever your use, remember that you are responsible for ensuring that what you are doing is legal. Do not assume that just because we believe a book is in the public domain for users in the United States, that the work is also in the public domain for users in other countries. Whether a book is still in copyright varies from country to country, and we can’t offer guidance on whether any specific use of any specific book is allowed. Please do not assume that a book’s appearance in Google Book Search means it can be used in any manner anywhere in the world. Copyright infringement liability can be quite severe.

Immediately, I am struck by the apparently contradictory nature of the message. Notice the phraseology: these are “Usage Guidelines,” not “Terms of Use.” Google asks but does not require. While re-stating that these documents are in the public domain, Google requests that they be used by individuals only, not by organizations or for commercial purposes. The justification for these requests is the cost of the scanning process which, one can only assume, promises some ultimate commercial benefit to Google.

Google (or their lawyers) seem uncertain about the rights that they have to public domain scans. However, my interpretation is that they are clear about their desire to restrict use by other commercial entities or non-commercial organizations.

Without further information from the source, it’s difficult to know why access to government documents is currently noticeably restricted, in contrast with the broad availability of 19th century materials. One is also left to wonder whether Google believes that they have any legal grounds for their “Usage Guidelines,” which seem to defy the very nature of public access.

Meanwhile, it seems Microsoft has plans of its own involving public domain materials. An article in Sunday’s Times Online announced that 65,000 19th century works of fiction from the British Library’s collection will be made available for free public download this spring. With the project cited as “the latest move in the mounting online battle over the future of books,” one wonders whether the British Library’s Microsoft-backed project may be the first in a series of initiatives aimed at reducing Google’s stranglehold on public-access materials.

Given how confused Google appears to be about what “public access” means, some competition in this area might be a good idea.
