
The Online Computer Library Center, Inc., otherwise known as OCLC, is being sued by Jerry Kline, the owner of two companies attempting to compete with OCLC — SkyRiver Technology Solutions, LLC, and Innovative Interfaces, Inc. Kline is alleging anticompetitive practices, exclusionary agreements, and monopolistic market positioning.

But the lawsuit may not be the greatest threat to OCLC’s business.

According to the complaint, OCLC has become a huge company by aggressively acquiring non-profit and for-profit entities since the 1960s, locking up a tremendous amount of market power and making significant revenues coordinating bibliographic information between libraries. In 2004, OCLC’s equity was $138 million. From 2005 to 2008, OCLC was able to generate $17 million per year in surpluses on sales of over $200 million per year. By 2008, its equity had reached $211 million. In 2009, OCLC’s equity devalued in line with most investments in the market, but otherwise, the business performed well once again.

Because OCLC relies on submissions from its users to create its WorldCat database of card catalog entries, it’s also possible to see its troubles slightly differently in a world 40+ years older and more technologically developed. In the old days, it was expensive to coordinate the activities of hundreds or thousands of individuals through a computer network. Now, it’s cheap and trivial.

While OCLC’s collective database is not called “social media,” the bibliographic databases it controls are generated through the independent submissions of librarians everywhere, then resold to the same community at a high price. In 2005, OCLC added wiki features to WorldCat, a clear step into social media.

Much of the library community likes OCLC, at least judging from the comments on a related story in the Chronicle of Higher Education. It would take a precipitating event for everyone to take a second look at OCLC.

It arrived.

Everything was going along swimmingly for OCLC until the Michigan State University debacle, which sowed seeds of ill-will, some of which took root and sprouted. The events started when MSU decided to try to disaggregate OCLC’s offerings, buying only part while getting the rest from one of Kline’s companies for a substantially lower price. OCLC then apparently charged MSU a huge penalty, leading the head of MSU libraries, Clifford H. Haka, to exclaim:

The cooperative is being diminished by a financial decision. We’ve been OCLC members for 40 years — we’re the ones who built this database.

In a listserv posting earlier this year, Tim Spalding laid it out in a way that hints at a culture drifting toward the realization that computer technology is no longer foreign and scarce, and one that is seeing cognates emerge from the land of abundance:

LibraryThing, a tiny little company with 9 employees and a rack of crap commodity servers, has a searchable, continuously updated store of unique records equal to half of the WorldCat database. . . . The real work here is done by librarians, not OCLC. . . . And all that tax money, love and diligence is coopted by an organization in Ohio so profitable that it lost its tax-exempt status because the judge couldn’t discern a charitable purpose in the business of selling data services to libraries. (OCLC’s tax-exempt status was restored by a “special bill” of the Ohio legislature.)

The presence of LibraryThing has arguably led OCLC to add social features to its site. As this convergence with social media inevitably continues, a future more akin to Wikipedia’s present, Apache Software’s present, or Linux’s present may be in the offing. Library catalogs seem highly amenable to the open source approach.

Is OCLC a leftover from the era in which computing power and connectivity were scarce and expensive, and when contributing “love and diligence” to this rare bit of infrastructure was really the only alternative? Or is it something more?

How this case is decided may be less important than how users view OCLC — as a database provider with a precious resource that can’t be easily recreated, or as a social media collective that can be moved if needed. If the latter, OCLC’s story (and Kline’s as well) may take a dramatic turn in the months and years ahead — a more radical shift than any court could ever impose.

Kent Anderson


Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.



18 Thoughts on "OCLC: Indispensable Database Collaborative or Social Media Prelude?"

“In the old days, it was expensive to coordinate the activities of hundreds or thousands of individuals through a computer network. Now, it’s cheap and trivial.”

“Cheap and trivial?” Are you a programmer? Keeping all these OCLC services running for thousands of libraries is not cheap, and it is not trivial.

Is OCLC charging too much for its services? Probably. Is what they do cheap and trivial? No.

Do Spalding and Kline gain if OCLC loses? Yes.

Do libraries gain if OCLC loses? Maybe. Maybe not.

You’re telling me this on a blog that costs us $50 per year to run, that reaches around the world, and that has all the publishing tools I need to embed video, etc.?

Compared to 1967, everything OCLC is using is cheaper and more trivial to scale than it was even 10 years ago — connectivity, user input, databases, etc.

I’m sure you could never make and maintain an OS this way. Oh, wait, Linux.

I’m sure you could never make and maintain the world’s most prevalent server software this way. Oh, wait, Apache.

I’m sure you could never make and maintain the world’s most popular encyclopedia this way. Oh, wait, Wikipedia.

Yes, cheap and trivial comparatively. LibraryThing is the Craigslist of card catalogs, and it’s a threat to OCLC.

If you want more sensible and illuminating comparisons to OCLC, look no farther than CrossRef and CCC. The comparisons to Apache and Wikipedia are silly, for the reasons cited by Adam.

I won’t comment on OCLC’s legal quandary. I’ve got mixed feelings on the matter like many librarians I know.

I’d like to point out another alternative, however: the Open Library project. It’s one to watch due to its technical architecture. They are very cognizant of the linked data approach and make their content available via APIs. Like LibraryThing, it has the “social media collective” portion of the equation. Like LibraryThing, it is “easily recreating” the “precious resource” of a shared database (ok, not so easy, but for argument’s sake they ARE doing it…)

Interesting times indeed.

I think you misunderstand why LibraryThing is a threat to OCLC. For a variety of complex reasons, it doesn’t threaten OCLC’s cataloging business, any more than Craigslist threatens the Times news business. However, look where OCLC is trying to move. It wants to be the library’s web presence and traffic wrangler. LibraryThing is already doing that, and doing a nice job of it, too. It’s the new business model that is threatened, not the old one.

I may misunderstand, but I view catalog listings as more akin to classified ads — simple, descriptive, structured listings that are quite portable. If so, they will find a new, less expensive, less exclusive, commodity-friendly place to live in this age. I think that’s inevitable. But I also agree that LibraryThing is cutting OCLC off at the pass.

Perhaps a better way to think of this is that OCLC provides authority, curation and quality control for catalog records, very similar to the way that a journal publisher provides curation and quality control for journal articles. The business struggle between OCLC and say, OpenLibrary, is exactly the same as the struggle between traditional scholarly journal publishers and open-access journals.

The deeper question is whether authority, curation and quality control in cataloging is sufficiently valuable to sustain OCLC’s cost model, and whether OCLC is doing that job well enough to survive.

I still find it hard to accept that librarians aren’t the people providing the authority, curation, and quality control for catalog records, and that OCLC is a survivor from a day and age when using computers to store and coordinate records was hard to do and expensive in the extreme. The business struggle is not the same. How many catalog entries are rejected by OCLC? 45%? 75%? Hardly. The analog doesn’t work for me at all.

I think the deeper question you pose is headed in the right direction, but again, who is providing the authority, curation, and quality control?

So, that’s active, human involvement in rejecting records, or just deduping done by a computer?

Most records are deduped algorithmically; it’s a large software investment to do so, as you might imagine.

The software “investment” is one or two really smart guys for a year or two (i.e., $500K, or less than half the CEO’s annual salary), and the money would have been spent years ago. Recreating the software today would be cheaper and easier. The computing resources to run these algorithms are getting cheaper all the time: just whip out your credit card and spin up a Hadoop cluster on Amazon’s EC2 cloud.

It’s not just algorithms, but a combination of algorithms and human feedback (which in turn improves the algorithms). All the human feedback can come from the catalogers and users in the course of their daily use.

They might be able to hoodwink folks for weeks or even months longer, but the jig’s up. Anyone with any technical savvy knows that there’s no black magic (or major expense) to it.

Tom- That’s exactly why I suggest using CrossRef as a fairer comparison. Your numbers are low by an order of magnitude, but that doesn’t affect the conclusion that if you were to reproduce OCLC cataloging from scratch, it would cost a lot less.

The difficulty of dismantling the social practice around OCLC cataloging is another story entirely.

Ignore the naysayers. Tim Spalding is correct. WorldCat is a crowdsourced application where libraries not only contribute the data, but pay OCLC for the privilege of doing so.

Read this Library Journal article for a good rational analysis including a comparison of OCLC’s million dollar executive compensation levels compared to other non-profits.

Kent says he thinks catalog records are “simple, descriptive, structured listings” which is correct from a computing perspective for records manipulation and transfer. From a content creation perspective, it is far from simple to add information to the fields within those records. And the same holds for programming with the content of those fields (metadata interoperability issues anyone?)

Kent asks, “who is providing the authority, curation, and quality control?”

It is the librarians within the cooperative. They apply standards overseen via extra-OCLC governing bodies (see PCC, BIBCO, NACO, CONSER, SACO et al.) to deal with quality control on the record content. On the records manipulation and transfer end of things, there is MARBI, which governs the use of the MARC standard structure, and there is the Z39.50 standard. Librarians work within all of those bodies.

OCLC does provide quality control of record content in the realm of standard database maintenance tasks (i.e., de-duping records, globally updating fields when standards evolve, server admin tasks, etc.). Members are paying for the technical stewardship, not the content stewardship. A subtle and blurry but important distinction.

Eric Hellman gets it right. OCLC cataloging is very akin to the scholarly publishing system, where journal publishers provide curation and quality control on the backs of “volunteer” editorial and peer-review contributions of faculty. OCLC is providing the mechanisms of quality control but not the record editing and peer review. That is done on the backs of librarians. Eric was not characterizing OCLC as being responsible for the record content authority control and curation but as a service provider.

You don’t need to “accept that librarians aren’t the people providing the authority, curation, and quality control for catalog records.” They are. And it’s not a part of OCLC’s cost model. Librarians are providing that labor “free” to OCLC.

I do agree with Kent in that OCLC is a closed system. Bibliographic metadata does need to be open. It’s rather facile to imply that a larger scale of crowd-sourcing is going to reduce costs. Quality control of individual records’ content *may* improve with more eyes on the records and easier ability for anybody to edit. Anybody using the records, however, is still going to have to do some quality check if they hope to have metadata which plays nicely with their systems. The costs of quality control and authority will not go away. They are going to be distributed more than they already are. And they will be borne when the records are used instead of when they are created.

What will change is the ability to share & re-purpose the records. Other communities outside of publishing & library land can get into the bibliographic data game and build services based on our records. Just where do you think FreeBase and Squared are getting their data from?

Librarians can decide where they want to put in their free labor. OCLC is no longer the only game in town and they know it, hence the change in business model. Lucky for them, librarians (especially cataloger types) can be slow to evolve.

We have had a similar ‘monopoly’ from the Australian National Library’s database of Australian library records. The fact is when it began – no other organisation had the capacity, technology or foresight to create such a thing. So, on the one hand any organisation that began such an enterprise 40+ years ago deserves recognition. On the other hand, as the world has moved on we (in Australia) still pay for the privilege of using & contributing to this database.
While LibraryThing is a viable alternative for small libraries (yes, I am a user) it does not embed the bibliographic / cataloguing standards in the way that Aust Nat Lib & OCLC do. I would imagine that in the US OCLC is still the first port of call for good bibliographic records, just as the National Library is here. The wake-up call for both kinds of organisations is here – how do they use that long-term experience & quality of records properly in the 21st century?

Any update about the Lawsuit filed by the Online Computer Library Center, Inc.?

Interesting that Scholarly Kitchen is re-running this article. I shall repeat some of the sentiments I posted in August 2010.
Library Thing is wonderful, but the quality & uniformity of bibliographic entries is lacking – especially if the only entry you can find for a book you’re entering is AMAZON rather than a library. The ‘Social Media’ aspect of OCLC has been around for a while. Librarians are attracted to Library Thing and anything else that operates similarly, because our founding principles are sharing and collaboration. Surely OCLC was originally set up by and for librarians? Perhaps when companies get really huge they forget their roots.
