Alix Vance joins the Scholarly Kitchen

Please join me in welcoming Alix Vance as a new chef in the Scholarly Kitchen. Alix is currently President at Paratext, a company that provides bibliographic databases and technological services to the academic, public and special library markets in North America and throughout the world. Prior to this, she worked at CQ Press and SAGE, eJournal Press, and the Chronicle of Higher Education. She’s currently on the Board at the Society for Scholarly Publishing.

Alix’s first post appears today.

Elderly People sign
Image by bensons via Flickr

How quickly things change. The recent Pew study on social media adoption, which I blogged about here last week, showed that, as Nicholas Carr puts it, “blogging is now the uncoolest thing you can do on the Internet.”

His rationale? Teens don’t do it, so it’s not cool:

When I blog these days, I feel like I should be sitting in a rocking chair, wearing a highly absorptive undergarment, and writing posts debunking some overhyped new bunion treatment (iPads?).

It’s a funny post with a point — blogging is long-form communication. People with something to say do it. People who are just building an online social life (“ambient intimacy”) can use shorter signals to accomplish it.

The topic also arose at last week’s PSP meeting, in that students tend to be more conservative about their public information production because they are concerned about career advancement, while older professionals are more secure and interested in exploring boundaries. So, the relative privacy of Facebook and texting and Twitter is probably also a factor.

Through the personae of John Maynard Keynes and F.A. Hayek, a rap video by George Mason University economist Russ Roberts and Spike TV executive John Papola explains the two major competing hypotheses around boom-and-bust cycles. It’s a great primer on economics, and definitely fun to watch.

A sidelight is how this video gained prominence and legitimacy through an NPR story. It seems to me that NPR has been drawing more from the social media space recently, in what strikes me as a concerted effort to find stories that have that particular edginess — or is it a natural side-effect of the fact that NPR has done a magnificent job of reinventing itself as a modern news entity, with great offerings in audio, text, and online?

But even NPR needs to legitimize its choice (securing, as they put it, “street cred”) when it comes to rap (surprising, since rap and hip-hop have been around for more than 30 years now), so they had to ask Ke$ha if Roberts and Papola are good rappers.

Ke$ha verified they are.

Happy Friday!

A Chicago-style hot dog

Sitting in the audience awaiting the start of the first Oxford-style debate to be held at a PSP meeting (“Current US Copyright Law Excessively Restrains the Development of Intellectual Property”), Michael Clarke and I joked about how Oxford has debates while Chicago has hot dogs and elections (Chicago-style hot dogs have more vegetables and the distinctive taste of celery salt, while Chicago-style elections are notorious for the dead rising from their graves long enough to mark a ballot before mysteriously returning to the moldering ground).

Moderated by Silverchair’s Thane Kerner, the panel included Lawrence Lessig, William S. Strong, Richard Baraniuk, and Allan Adler. The audience was able to vote using a cool text-message polling system that displayed live results as the debate was conducted.

The rules governing an Oxford-style debate are almost as complicated as the ingredients list for a Chicago-style hot dog.

Richard Baraniuk started FOR the proposition, arguing that copyright holds us back because it is inefficient for knowledge exchange, and the inefficiency scales as concatenations of content grow in value and complexity. In contrast, open licenses let value scale independently, eliminating inefficiency (at least that generated by having to secure legal permissions). In a networked world, these inefficiencies become more glaring. He also disputed the claim that copyright helps authors monetize content. As an author himself, Baraniuk said that authors don’t usually make money, especially academic authors. So authors have no real economic interest in copyright. In fact, it disincentivizes authors because it caps distribution and impact.

Bill Strong then responded AGAINST the proposition, but did not use slides. I initially thought this set him at a clear disadvantage, but it actually seemed to work in his favor. He made a number of points about the importance of copyright in an age of increasing government power — essentially, when government becomes more powerful, the protections for speech must remain so that communication vehicles can become more powerful as an offset. Strong noted that the incentives of scholars differ from those in more commercial parts of the copyright industry, so Baraniuk’s perspective is inadequate. Strong asserted that tailoring the copyright law to suit particular types of material is a slippery slope to tyranny. He also noted that our publishing industry is the strongest in the world, and believes copyright is a main reason for this. While copyright may not be perfect for every scholarly application, tinkering with it could lead to untoward and dangerous consequences.

Strong also drew a distinction between the incentives to create and publish a work and the motivations to repurpose it. Copyright plays a role in both, and by its very existence provides authors with an infrastructure for publishing (it enables a publishing environment, for instance) and the safety of knowing how their works will be reused.

Lawrence Lessig then took the stage, and called Baraniuk a Communist Texan, throwing him under the bus. I can’t capture the energy of his talk, but it centered around the word “excessively,” since he fundamentally agrees with the premise without the word “excessive.” Therefore, the argument is a relative argument, not an absolute argument. His examples were about the excessive complexity of tracking down copyright (75% of works can’t be tracked down for copyright), and the system’s complexity was his complaint. This over-regulation chills exchange of information because the regulation is implemented so poorly. Since copyright predicates itself on a “copy,” and everything digital makes a copy, the framework is wrong, and this is what’s driving the “excessive” restraint.

Allan Adler then took the stage and took up the “efficiency” gauntlet, noting that the Constitution was designed to be inefficient because its framers equated efficiency with tyranny. He then asked whether any harm from the supposedly excessive restraint of copyright can be demonstrated. If there is no harm, it’s not excessive. So who or what is harmed? Is it the public? The culture? Scholarship? Is it the loss of availability of opportunity (so, an opportunity cost)? Is it creativity that suffers? Adler then reflected on the history of copyright law, which has never created a fair and open playing field. Yet absent a “lost Camelot of copyright law,” how was this balance before the current age achieved and how can it be restored? Ah, but it never existed. Adler pointed to evidence that refutes the assertion: look at how many new works, how many new devices, how many new businesses, and how many creative ideas are emerging right in front of our eyes. How can this be “restraint”? How do we establish “acceptable” restraint (pointing to Lessig’s argument)? Would the process be any different than the one that led to current legislation? Safeguards in current copyright prevent it from being excessive. For instance, creativity (even a “scintilla of creativity”) can distinguish a work from another. Commonplace ideas aren’t protected. Characters are only protected when they are carefully delineated characters, not stock characters.

Kerner then moved the panel to an interrogatory period, asking Lessig and Baraniuk if their concept of copyright was to create very granular carve-outs. Baraniuk struggled to answer the question. Guided into an answer a little, he reflected on the fact that copyright is too complex. Lessig jumped in, and argued that the business model of mass media is what’s supported by current copyright law. Therefore, yes, more carve-outs would need to exist to manage the complexity, which Kerner rightly pointed out sounded like more complexity, not less.

Strong talked about how non-intuitive copyright law can be, and how education can help. But complexity is not incoherency, Adler added. He then talked about what copyright doesn’t protect — data, concepts, ideas. And because any law stemming from the Federal jurisdiction is complex, the complexity of copyright law isn’t unique. It occurred to me that free speech law is also complex.

Lessig responded that while you can write this off as educational gaps or the like, the fact is that overly complex systems cry out for us to do better. He believes we can do better. What we have is “unbelievably corrosive” to how our culture works, Lessig asserted passionately, stating we should simplify it so it doesn’t “criminalize our kids.”

Strong’s closing comments centered on a lack of empirical evidence that creativity is being stifled, a point Adler made as well.

Baraniuk commented on the millions of works published using Creative Commons as evidence that there’s dissatisfaction with copyright law. He also noted that licenses aren’t incompatible with business.

Adler mentioned how arbitrage financing, investment banking, or other areas of endeavor are complex. Complexity comes with the territory there, and it does with copyright. He then rattled off a litany of increases in output of new titles in movies, music, books, and other media. He cited the 2009 National Arts Index, which measures “the vitality of the arts” in our culture. The latest index listed expansions on all fronts — artists, art businesses, photographers, etc. — up about 19%.

Lessig then closed with the example of how documentary films are essentially shut out of being burned to DVD because getting rights to earlier works is too expensive and complex. To him, this is just more evidence of copyright law being “excessive,” and we should find a way to make it work better.

So, how did the audience end up voting?

51% agreed with the premise — copyright law is an excessive restraint (18% agreed to start)

41% disagreed with the premise (50% disagreed to start)

8% were undecided (32% were undecided to start)

And, no, it wasn’t a Chicago-style election — the voting system prevented you from voting twice. Despite that, I’m convinced an undeclared number craved celery salt.

Plastic Logic e-reader
Image by Jeremy Toeman via Flickr

As we’ve mentioned plenty on this blog lately, the e-reading space is just about to overheat, and it’s a fascinating turn of events. At the 2010 Professional and Scholarly Publishing (PSP) meeting, a session created a “petting zoo” environment, with various e-reader device manufacturers and sellers showing their wares using an ELMO projector and live demonstrations.

From the Nook to the Skiff to the Plastic Logic Que to the Entourage eDGe, they were all here. Most were black and white e-ink-based slabs. They varied in size (Nook is paperback-sized, Que and Skiff are magazine-sized), purpose, and target audience. The Que is meant to replace the stack of papers in your briefcase. The Skiff is flexible because it’s on metal foil, not stiff metal.

None were particularly fast. The Que was interesting because the demonstrator kept reprimanding herself for doing something wrong when the device didn’t do what she wanted (“oh, I get impatient” or “sometimes I swipe too fast”), suggesting the UI bugs aren’t all resolved yet.

There’s definitely a shake-out coming in this space, and the elephant in the room was the iPad. With money tighter than usual, people are going to choose carefully. This petting zoo felt like different breeds of the same beast, and none seemed to be the potential workhorse the iPad might be.

I’ll say it once again — things are probably evolving too quickly for all of these players to live to see 2011.

Demonstrations of reading software that included interactive and multimedia elements followed, and they were very compelling and interesting. You could feel people in the room lean forward. This level of engagement only cemented in my mind the importance of color, audio, a lot of working memory, and robust connectivity to the future of exciting e-reading — aka, things like the iPad.

I was asked to provide the wrap-up for the Professional and Scholarly Publishing 2010 pre-conference, put together by the Electronic Information Committee (of which I’m a member). It was a session with some real insights, and I’ve tried to capture many of these in this slide deck. Unfortunately, making sense of many of these slides requires having heard the day’s presentations. But this is a peek into what went on yesterday in snow-fearing Washington, DC.

Data Center Storage
Image by Waleed Alzuhair via Flickr

There are movements afoot to create an era of open data standards, with proponents arguing that publishers should be doing more to support open data. Governments, visionaries, and technologists are all promoting the seemingly wholesome and harmless notion that direct access to the underlying data is virtuous and necessary, and by using the term “open,” the illusion is that all we have to do is stop keeping it closed, and the data will flow without a problem.

In an interesting post written from bitter experience, Nat Torkington of O’Reilly’s Open Source Convention confesses that the open data vision is incomplete because it’s been built on technological enthusiasm while overlooking the very real barriers to making data not only open but usable:

  1. Funding for data continuity and marketing
  2. A use-case to ensure the right data are being created in the right way for the right people

Torkington talks about the problems his teams have encountered in repatriating data from ad hoc, internal, or idiosyncratic systems into robust, normalized, and accessible data repositories. Believe it or not, this costs money. Most researchers build data systems for the grants they have, and their grants only support the data for that particular study. Once a study is done, the spreadsheets shut down, and there is no budget for maintenance, updating, or migration of the dataset. There’s also no overall infrastructure or standard that makes one data set able to interface with any other. Each data set is an island. As Torkington puts it:

. . . it costs money to make existing data open. That sounds like an excuse, and it’s often used as one, but underneath is a very real problem: existing procedures and datasets aren’t created, managed, or distributed in an open fashion. This means that the data’s probably incomplete, the document’s not great, the systems it lives on are built for internal use only, and there’s no formal process around managing and distributing updates. It costs money and time to figure out the new processes, build or buy the new systems, and train the staff.

So, there’s no money to make the data come together. Next, there’s no money to let users know the data exist and are available. This is a marketing problem, but a real one. As Torkington puts it:

There’s value locked up in government data, but you only realise that value when the datasets are used. Once you finish the catalogue, you have to market it so that people know it exists. Not just random Internet developers, but everyone who can unlock that value. This category, “people who can use open data in their jobs” includes researchers, startups, established businesses, other government departments, and (yes) random Internet hackers, but the category doesn’t have a name and it doesn’t have a Facebook group, newsletter, AGM, or any other way for you to reach them easily.

Torkington lists five “different types of Open Data groupie,” and his experience in the area shines through:

  1. low-polling governments who want to see a PR win from opening their data
  2. transparency advocates who want a more efficient and honest government
  3. citizen advocates who want services and information to make their lives better
  4. open advocates who believe that governments act for the people therefore government data should be available for free to the people
  5. wonks who are hoping that releasing datasets of public toilets will deliver the same economic benefits to the country as did opening the TIGER geo/census dataset

I encountered an Open Data advocate late last year at the Online Information meeting in London. He and a group of like-minded people have started a new initiative called DataCite:

The objectives of this initiative are to establish easier access to scientific research data on the Internet, to increase acceptance of research data as legitimate, citable contributions to the scientific record, and to support data archiving that will permit results to be verified and re-purposed for future study. DataCite will promote data sharing, increased access, and better protection of research investment.

By taking the position of citation relative to data, this initiative seems to skirt the problems Torkington has identified, and that’s not a promising sign. What will it matter if data are citable if they’re idiosyncratic, not maintained, isolated, and serve no clear purpose beyond the study for which they were generated?

Ultimately, Torkington’s model of some promising approaches to Open Data seems very familiar — know why you want to create the data, identify a group who can use the data, build community and data simultaneously, then create useful applications of the data. Unspoken is the fact that these useful applications of the data would probably be what would generate the revenue to maintain the data and elaborate upon it.

At yesterday’s PSP meeting in Washington, DC, I used some of the information I’d gathered for this post during my wrap-up session for the pre-conference. Members of the audience with experience reviewing and publishing large data sets mentioned how reviewing reams of data and preparing them for publication seems likely to dwarf article peer-review requirements in both the time needed and the intensity of effort. Yet, everyone expects data coming from publishers to be usable.

The culture of free often overlooks the real costs the real world creates outside of our own hopes and dreams.

Things happen quickly these days! Thanks to comments from David Crotty with nice embedded links, two updates emerged last night I think people will be interested in. I wanted to use this post to highlight them beyond the comment threads:

Orycteropus afer - stuffed.
Image via Wikipedia

Yesterday, a new search engine made some waves on the Inter-nets. Called Aardvark, its splash was accompanied by a move borrowed from Google’s playbook — publication of a paper outlining the theory behind the service, which was submitted to and accepted by WWW2010, this year’s version of the same meeting at which Sergey Brin and Larry Page presented their paper entitled, “Anatomy of a Large-Scale Hypertextual Web Search Engine” back at WWW1998.

The paper Aardvark’s team published is called “Anatomy of a Large-Scale Social Search Engine.” And the URL of the search engine itself? It’s http://vark.com.

The metaphors of Aardvark vary from those of Google. Instead of a Library metaphor centered on documents, with authority derived through document linkages, Aardvark uses a Village metaphor centered on people you know, with intimacy and knowledge generating trust.

Aardvark has actually been around since 2007, but didn’t really get out into users’ hands until this past summer. According to TechCrunch:

  • As of October 2009, Aardvark had 90,361 users, of whom 55.9% had created content (asked or answered a question). The site’s average query volume was 3,167.2 questions per day, with the median active user asking 3.1 questions per month. Interestingly, mobile users are more active than desktop users. The Aardvark team attributes this to users wanting quick, short answers on their phones without having to dig for anything. They also think people are more used to using more natural language patterns on their phones.
  • The average query length was 18.6 words (median of 13) versus 2.2-2.9 words on a standard search engine.  Some of this difference comes from the more natural language people use (with words like “a”, “the”, and “if”).  It’s also because people tend to add more context to their queries, with the knowledge that it will be read by a human and will likely lead to a better answer.
  • 98.1% of questions asked on Aardvark were unique, compared with between 57 and 63% on traditional search engines.
  • 87.7% of questions submitted were answered, and nearly 60% of them were answered within 10 minutes.  The median answering time was 6 minutes and 37 seconds, with the average question receiving two answers.  70.4% of answers were deemed to be ‘good’, with 14.1% as ‘OK’ and 15.5% were rated as bad.
  • 86.7% of Aardvark users had been asked by Aardvark to answer a question, of whom 70% actually looked at the question and 38% could answer.  50% of all members had answered a question (including 75% of all users who had ever actually interacted with the site), though 20% of users accounted for 85% of answers.

Of course, I had to try it out. Given the fact that the link from their blog to the original Google paper landed me on a confusing page that wanted money for the paper, I thought a fair test would be to ask Aardvark where I could find a free version of the Brin/Page paper. Aardvark is supposed to specialize in natural language search, so this would be another thing to test. In brief, here’s how it works:

  1. You go to http://vark.com and enter a search statement or query
  2. You’re asked to either connect in with Facebook Connect or register (I used Facebook Connect)
  3. Once the Facebook handshakes are done, you see who else you know who is also on Aardvark (Jill and Mitch for me)
  4. It then asks you whether you know some other people (I knew one), and then you provide a topic you think you know about (I chose cycling)

I entered my question in natural language (“Where can I find a free copy of ‘Anatomy of a Large-Scale Hypertextual Search Engine’?”) and waited. Query results are delivered via email.

About two minutes later, I received an email from Renaud, a 33-year-old male from Geneva, Switzerland, who suggested I search Google Scholar for the paper I was seeking.

I almost fell out of my chair laughing. Talk about honesty! Of course a social search engine will tell you to search Google! I loved it.

However, he was wrong. I searched Google Scholar for the paper, and could only find a version I could buy. Renaud hadn’t noticed my request for a free version. So, I responded that his answer wasn’t helpful. About 30 seconds later, he responded again, this time with a link to a free version of the paper at Stanford. About 45 seconds after that, another guy, this one from North Carolina, responded to my first inquiry with that same Stanford link.

Overall, it felt fine to get results like this, but it didn’t feel intimate or social. I didn’t know either of these guys, nor was it clear why they received my query. They might just be earning a few pennies every time they get a positive response. (A later question — “What’s a great Valentine’s Day gift idea?” — was answered by a 27-year-old male in the Philippines, who recommended “the one that the receiver will love and will remember you best! a hand made gift is superb!” Oh-kay . . .)

Another male (are all Aardvarks male?) from Venezuela gave me this answer for a good Valentine’s Day gift idea:

Well, that varies greatly from person to person. Usually, the best gifts are small. I mean, a friend of mine once said that her now ex-boyfriend would either do something great as a gift, or do nothing, and she wished that he would realize that sometimes little details are far more effective. What she means is that, well, maybe a hand-written letter could work. Maybe a home-cooked meal. It doesn’t have to be something big to be something great and worth remembering. Cheers!

I really enjoyed how this answer involved a couple that had broken up. I mean, social search may be worth it just for the pathos!

I can see something with Aardvark’s workflow functioning especially well in the mobile space, when you just want to file the query, and waiting for a response fits the mobile lifestyle. It seemed kind of retrograde with a full browser in front of me, but not that bad.

The user interface of Aardvark is nice. The emails are fun to use, and the feedback loops make sense. I haven’t been asked to answer a query yet. It will be interesting to see how that part works someday. Some people are already speculating about “participation fatigue” being Aardvark’s Achilles heel.

It’s not clear if Aardvark will be assembling these queries into a database of answers, but the site does log questions you’ve asked and answers you’ve given, so it seems like the pieces are there.

Also, I wonder how things like search engine optimization (SEO) will work with social search. How do you engineer people to prefer something specific? Or maybe it’s OK if SEO is threatened. It’s an expense line we could all do without, isn’t it?

Aardvark seems to realize the picture of Facebook as a search engine that people have speculated might lie at the heart of that company’s commercial dreams. Yet Facebook is allowing Aardvark to use Facebook Connect to populate its service with users. Is there an acquisition coming?

Ultimately, 12 years after Google debuted, and a decade of huge profits and a bevy of interrelated systems later, can any search engine create enough of a competitive edge?

Let me go enter those questions into Aardvark. I might get an answer emailed to me in a few years, once the answers are known.

(Thanks to HR for the pointer.)

Diagram of simple (dead-end) filtration. Overs...
Image via Wikipedia

I think by the end of this post, you won’t think of your editorial filter in quite the way you did when you woke up this morning.

The metaphor of a filter has informed our thinking about information ever since Alvin Toffler popularized the concept of “information overload” in the 1970s. We scholarly publishing types take filtering very seriously. Journals filter out the dross, and editors filter out errors. Our pages are as high-quality and error-free as possible. For editors who eliminate errors and reject unwanted papers, filtering is a private, one-time, reductive process — we confidentially reduce the amount of information to only allow through the highest quality, eliminating the rest.

The junk is filtered out before the public sees it.

At least that’s how we think about it.

Yet there are changes the networked world introduces to our concept of the filter, and they dance together in interesting ways:

  1. Everything that’s published in the networked world is just a click away from any other resource.
  2. In the macrocosm of scholarly publishing, very little is ever really filtered out anymore. Any author with a little bit of persistence can get published and included in major indexing services and online searches.
  3. Many of the filters no longer eliminate information, but rather (obviously or inadvertently) add information.
  4. Filtering is no longer a private activity but a public, participatory activity.

In an interesting post by David Weinberger on Joho the Blog, Clay Shirky’s idea that “[i]t’s not information overload, it’s filter failure” is extended to introduce the notion that filters are no longer silent, private, and reductive. Instead, more and more are public, verbose, and increasing the size of what’s filtered.

Take this blog post, for example, which filters information by selecting and contextualizing it, just like a journal in some senses. I scanned a number of blogs and news items over the past few days, but the link above is what I wanted to share with you.

Now, because my blog will ping David’s blog, there will be a pingback. Many systems will register that pingback, and it’s potentially important. I filtered out a host of things I didn’t think you’d care about, but by choosing one, I have increased its reach and connectivity. I’m no longer isolated from it. Nor are other filters, and they know our linkage now. When Google indexes this site and David’s, it will use the link from here to Joho the Blog to help rank David’s blog as authoritative. You may add a comment to this post. We may debate the merits. This type of interactive filtering in plain sight of the community and the network only adds more information to David’s original post and to the Scholarly Kitchen as a filter. In fact, the more we debate in the spirit of getting the filter right, the larger the resulting information context around this single linkage — new words, new links, new ideas.

When filtering was private and isolated around the single chunk (the article), this didn’t happen unless the editor knew it was happening. Now, it happens without us realizing it, and the filters in the network concatenate it all rapidly into new modifications to be applied immediately.

The filter doesn’t work the same way it used to.

In the networked information space, filtering can add information in new ways. Google filters to the top of its rankings the most authoritative sites for a particular search query. Your contribution of the search term adds information to Google, driving not only its search filters but also its advertising system, its zeitgeist, its auto-suggest, its analytics, and other systems at Google and beyond. Your attempt to filter the Web through search added information to it. Google and others know how to turn filter use and refinement into ongoing business advantages.

Filtering is a dynamic system in the networked world.

This is a fundamentally different filtering system than the ones we’re accustomed to. And it consumes things in a way that shows how porous our traditional editorial filters are, even when we think they’re tight.

Our coarse, article-level filters aren’t suited to the current filtering environment. Why? Because we don’t apply the only filters, the fastest filters, or the finest filters. By comparison, our filters are light, slow, and non-recursive.

With coarse editorial filtration in an information world of abundance, it’s clear that traditional filters are potentially minor and brief impediments. And now we get to why the macrocosm of lots of papers matters more than it used to.

Many journals have studied what happens to rejected papers, and — no surprise — find that rejected papers usually get published somewhere, in some form. With more author-pays publishing, what used to be the small chance of getting published in a journal has probably reversed, and now there’s only a small chance that a slightly persistent author won’t get published in a journal.

So, while a publisher may be proud of its local filter — a journal’s article rejection rate, for instance — the fact is that the ecosystem allows for nearly universal publication. And the ecosystem is now linked and networked, everything just one click away.

Of course, your filter keeps those bad articles out of your journal, so you can rest easy. Your brand isn’t contributing to the prominence of bad articles elsewhere.

Really? Or do your filter’s relatively wide pores inadvertently let through network amplifiers?

Let’s say you just accepted a really good manuscript that cites a paper you rejected, even one that went way down the food chain, from your perspective. Lo and behold, the reference links to the paper which you (and maybe many others) rejected. A citation service makes sure the link works well. Suddenly, the rejected paper and its journal are more authoritative because your good journal threw it a reference. Your filtering process threw off a spark that lit up part of the network. You just increased another journal’s authority in Google. Your filter wasn’t fine enough to catch this loan of credibility.
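To make that loan of credibility concrete, here is a minimal sketch of a simplified PageRank-style calculation, the link-based ranking idea Brin and Page described in their paper. It is an illustration under stated assumptions, not Google’s actual ranking system: the six-node graph and every node name in it are hypothetical, and in a graph this small the size of the shift is exaggerated. Only the direction of the change is the point.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each node to the list of nodes it links to.
    All node names used below are hypothetical.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the small "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node passes its damped rank evenly to the pages it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outbound links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Before: the accepted paper links only back to the journal that published it.
before_links = {
    "site_a": ["top_journal"],
    "site_b": ["top_journal"],
    "top_journal": ["accepted_paper"],
    "accepted_paper": ["top_journal"],
    "small_journal": ["rejected_paper"],
    "rejected_paper": ["small_journal"],
}

# After: the accepted paper also carries a reference link to the rejected paper.
after_links = dict(before_links)
after_links["accepted_paper"] = ["top_journal", "rejected_paper"]

before_ranks = pagerank(before_links)
after_ranks = pagerank(after_links)

for name in ("rejected_paper", "small_journal"):
    print(f"{name}: {before_ranks[name]:.3f} -> {after_ranks[name]:.3f}")
```

In this toy graph, the single added reference raises the score of both the rejected paper and the journal that eventually published it, which is the spark described above.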

Your filter is tuned to papers, not to the network. If it were tuned to the network, you might have rejected that reference, knowing its effects on a paper you rejected.

In the old days, this citation would have meant you’d increased the impact factor of that other journal by the tiniest amount. That effect was slow to be felt, and isolated to one measure. Now, the effect happens instantaneously, and it gets networked. It most likely stays in circulation for longer than an impact factor’s two-year window, and the link to the other journal will persist.

This is just one example of the ways that what we call “filtering” now extends information instead of reducing it. In the networked digital environment, information links with other information at tiny points we don’t currently really deal with. Is the article good? We’ll accept it. Is each reference worth allowing into the information expansion machine? That’s a new question.

And think about how often the most competitive journals cite each other. Are they just SEO-ing each other, swapping context and brand authority in the network at a fairly high rate? They may compete for papers, but are they really competing in the network? If one were smart, it would prohibit citations to the other, slowly depriving it of borrowed authority, leaving it to fend for itself in the network, isolating it.

It’s the opposite of the citation-packing scandals of a decade or more ago. Instead of packing your journal with self-citations, you want to eliminate citations to your competitors.

By focusing on the power links have in the information economy, filtering (as in “eliminating junk”) becomes a less clearly effective act in scholarly publishing, which focuses on the articles, not the links or comments or other network drivers. We might want to do more granular filtering, realizing that legitimacy and prominence aren’t accomplished solely (or even primarily) through brands, impact factors, article selections, and reputations.

The papers we once rejected now have a back door, passing through our coarse article-centric filters and straight into networked authority systems, networked linking systems, and the myriad filtering systems (news reports, blogs, society sites, tweets) that actually expand them. Instead of small effects, the network amplifies and extends the effects of these traditional points of borrowed legitimacy while introducing a whole range of new ones.

Do you think about filtering differently now?
