One talk that caught my attention, entitled Data, Tech and the Rise of Digital Scholarship, was presented by Scott Lachut, Director of Research and Strategy at PSFK Labs, a brand and innovation strategy consultancy. Scott gave us an elegant overview of new tools available to help us consider how publishing may better incorporate technology in the context of a connected society. It is sobering to realize how accustomed we have become to being socially connected.
A few fun facts:
500 million tweets are sent every day
100 hours of video are uploaded onto YouTube every day
2.5 billion pieces of content are shared on Facebook every day
There is an increasing connection between content and people. Business Insider reports a significant increase in the market for wearables – gadgets like smartwatches, connected fitness bands, and smart eyewear. The wearables company Fitbit recently raised $43 million in financing. Business Insider speculates that global annual wearable device unit shipments will cross the 100 million mark in 2014, reaching 300 million units five years from now. Gartner reports that while there were 2.5 billion connected devices in 2005, by 2020 there will be over 30 billion.
According to The Guardian, 90% of all the data in the world has been generated over the last two years, but less than 1% of this information has been analyzed. The question for academic publishers and societies is one of comprehension. How do we assimilate these data? It is tempting to disregard them as irrelevant – to write off the social world, at least, as not being concerned with academia. This is all very well, but open communication is part of the fuel that drives the open access debate. When everyone and everything appears connected, why should academics be shielded behind the castle walls? Much has been written in the Scholarly Kitchen on OA, and I am not going to address any of the issues for, or against, here. However, it is important to note that one of the drivers of OA is the notion that the more content and ideas are shared, the more likely breakthroughs will materialize; OA is associated with innovation. An interesting venture in this space is API Commons, where developers are encouraged to share their APIs (Application Programming Interfaces) under Creative Commons licenses.
Another fascinating venture, still in its early stages of development, is Thingful, a search engine for the Public Internet of Things, which provides a geographical index of where things are, who owns them, and how and why they are used. If an academic chooses to make their data available to third parties – either directly as a public resource or channeled through apps and analytical tools – Thingful organizes those ‘things’ around locations and categories, and structures ownership around Twitter profiles (which can be either people or organizations), enabling discussion.
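To make the idea concrete, here is a minimal sketch – emphatically not Thingful’s actual implementation – of what an index of ‘things’ organized by category, location, and owner might look like. All names, coordinates, and Twitter handles below are invented for illustration.

```python
from collections import defaultdict

class ThingIndex:
    """A toy geographical index of connected 'things'."""

    def __init__(self):
        self.by_category = defaultdict(list)

    def add(self, name, category, lat, lon, owner):
        # Each thing is filed under a category, with a location and an
        # owner (here, a Twitter-style handle, as Thingful does).
        self.by_category[category].append(
            {"name": name, "lat": lat, "lon": lon, "owner": owner})

    def near(self, category, lat, lon, radius_deg=1.0):
        # Crude bounding-box proximity filter on raw coordinates,
        # purely for illustration.
        return [t for t in self.by_category[category]
                if abs(t["lat"] - lat) <= radius_deg
                and abs(t["lon"] - lon) <= radius_deg]

index = ThingIndex()
index.add("air-quality-sensor-7", "environment", 51.50, -0.12, "@some_lab")
index.add("river-gauge-3", "environment", 48.85, 2.35, "@hydro_team")
print(index.near("environment", 51.51, -0.13))  # finds the London sensor
```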
As Scott Lachut indicated in his talk, there is so much data and yet so little analysis. Perhaps the next step is to organize and visualize the data in an effort to discern patterns and meaning. Two useful tools that have emerged here are OpenRefine and Kinetica. OpenRefine (formerly supported by Google and named Google Refine) is an open source tool for working with publicly available, potentially messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases like Freebase, a community-curated database of people, places, and things.
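One of OpenRefine’s cleaning techniques is “fingerprint” clustering, which groups variant spellings of the same value so they can be merged. The sketch below is a minimal Python imitation of that idea – not OpenRefine’s code – run over an invented list of messy city names.

```python
import re
from collections import defaultdict

def fingerprint(value):
    # OpenRefine-style fingerprint key: lowercase, strip punctuation,
    # split into tokens, deduplicate, and sort, so that variants like
    # "New York", "new york." and "York, New" all share one key.
    tokens = re.sub(r"[^\w\s]", "", value.lower()).split()
    return " ".join(sorted(set(tokens)))

def cluster(values):
    # Group messy strings that share a fingerprint; each cluster is a
    # candidate set of spellings to merge into one canonical value.
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return [g for g in groups.values() if len(g) > 1]

messy = ["New York", "new york.", "York, New", "Boston", "BOSTON", "Chicago"]
for group in cluster(messy):
    print(group)
```

The same keying trick scales to thousands of rows, which is exactly the kind of drudgery OpenRefine takes off a researcher’s hands.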
Kinetica is an app for exploring data on your iPad. Instead of forcing you into a spreadsheet, the tool allows you to see, touch, sift, and play with your data in a physical environment.
I would say that one of the key opportunities for publishers and libraries to work together is data mining: extracting intelligence from the sea of information. Libraries can do this on behalf of their constituents, or users may run the analyses themselves. A number of tools are emerging in this area, allowing researchers to scrape and analyze data more effectively and providing predictive insights from existing data.
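In its simplest form, a “predictive insight” is just a trend fitted to past observations and extrapolated forward. The sketch below illustrates the idea with an ordinary least-squares line over invented monthly download counts for a hypothetical article; real tools are far more sophisticated, but the principle is the same.

```python
def linear_trend(ys):
    # Ordinary least-squares fit of y = a + b*x for x = 0, 1, 2, ...
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Hypothetical monthly download counts for one article.
downloads = [120, 135, 160, 170, 190, 210]
a, b = linear_trend(downloads)
next_month = a + b * len(downloads)  # extrapolate one period ahead
print(f"trend: +{b:.1f} downloads/month, forecast: {next_month:.0f}")
```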
Take a look at Premise. Premise is an economic monitoring platform, enabling users to track global business and economic conditions in real-time across thousands of online and real-world locations. Premise monitors the price, quality, availability, and other metrics of a range of goods and services, from online and on-the-ground sources. With Premise data, a person may observe such things as a change in the price of a product in several countries simultaneously, how much a product is being discounted, and how often a product is out of stock.
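The metrics mentioned above boil down to aggregating a stream of observations. Here is a toy sketch – with entirely invented data, and no claim to resemble Premise’s actual platform – of computing average price and out-of-stock rate per country for one product.

```python
from collections import defaultdict

# Invented stream of (country, product, price, in_stock) observations.
observations = [
    ("BR", "rice-1kg", 2.10, True),
    ("BR", "rice-1kg", 2.30, True),
    ("BR", "rice-1kg", 2.25, False),
    ("IN", "rice-1kg", 1.10, True),
    ("IN", "rice-1kg", 1.05, False),
]

def summarize(obs, product):
    # Accumulate per-country price lists and stock-out counts.
    stats = defaultdict(lambda: {"prices": [], "stockouts": 0})
    for country, prod, price, in_stock in obs:
        if prod != product:
            continue
        stats[country]["prices"].append(price)
        if not in_stock:
            stats[country]["stockouts"] += 1
    # Reduce to the two headline metrics the paragraph describes.
    return {
        c: {"avg_price": sum(s["prices"]) / len(s["prices"]),
            "out_of_stock_rate": s["stockouts"] / len(s["prices"])}
        for c, s in stats.items()
    }

print(summarize(observations, "rice-1kg"))
```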
In the cyber security world there is Recorded Future, whose mission is to organize open source information for analysis. It offers predictive analysis tools for conducting intelligence research, competing in business, or monitoring the horizon for situational awareness.
With many start-ups and large scale players dipping their toes into social sharing and data analysis tools, where does this leave the researcher and their publishing output? What I think would be wonderful is a way to combine tools that provide predictive data, the value of recommendation engines, and a social ecosystem around journal articles and authors – would this be the best of what we know and what we may want, but don’t yet know it?