Hubble telescope image known as Pillars of Creation, where stars are forming in the Eagle Nebula. (Photo credit: Wikipedia)

Web analytics, it seems, provide as many questions as answers. Email open rates are susceptible to error because tracking pixels often fail to load. Session definitions are critical. Domain and cross-domain traffic monitoring adds complexity. Despite all this, measuring interactions from the social Web seemed the most straightforward: referrals from Twitter, Facebook, and other social players were all pretty well identified and easy to categorize. Yet we have a lot of deep traffic our analytics engines categorize as “direct,” meaning there was no clear referrer.


So the question emerged: Where is all that “direct” traffic coming from?


Late last week, Alexis Madrigal posted a fascinating set of ideas on the Atlantic, speculating that much of the direct traffic is coming from social media we haven’t categorized as “social” but which is social — email shares, bookmark shares, forums, or local hub sites (like a library landing page). Another source of this unknown traffic can be people moving from a secure (https://) site to a non-secure (http://) site.


Madrigal began thinking that all this “direct” traffic couldn’t actually be coming from people dutifully typing in long URLs — it had to be social in some way. But that idea had major implications:


This means that this vast trove of social traffic is essentially invisible to most analytics programs. I call it DARK SOCIAL. It shows up variously in programs as “direct” or “typed/bookmarked” traffic, which implies to many site owners that you actually have a bookmark or typed in into your browser. But that’s not actually what’s happening a lot of the time. Most of the time, someone Gchatted someone a link, or it came in on a big email distribution list, or your dad sent it to you.

Madrigal went around with this idea for a while before the analytics company the Atlantic uses — Chartbeat — made an accounting change in their analytics, creating something they called “direct social.” Traffic in this category goes deep into the site without a referrer. Their hypothesis is that somebody got this deep into the site directly by clicking on a link somewhere, not by typing it in, therefore demonstrating unmeasured (or, to use Madrigal’s term, “dark”) social traffic.
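
To make the rule concrete, here is a minimal Python sketch of that kind of classification. It is not Chartbeat's actual implementation: the function name, the category labels, the short list of social hosts, and the definition of "deep" (any path other than the homepage) are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately short list of known social referrers.
SOCIAL_HOSTS = {"twitter.com", "t.co", "facebook.com"}


def classify_visit(landing_url, referrer):
    """Label a visit as 'social', 'referral', 'direct social', or 'direct'.

    Assumption: a page is "deep" when its path is anything other than
    the site root. Real analytics tools may define this differently.
    """
    path = urlparse(landing_url).path.strip("/")
    is_deep = path != ""

    if referrer:
        host = urlparse(referrer).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        return "social" if host in SOCIAL_HOSTS else "referral"

    # No referrer at all: nobody is likely to have typed a long article
    # URL by hand, so a deep landing page is treated as a shared link.
    return "direct social" if is_deep else "direct"


if __name__ == "__main__":
    print(classify_visit("http://example.com/tech/2012/10/some-article/", None))
    print(classify_visit("http://example.com/", None))
```

Under these assumptions, a referrer-less visit to an article page is counted as "direct social," while a referrer-less visit to the homepage stays "direct."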


When the Atlantic’s traffic was put into this model, their “dark social” or “direct social” component proved very significant — in fact, it dwarfed the social traffic from any other source:



When the Atlantic saw nearly 57% of its social traffic coming from dark social, Madrigal asked Chartbeat to run the numbers against an aggregation of media sites. They found that 69% of the social traffic to the entire set came from so-called dark social or direct social sources.


This makes a lot of sense. Categorizing deep traffic without a referral source as “direct social” jibes well with experience and common sense. After all, how many times do you cut and paste a link into an email, send it to one or more people, or click on a similar link sent to you? With email an unreliably counted technology for direct sends and a huge backdrop technology among users who like to send links — in addition to the dribs and drabs of other possible sources for dark social traffic — my main feeling is that this is something we should have cottoned to earlier.


Madrigal extends the realization about dark social into a nice new perspective on the social Web: while social media companies structured the social Web, they did not invent it. In fact, there’s a lot of old-fashioned social Web still being used, and used heavily. What social media companies are providing, therefore, is not the ability to share but the ability to document certain shares and shortcut the sharing process for certain things:


If what I’m saying is true, then the tradeoffs we make on social networks [are] not the one[s] that we’re told we’re making. We’re not giving our personal data in exchange for the ability to share links with friends. Massive numbers of people — a larger set than exists on any social network — already do that outside the social networks. Rather, we’re exchanging our personal data in exchange for the ability to publish and archive a record of our sharing.

One other aspect of this story I like — the fact that new thinking, not new data, made this come to light. Too many times, we think more is better — more data, more articles, more, more, more. In fact, theory and thinking are undervalued in the world of analytics. You don’t need more data — you need a story to the data you have.


I think analytics vendors and staff will be spending the next few weeks integrating this idea into their reports. In the meantime, please email the link to his post to them. I want a little dark social in my life this week, too.


Kent Anderson

Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.


3 Thoughts on "Dark Social — A New Concept in Analytics That Explains Much of What We (Don't) See"

Thank you for the insights. I look at Dark Social, as it is called in this article, as a “donut hole”, where we are missing data. However, I do think that in some sense it is shrinking. Here is how:

1. Custom URLs. Instead of using entire unabridged URL strings, it is much more common now to shorten them with services like “BIT.LY” and “TinyURL”. The benefit to the publisher of the web address is keeping the link size small. The benefit to the owner of the website is to better understand where the traffic is coming from.

2. Attitudinal Analytics. I have found that respondents to surveys (Attitudinal Analytics, we use ForeSee) tend to be transparent about where they came from. Simply asking a representative sample provides a usable data set.

Also, a comment regarding the following paragraph that appears in the article:

One other aspect of this story I like — the fact that new thinking, not new data, made this come to light. Too many times, we think more is better — more data, more articles, more, more, more. In fact, theory and thinking are undervalued in the world of analytics. You don’t need more data — you need a story to the data you have.

To a large extent, I agree. However, sometimes we are challenged with connecting dots that are not clearly laid out with a limited data set that might support/refute a hypothesis directionally, but is still “fuzzy”. It is not always intuitively obvious what to do with the data. Perhaps we need more data and better data (not either/or)?

This is a very interesting way of looking at direct traffic. We recently launched a new site and have been encouraging local companies to use it via email, so a lot of this direct traffic is really our emails. I wanted to use a transactional email service such as Mandrill to tag the messages.

I’m going to look into making some custom Google reports for the dark/direct social.

I think the percentage of deep link direct traffic that can be regarded as social will vary by site type and perhaps over time. Maybe social media like Facebook can give an idea of these trends. For instance, I love the Firefox Awesome Bar and often use it to revisit sites in my history. This will count as a direct visit and is usually a deep link too, but not a social one. But if we know that on our site sharing via Facebook, Twitter, etc. peaks for 5 days after a post is made, then two weeks later has nearly stopped, can we apply that same rule to direct social? We know it’s not all social, but could we assume that in the first 5 days 90% of deep direct traffic to a story is social, but after two weeks only 30% is? By then we assume that much of the sharing is over and people are using bookmarks, browser history, etc. I know it’s not totally accurate, but it’ll probably give a more realistic figure than counting every deep direct visit as social. This would vary for every site and industry.
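
The decay rule suggested in the comment above can be sketched in a few lines of Python. The 90% and 30% figures and the 5-day and two-week marks come from the comment. The straight-line interpolation between them, the function names, and the sample visit counts are assumptions added for illustration, not measured values.

```python
def social_share_of_direct(days_since_post, early_days=5, late_days=14,
                           early_share=0.9, late_share=0.3):
    """Estimated fraction of deep direct visits that are really social.

    Assumption: the share falls in a straight line between the early
    and late marks; the comment only gives the two endpoints.
    """
    if days_since_post <= early_days:
        return early_share
    if days_since_post >= late_days:
        return late_share
    frac = (days_since_post - early_days) / (late_days - early_days)
    return early_share + frac * (late_share - early_share)


def estimate_dark_social(deep_direct_visits_by_day):
    """Sum the estimated social visits over a {day: visits} mapping."""
    return sum(
        visits * social_share_of_direct(day)
        for day, visits in deep_direct_visits_by_day.items()
    )


if __name__ == "__main__":
    # Hypothetical counts of deep, referrer-less visits to one story.
    visits = {1: 400, 3: 250, 10: 60, 20: 40}
    print(round(estimate_dark_social(visits)))
```

As the comment notes, the thresholds would need tuning for every site and industry, so the default parameters here are only a starting point.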

Most of my click-throughs to the Web come from email, since I am on several listservs that predate social media. Listservs are a major form of scientific communication. In fact, I usually access the Kitchen via an email alert to a comment, not via my bookmark. But then I use neither Facebook nor Twitter, having found them not useful.
