Chris Anderson (no relation) recently published a fascinating post about how the vast amounts of data and processing capacity now available are putting pressure on the scientific method as generally practiced today. He suggests that the model of discovery the scientific method presupposes might soon shift.
The scientific method of hypothesize, model, test, Anderson argues, is a product of a limited data environment. Now, in an age in which petabytes of data are emerging in a vast computing cloud, the sequence might flip to test, model, hypothesize. Or, even more dramatically, the idea that correlation can equate to causation might finally find credence. In vast data sets, correlation, established through sufficiently robust mathematics on sufficiently robust data, might come close enough to approximating causation that science can springboard from it.
Anderson draws parallels between this approach, in which raw mathematical practices are used against data to draw out advantage, and Google’s advertising approach. As he writes:
Google conquered the advertising world with nothing more than applied mathematics. It didn’t pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right.
Anderson’s writing is a little sensationalistic, but it zips right along.
Of course, the tension he hits upon has been within science since the beginning. Essentially, Anderson is pointing at the difference between inductive and deductive reasoning. Inductive reasoning depends upon analyzing a lot of experimental findings, then finding theories that explain the empirical patterns. Deductive reasoning begins with elegant principles and postulates, then deduces the consequences from them. As Walter Isaacson states in his biography of Einstein, “All scientists blend both approaches to differing degrees. Einstein had a good feel for experimental findings, and he used his knowledge to find certain fixed points upon which he could construct a theory.”
As Einstein himself said,
The simplest picture one can form about the creation of an empirical science is along the lines of an inductive method. Individual facts are selected and grouped together so that the laws that connect them become apparent. However, the big advances in scientific knowledge originated in this way only to a small degree. . . . The truly great advances in our understanding of nature originated in a way almost diametrically opposed to induction.
Anderson’s probably correct that computational induction at a scale hitherto unattainable will allow scientists to reliably discover and extrapolate from vast data sets. Quantum physics has already done so, and biology should be next. But I detect sensationalism in Anderson’s article where it hints that the two approaches cancel one another out. As the history of science shows, they have always coexisted. We need more science. It’s not a zero-sum game.
I would recommend reading his entire post. It’s well written, compelling, and very likely partially right. I would also recommend reading “Einstein: His Life and Universe.” You will enjoy it.
At least, that’s my hypothesis.