"We are sorry to inform you that your paper has been rejected, due to the lack of empirical evidence supporting it." It may well be the case that some of us, in the course of our academic lives, have received or will receive (perhaps more than once) a communication similar to the previous sentence. It seems there is a widespread idea that a work only deserves to be qualified as "scientific" if it is supported by "empirical evidence" (from the Greek empeiría, experience). In this column I will present some arguments (and attempt to convince the reader) that this stance is completely insufficient, and I will try to recover a place in our academic lives for a kind of research that is more speculative than experimental in character. Of course, I do not intend to question the legitimacy of experimental research, but rather to argue that a harmony must exist between the two. However, this harmony seems to be particularly menaced in current computer science research. This is a paradoxical situation, since computer science is rooted both in speculative sciences such as mathematics and experimental sciences such as physics.
Indeed, it is very easy to criticize this prevailing, radical empiricism: the idea that "only those propositions that are obtained through experience are scientific, and thus acceptable as true," is not supported itself by any kind of empirical evidence. Therefore, radical empiricism must be rejected as self-contradictory. Besides, the history of computer science provides us with empirical arguments against empiricism and shows us a very different picture, as I will discuss later. In other words, if radical empiricism is preached, it is not due to empirical or experience-based reasons, but because of other kinds of not-so-clear, to-be-discovered motives. However, given the extraordinarily important role that empirical evidence has in science (it is not without reason we speak of the experimental-scientific method), it would be very superficial to remain with such facile criticism, without trying to go deeper into the question.
Learning from experience (formulating general rules on the basis of particular cases) is generally known as induction. Scientific inductivism expressed itself during the 20th century mainly through the philosophical stance known as Verificationism, to which Falsificationism was opposed (I will try to ensure these two are the last -isms mentioned in this column, so that the reader can proceed without having to make marginal notes).
Verificationism upholds an optimistic thesis: induction is possible. That is, it is possible to formulate true general laws on the basis of particular experiences. This optimism provides the foundation for the most widespread attitude among scientists, which leads them to seek the confirmation of their theories in experience. The big problem of induction is to determine whether it truly has a rational foundation, since the mere fact that particular cases are repeated does not warrant the positing of a general law, unless we admit a priori that regularities cannot be accidental: there must be some kind of rationality in the universe that is within reach of the human mind. Sarcastic critics of Verificationism will likely recall the old story told by Bertrand Russell about that "inductive turkey," which after months of repeated experiences (most regular, indeed) came to the firm conclusion that the man who fed it every morning in the farmyard would continue to do so until the end of time, with all his affection...
Falsificationism, by contrast, as set forth mainly in the writings of Karl Popper, considers in a rather pessimistic way that induction is not possible; we cannot aspire to prove the truth of any scientific theory; scientific hypotheses are no more than mere conjectures that are provisionally accepted until a new experience appears to refute them (what Popper calls "falsification"). This stance is informed by a commendable skepticism that has helped to give it credit among scientists, too. But the truth is that, if taken to its ultimate consequences (beyond the point Popper himself would have taken it), Falsificationism becomes absurd: scientists do not devote themselves to formulating and provisionally accepting whatever theory, and then to looking for counterexamples that refute it.
On the contrary, scientists strive to verify hypotheses as much as to refute them, and they only accept hypotheses that are reasonable from the start and that have a huge explanatory power. What this "reasonability" might be, this "explanatory power," or even the "simplicity and elegance" that no doubt have influenced great scientists in the formulation of their hypotheses and theories (consider Galileo, Newton, Einstein...), is an arduous problem for the Philosophy of Science that cannot be addressed here. I only wish to point out that neither Verificationism nor Falsificationism can give a full account of the reality of scientific activity in all its magnitude. And that both, considered as methodological stances, refer to something that is beyond factual experience. Paying attention only to empirical evidence is not acceptable, especially if the consideration of correctness of reasoning is set aside, since, at least, empirical evidence must be adequately interpreted with good reasons. Experimentation without the guide of speculative thinking is worthless.
We have demonstrated that empiricism is insufficient. There cannot be a complete scientific activity that consists solely of proving theories by means of experiments: first, theories must be formulated and developed, and their explanatory power must be demonstrated, so that the investment of human and material resources in the experiments, which may be very costly, can be justified; then, the experiments that will prove or refute the theories must be carried out. Moreover, experimental verification may say something about the truth of a theory, but it can say nothing about its relevance, that is, its interest to the scientific community or society as a whole.
In this respect, we should be careful to distinguish between experimentation of a theory and its practical application: the latter is particularly important in engineering, but developing a practical application does not properly constitute an experimental verification, according to inductive criteria, of the theory that supports it. For example, showing with adequate reasons that a certain design pattern solves a recurrent programming problem demonstrates its applicability without the need of experiments and statistics; the rationale of the pattern, instead, is indispensable. The potential utility of a theory may be enormous, and should be fully acknowledged, but it is not at all an inductive proof, that is, a verification. Conversely, having an empirical validation is not the same as having a practical application.
Having demonstrated that empiricism is insufficient in and of itself, can we at least say it is necessary? That is, should we consider it an essential part of every scientific activity? From the scientific point of view, is a purely speculative-theoretical work acceptable without empirical support? In order to answer this question, I will formulate another one: What do we learn from history? In particular, and to focus on the area of major interest for the readers of this magazine: Who are the founders of computer science?
Consider some fundamental names: Turing (computation theory and programmable automata), von Neumann (computer architecture), Shannon (information theory), Knuth, Hoare, Dijkstra, and Wirth (programming theory and algorithmics), Feigenbaum and McCarthy (artificial intelligence), Codd (relational model of databases), Chen (entity-relationship model), Lamport (distributed systems), Zadeh (fuzzy logic), Meyer (object-oriented programming), Gamma (design patterns), Cerf (Internet), Berners-Lee (WWW)... Are their contributions perhaps distinguished by their experimental character? Aren't they mainly, or even solely, speculative investigations (yet with enormous possibilities for practical application), whose fundamental merit has been to light the way for the rest of the scientific community, by performing, so to speak, a work of clarification and development of concepts? Would they have been able to publish their work according to the "experimentalistic" criteria that currently prevail?
Having a look at the list of Turing Awards [1] or at the most cited computer science papers in CiteSeerX [2] is very instructive. However, given the current standards for reviewing, many of those papers would never have been published. They would have come up against journal reviewers who would have rejected such works, considering them too speculative or theoretical, as has been humorously described in fictitious reviews [4].
The attentive reader will have noticed that I am inductively justifying, from the experience of history, that many of the best works in computer science (the most cited ones, to accept the present identity between "most cited" and "best," which is of course a very debatable one indeed) do not have a fundamentally experimental character, but rather a theoretical and speculative one. Nevertheless, I am afraid the "recalcitrant empiricist" will not let him or herself be convinced even by this argument...because, in the end, his or her conviction is not grounded in empirical arguments.
It may well happen that we are suffering the "swinging pendulum" effect. In the past, computer science was not so focused on experimentalism. But recently the pendulum has swung too far toward this side, and we should push it the other way. Maybe periodic swings are even helpful for science, and we should not try to stop them completely. After all, science tends to be a self-correcting system, because ultimately truth will win out, no matter how painful the process of discovery might be for those of us toiling in the trenches. As the great American philosopher and logician Charles S. Peirce put it, "the essence of truth lies in its resistance to being ignored." [3]
Now then, if their experimental character is not what primarily distinguishes scientific works, what does? In my view, the distinguishing feature of the scientific method is its "public," "social" character. I do not mean by this (far from it) that scientific truth is established by consensus, but that research results must be demonstrable to others. This, after all, is the aim of scientific publications (no matter how much these publications, and especially the number of publications, serve other, less "avowable" purposes). The enemy of the scientific method is not speculative reasoning, but the appeal to some kind of Cartesian-shaped "intuitive evidence," enclosed within the individual, and which is neither communicable nor submitted to the community of researchers; the enemy is the acceptance of ideas because they are "clear and distinct" for me, regardless of whether or not they are "clear and distinct" for others.
Summing up, what the scientist looks for is to follow a way toward knowledge that can be followed by other researchers; the goal is to "convince" the scientific community of the validity of certain research results. Yet there are several possible ways to convince. Must all scientific works be reasoned and demonstrable? Yes, of course. Must they be empirically verifiable? That depends. Not all branches of science are equal; not all kinds of research are equal. Just as it would be absurd to try to axiomatically demonstrate the failure probability law of a microchip as a function of its temperature, it would be equally absurd to require an experimental verification of the axioms of fuzzy logic.
Experience and speculation must go hand in hand in the way of science. Some investigations will have a basically experimental character, while others will be primarily speculative, with a wide gradation between these two extremes. As long as all are demonstrable, we should not consider some to be more worthy of respect than others. If the pendulum has swung too far toward the experimentalistic side of computer science, we should now push it a bit toward the speculative field, so that the whole picture gets corrected. Thus, I would like to call upon researchers who might feel inclined toward speculative matters (and even more upon those in charge of research) neither to close the door nor give up on this kind of scientific activity, which is so essential for the progress of knowledge.
1. Association for Computing Machinery, Turing Awards; http://awards.acm.org/homepage.cfm?awd=140
2. CiteSeerX, Most Cited Computer Science Articles; http://citeseerx.ist.psu.edu/stats/articles
4. Santini, S. We are sorry to inform you. IEEE Computer 38, 12 (Dec. 2005), 126-128; http://portal.acm.org/citation.cfm?id=1106763
The Digital Library is published by the Association for Computing Machinery. Copyright © 2010 ACM, Inc.
The following letter was published as a Letter to the Editor in the November 2010 CACM (http://cacm.acm.org/magazines/2010/11/100636).
In his Viewpoint "Is Computer Science Truly Scientific?" (July 2010), Gonzalo Génova suggested that computer science suffers from "radical empiricism," leading to rejection of research not supported by empirical evidence. We take issue with both his claim and (perhaps ironically) the evidence he used to support it.
Génova rhetorically asked "Must all scientific works be reasoned and demonstrable?," answering emphatically, "Yes, of course," to which we whole-heartedly agree. Broadly, there are two ways to achieve this goal: inference and deduction. Responding to the letter to the editor by Joseph G. Davis "No Straw Man in Empirical Research" (Sept. 2010, p. 7), Génova said theoretical research rests on definition and proof, not on evidence. Nonetheless, he appeared to be conflating inference and deduction in his argument that seminal past research would be unacceptable today. Many of the famous computer scientists he cited to support this assertion (Turing, Shannon, Knuth, Hoare, Dijkstra) worked (and proved their findings) largely in the more-theoretical side of CS. Even a cursory reading of the latest Proceedings of the Symposium on Discrete Algorithms or Proceedings of Foundations of Computer Science turns up many theoretical papers with little or no empirical content. The work of other pioneers Génova cited, including Meyer and Gamma, might have required more empirical evidence if presented today. Génova implied their work would not be accepted, and we would therefore be unable to benefit from it. The fact that they met the requirements of their time but (arguably) not of ours does not mean they would not have risen to the occasion had the bar been set higher. We suspect they would have, and CS would be none the poorer for it.
Génova's suggestion that CS suffers today from "radical empiricism" is an empirical, not deductive, claim that can be investigated through surveys and reviews. Still, he supported it via what he called "inductive justification," which sounds to us like argument by anecdote. Using the same inductive approach, conversations with our colleagues here at the University of California, Davis, especially those in the more theoretical areas of CS, lead us to conclude that today's reviews, though demanding and sometimes disappointing, are not "radically empirical." To the extent a problem exists in the CS review process, it is due to "hypercriticality," as Moshe Y. Vardi said in his "Editor's Letter" (July 2010, p. 5), not "radical empiricism."
Earl Barr and Christian Bird
I'm glad to hear from Barr and Bird that there are healthy subfields in CS in this respect. I used "inductive justification" to support the claim that many classical works in the field are more theoretical and speculative than experimental, not to support an argument that CS suffers today from "radical empiricism." Investigating the latter through exhaustive empirical surveys of reviews would require surveyors being able to classify a reviewer as a "radical empiricist." If my column served this purpose, then I am content with it.
The following letter was published in the Letters to the Editor in the September 2010 CACM (http://cacm.acm.org/magazines/2010/9/98018).
In his Viewpoint "Is Computer Science Truly Scientific?" (July 2010), Gonzalo Génova would have made a stronger case if he used the words "theoretical" or "conceptual" instead of "speculative" to support his argument against the excessively empirical orientation of much of today's CS research. The life cycle of scientific ideas generally progresses from the speculative phase in which many candidate ideas are pursued, with only a few surviving to be presented or published as theoretical contributions, often supported by robust analytical models. Journal editors are unlikely to summarily reject contributions making it to this stage because they provide the conjectures and hypotheses that can be tested through rigorous empirically oriented research.
Génova also set up a straw man when he railed against the excesses of verificationism and empiricism. Who would argue against the proposition that credible scientific advances need good empirical research (experiments, simulation, proof-of-concept prototype construction, and surveys)? Such research needs models and hypotheses that might have begun as speculative conjectures at an earlier point in time.
Naïve empiricism has no place in CS research. Moreover, purely speculative research without adequate analytical foundations is unlikely to help advance CS (or any other) research.
Joseph G. Davis
Davis ("credible scientific advances need good empirical research") and I ("experimentation without the guide of speculative thinking is worthless") fundamentally agree. When I said "speculative thinking," I meant "theoretical contributions supported by robust analytical models," not freely dancing ideas without purpose.
There may also be slight disagreement regarding empirical validation, the excesses of which I criticized. It is clear that theories about physical phenomena require empirical validation; theories about mathematical objects do not. Many areas in CS deal with conceptual or information objects more akin to mathematical objects than to their physical counterparts. Therefore, requiring empirical validation is out of place here.