Communications of the ACM, February 2016 (Vol. 59, No. 2)

## The Moral Hazard of Complexity-Theoretic Assumptions

In November 2015, the computing world was abuzz with the news that László Babai had proved that the Graph-Isomorphism Problem, that is, the long-standing open problem of checking whether two given graphs are isomorphic, can be solved in quasi-polynomial time. (While polynomial means *n* raised to a fixed degree, quasi-polynomial means *n* raised to a poly-logarithmic degree.) The mainstream media noticed this news; *Chicago Magazine* wrote "Computer Scientists and Mathematicians Are Stunned by This Chicago Professor's New Proof."
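
To make the parenthetical concrete, the two growth rates can be written as follows. This is a standard formulation rather than text from the editorial; $c$ denotes an arbitrary fixed constant and logarithms are taken base 2.

$$
\text{polynomial: } n^{c}, \qquad \text{quasi-polynomial: } n^{(\log n)^{c}} = 2^{(\log n)^{c+1}}
$$

For instance, $n^{\log n}$ is quasi-polynomial: it eventually exceeds every fixed power $n^{c}$, yet grows far more slowly than an exponential such as $2^{n}$.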

If Babai's result holds under scrutiny, it is likely to become one of the most celebrated results in theoretical computer science of the past several decades. Graph isomorphism has long tantalized researchers as it was not known to be solvable in polynomial time, yet there are good reasons to believe it is not NP-complete. The new algorithm provides further evidence of that!

Given that the article mentions a paper that I co-authored ("Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)," STOC'15), I believe it is appropriate for me to respond. In short, I believe that much of the analysis in the article stems from a confusion between press coverage and the actual scientific inquiry. Indeed, it is unclear whether Dr. Vardi is addressing what he believes to be a "media" phenomenon (how journalists describe scientific developments to a broad public) or a "scientific" phenomenon (how and why scientists make and use assumptions in their research and describe them in their papers). To avoid conflating these two issues, I will address them one by one, focusing on our paper as an example.

1) Media aspects: The bulk of the editorial is focused on some of the press coverage describing recent scientific developments in algorithms and complexity. In particular, Dr. Vardi mentions the title of a Boston Globe article covering our work ("For 40 Years, Computer Scientists Looked for a Solution that Doesn't Exist"). As I have already discussed elsewhere (https://liorpachter.wordpress.com/2015/08/14/in-biology-n-does-not-go-to-infinity/#comment-4792), I completely agree that the title and some other parts of the article leave a lot to be desired. Among other things, the conditional aspects of the result are discussed only at the end of the article, and are therefore easy to miss.

At the same time, I believe some perspective is needed. Inaccuracy or confusion in popular reporting of scientific results is an unfortunate but common and longstanding phenomenon (see, e.g., this account of press coverage of Khachiyan's famous linear programming algorithm in the 1970s: https://lpsdp.files.wordpress.com/2011/10/ellipsoid-stories.pdf). There are many reasons for this. Perhaps the chief one is the cultural gap between the press and scientists: journalists emphasize accessibility and newsworthiness, while scientists emphasize precision. As a result, simplification in scientific reporting is a necessity, and the risk of oversimplification, inaccuracy, or incorrectness is high. Fortunately, more time and effort spent on both sides can lead to more thorough and nuanced articles (e.g., see https://www.quantamagazine.org/20150929-edit-distance-computational-complexity/). Given that the coverage of algorithms and complexity results in the popular press is growing, I believe that, in time, both scientists and journalists will gain valuable experience in this process.

2) Scientific aspects: Dr. Vardi also raises some scientific points. In particular:

a) Dr. Vardi is critical of the title of our paper: "Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)." I can only say that, given that we state the assumption explicitly in the title, in the abstract, in the introduction, and in the main body of the paper, I believe the title and the paper accurately represent its contribution.

b) Dr. Vardi is critical of the validity of SETH as a hardness assumption: this question is indeed the subject of robust discussion and investigation (see, e.g., the aforementioned Quanta article). My best guess, and that of most of the people I have discussed this with, is that the assumption is true. However, this is far from a universal opinion. Quantifying the level of support for this conjecture would be an interesting project, perhaps along the lines of similar efforts concerning the P vs. NP conjecture (https://www.cs.umd.edu/~gasarch/papers/poll2012.pdf). In any case, it is crucial to strengthen the framework by relying on weaker assumptions, or by replacing one-way reductions with equivalences; both are subjects of ongoing research. However, even the existing developments have already led to concrete benefits. For example, failed attempts to prove conditional hardness of certain problems have led to better algorithms for those tasks.
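
For readers unfamiliar with the assumption, SETH and the conditional statement in the paper's title are usually formalized roughly as follows. This is a sketch of the standard formulation rather than a quotation from the paper; the symbols $\varepsilon$, $\delta$, $k$, and $n$ are introduced here for illustration only.

$$
\textbf{SETH: } \forall \varepsilon > 0 \;\; \exists k \text{ such that } k\text{-SAT on } n \text{ variables cannot be solved in time } O\!\left(2^{(1-\varepsilon) n}\right)
$$

$$
\text{Edit distance computable in time } O\!\left(n^{2-\delta}\right) \text{ for some constant } \delta > 0 \;\Longrightarrow\; \text{SETH is false}
$$

The second line is a one-way implication, which is why replacing such reductions with equivalences would strengthen the framework.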

Finally, let me point out that one of the key motivations for this line of research is *precisely* the strengthening of the relationship between theory and practice in complexity and algorithms, a goal that Dr. Vardi refers to as an important challenge. Specifically, this research provides a framework for establishing evidence that certain computational questions cannot be solved within *concrete* (e.g., sub-quadratic) polynomial time bounds. In general, I believe that a careful examination of the developments in algorithms and complexity over the last decade would show that the gap between theory and practice is shrinking, not growing. But that is a topic for another discussion.
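
As a point of reference for the "sub-quadratic" bound mentioned above, the textbook dynamic program for edit distance runs in time proportional to the product of the two string lengths. The sketch below is the classical algorithm, not code from the paper; the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Classical dynamic program: O(len(a) * len(b)) time, O(len(b)) space."""
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]
```

The conditional lower bound says that, unless SETH is false, no algorithm improves on this quadratic running time by a polynomial factor for general inputs.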

Note: this comment will be crossposted at http://blog.geomblog.org/. Thanks to Suresh Venkatasubramanian for his hospitality.

1. It is easy to blame the media for overstating scientific results, but it is not completely fair. Journalists do get their information from the scientific community. For example, while the title of the article in question, "Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)," is technically accurate, it sends a very different message than the hypothetical title, "SETH Implies a Strongly Quadratic Lower Bound for the Edit-Distance Problem".

2. I do not see the gap between theory and practice shrinking, but growing. See, for example, "Boolean Satisfiability: Theory and Engineering" (http://cacm.acm.org/magazines/2014/3/172516-boolean-satisfiability/fulltext). Also, a quadratic lower bound for the Edit-Distance Problem does not imply quadratic behavior on typical instances of the Edit-Distance Problem.
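
One well-known illustration of the last point is the banded variant of the edit-distance dynamic program: when the two strings are similar, it only needs to fill a narrow diagonal band of the table. The sketch below is a textbook technique, not something described in this thread; the names `bounded_edit_distance` and `edit_distance_doubling` are hypothetical.

```python
def bounded_edit_distance(a, b, k):
    """Return the edit distance if it is at most k, else None.

    Only cells with |i - j| <= k are filled, so the cost is O(k * len(a)).
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None
    inf = k + 1  # any value above k means "too far"
    prev = {j: j for j in range(min(m, k) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - k), min(m, i + k) + 1):
            if j == 0:
                best = i
            else:
                best = min(prev.get(j - 1, inf) + (a[i - 1] != b[j - 1]),
                           cur.get(j - 1, inf) + 1)
            best = min(best, prev.get(j, inf) + 1)
            cur[j] = min(best, inf)
        prev = cur
    d = prev.get(m, inf)
    return d if d <= k else None


def edit_distance_doubling(a, b):
    """Try bands of width 1, 2, 4, ... until the distance fits in the band."""
    k = 1
    while True:
        d = bounded_edit_distance(a, b, k)
        if d is not None:
            return d
        k *= 2
```

On inputs whose distance is small relative to their length, the doubling loop stops early and the total work stays well below quadratic; in the worst case it is still quadratic, consistent with the conditional lower bound.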

> It is easy to blame the media for overstating scientific results but it is not completely fair. Journalists do get their information from the scientific community.

It is easy to speculate about the flow of information without having much understanding of how the interaction actually unfolded. From a careful reading of the article it should be rather clear that the following points were conveyed: (i) the result is conditional, (ii) it is based on a conjecture that is stronger than P ≠ NP, and (iii) the support for that conjecture is not universal. As I see it, the issues with the article simply show that more time and effort on both sides were needed to convey those points satisfactorily, given the complexity of the topic. It worked better the second time.

> while the title of the article in question, "Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)," is technically accurate, it sends a very different message than the hypothetical title, "SETH Implies a Strongly Quadratic Lower Bound for the Edit-Distance Problem".

Given that the suggested change does not affect points (i)-(iii) above, I doubt it would make any difference. Of course we can all speculate about "what if" scenarios.

> I do not see the gap between theory and practice shrinking, but growing. See, for example, "Boolean Satisfiability: Theory and Engineering"...

I am afraid I disagree with this sentiment. For example, in the context of combinatorial search problems, there is a growing body of work characterizing instances of NP-hard problems that can be solved in sub-exponential time. This includes various forms of average-case analysis, as well as fixed-parameter tractability (FPT), where the goal is to identify input parameters that make it possible to design algorithms with runtimes that are exponential only in those parameters (I know Dr. Vardi is familiar with the area, but I am including this description for completeness). There are too many papers to list here, but browsing recent STOC/FOCS/SODA/ICALP proceedings can provide many examples. Similar results are known for edit distance and its relatives. Incidentally, conjectures about the exponential hardness of CNF-SAT have been *instrumental* in these developments, making it possible to identify the "target" running times of the parameterized algorithms. This demonstrates that the conjectured exponential hardness of SAT for general inputs and the existence of empirically faster algorithms for specific domains are not contradictory, but in fact complement each other.
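
To illustrate what "exponential only in those parameters" means, here is a minimal sketch of a classic FPT algorithm. It uses vertex cover rather than SAT, purely as a stand-in textbook example that is not discussed in this thread; the function name `vertex_cover_at_most_k` is hypothetical.

```python
def vertex_cover_at_most_k(edges, k):
    """Return a vertex cover of size at most k as a set, or None if none exists.

    Bounded search tree: some endpoint of any uncovered edge must be in the
    cover, so branch on both endpoints. The recursion depth is at most k,
    giving roughly O(2**k * len(edges)) time: exponential in the parameter k
    only, not in the size of the graph.
    """
    if not edges:
        return set()
    if k == 0:
        return None  # edges remain but no budget is left
    u, v = edges[0]
    for pick in (u, v):
        remaining = [e for e in edges if pick not in e]
        sub = vertex_cover_at_most_k(remaining, k - 1)
        if sub is not None:
            return sub | {pick}
    return None
```

For example, `vertex_cover_at_most_k([(1, 2), (2, 3)], 1)` finds the cover `{2}`, while a triangle has no cover of size 1.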

In any case, this is just one example. Other areas include massive data analysis, machine learning, cloud computing, etc. The list goes on.

There is indeed a fair amount of recent work aimed at bridging theory and practice, say, understanding the good performance of SAT solvers on real-world instances, but this work is not being published at standard theory conferences, I am afraid. Good work in this direction requires a combination of theory and experiment, but experimental work is not in the DNA of theory conferences. AI conferences, for example, would be a better place to look.

I would think things are not as bad. Here are some recent references that I am aware of. They are theoretical in nature, but seem relevant. There are probably others.

Fedor V. Fomin, Daniel Lokshtanov, Neeldhara Misra, M. S. Ramanujan, Saket Saurabh, Solving d-SAT via Backdoors to Small Treewidth. SODA 2015: 630-641.

S. Gaspers and S. Szeider, Strong backdoors to bounded treewidth SAT, in 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, 2013, pp. 489-498.

S. Gaspers and S. Szeider, Backdoors to acyclic SAT, in Proceedings of the 39th International Colloquium on Automata, Languages, and Programming (ICALP), vol. 7391 of Lecture Notes in Computer Science, Springer, 2012, pp. 363-374.

In any case, I am not claiming that there is no gap between theory and practice on this front, or that narrowing this gap is not a major research challenge (in fact, I do believe it *is* a major and important challenge). I just don't think that, overall, the gap between theory and practice is growing.

The edit-distance result of Arturs Backurs and Piotr Indyk also has, in my view, considerable practical importance (for the development of algorithms). On the theoretical side, the result and method do not give much hope for unconditionally proving a quadratic lower bound for edit distance (although people should perhaps try, or find reasons why this would be implausible; see also the remark at the end), and there could be different views on whether to believe SETH, as it is so much stronger than P ≠ NP. But the result *does* say that it is implausible in practice that a sub-quadratic algorithm for (general) edit distance will be found. (An algorithm disproving SETH is nowhere in sight, and even if one is optimistic about it, this would be a very peculiar way to disprove SETH.)

Of course, practice also comes in several layers, and you may ask about cases of practical importance, about not-too-huge strings, etc.

Here is "the remark at the end": A few weeks ago, Irit Dinur, Moshe Vardi and I discussed this a little in Jerusalem, and one question that came up was whether there is a reduction in the other direction, between sub-quadratic edit distance and SETH/SAT-like questions.

I am delighted to see that my editorial has led to a vigorous exchange. See also https://rjlipton.wordpress.com/2016/02/16/waves-hazards-and-guesses/ by Lipton and Regan, http://blog.computationalcomplexity.org/2016/02/the-moral-hazard-of-avoiding-complexity.html by Fortnow, and https://www.reddit.com/r/compsci/comments/42r53o/the_moral_hazard_of_complexitytheoretic/.
