Should Computer Scientists Change How They Publish?

One of the most popular panels at the 2012 edition of Snowbird was "Publication Models in Computing Research: Is a Change Needed? Are We Ready for a Change?" A lively audience of more than 100 computer scientists occupied the medium-size ballroom in which the four-person panel presented, and the audience’s comments at the end of the talk included a positive sharing of suggestions, ideas, and personal experiences. The topic clearly resonated with the audience, and it was common for attendees to be lined up by the microphone, waiting for their opportunity to speak.

Chaired by Moshe Y. Vardi, the panel featured Carlo Ghezzi, chair of software engineering in the department of electronics and information at Politecnico di Milano; Jonathan Grudin, principal researcher, Natural Interaction Group, Microsoft Research; M. Tamer Ozsu, professor of computer science at the University of Waterloo; and Fred B. Schneider, Samuel B. Eckert Professor of Computer Science at Cornell University.

Carlo Ghezzi, Politecnico di Milano
The first panelist, Carlo Ghezzi, spoke on "Towards a Richer Notion of Publication." He noted that papers and their recognition through citations are proxies for research products and the quality of the research. "This model is being challenged," said Ghezzi. "We need to develop a richer model of it."

Research products are enablers of impact, he noted, with impact being how other researchers and practitioners build on the research products. "We have been pushing the idea that computer science research products are not only journal papers, and we need to continue to do it," he urged. "In many areas of computer science, papers in isolation are not enough to build research on and make progress.

"Other research products," he said, "include software tools, prototypes, models, datasets, and benchmarks. Research products should refer to all of these."

Ghezzi also said computer scientists need to change their idea of impact. "Impact is usually noted as citations. For computer science, we need a richer notion of impact that refers to a richer notion of research products such as code downloads, licenses, and the number of people who use a tool. We should evaluate the quality of the whole bundle." He also recommended the paper "Research Evaluation in CS," which addresses the issue of evaluating the impact of research, and was published in the April 2009 issue of Communications of the ACM.

"We shouldn’t refer to only papers," he said, "but the entire ecosystem." If this richer idea of publication is implemented, Ghezzi suggested that we will reward authors who create useful artifacts, obtain results that are reusable and reproducible, and avoid unsupported claims.

Likewise, Ghezzi urged that papers which include claims about a tool or data should provide access to the tool or the data so they can be assessed, and papers that provide such access and pass an outside assessment could be given a higher status, such as "a gold category."

This is not such a bold idea, Ghezzi mentioned. Since 2008, SIGMOD has been conducting an "experimental repeatability" effort to ensure that SIGMOD papers stand as reliable, reference-able works for future research. Similarly, ESEC/FSE had an "artifact evaluation" committee in 2011, and FSE will continue to do this in 2012, he said.

"The community needs to implement this richer notion of publication and reach agreement," Ghezzi concluded, if it wants to progress "from problems to solutions."

Jonathan Grudin, Microsoft Research
Jonathan Grudin gave a colorful PowerPoint-rich talk titled "The Merging of Conferences and Journals," and made the interesting observation that in terms of publication, conferences and journals are migrating to the same niche. "What do ecologists say about this?" he asked. "Competitors cannot exist in the same niche." However, if the two competitors can mate, you can create something entirely new, he said.

In terms of their respective niches, Grudin noted that journals once focused on quality publications and built research reputations while conferences provided networking opportunities and built a research community.

In terms of publication, journals and conferences have increasingly become similar and in some cases are merging, he said.

With computer science conferences and journals both focusing on quality, Grudin pointed out that community-building has suffered and special interest groups (SIGs) have steadily declined in membership. In 1990, 10 ACM SIGs had 3,000 or more members; today only two do, he said.

In describing efforts to mate conferences and journals, Grudin noted the following: Some selective conferences become a journal issue; the authors of journal papers are invited to present at some conferences; some conferences include a revision cycle in the review process; and some conferences only include accepted journal articles. Grudin suggested that community-building requires greater inclusivity and might in the future involve an online virtual presence.

M. Tamer Ozsu, University of Waterloo
M. Tamer Ozsu’s talk on open access models was titled "To Open or Not to Open." Ozsu began by providing an overview of four different models, which he called Green Open Access (the author self-archives), Gold Open Access (publications are open at the publisher’s site), Hybrid Open Access Journals (the author or institution pays in part or whole for publication), and Author Pays (the only option is the author or institution pays for the cost of publication).

Each model, Ozsu noted, has different pros and cons "in terms of access, quality, publication being sustained, and price gouging or predatory publishing."

After providing a historical snapshot of copyright law, Ozsu discussed what authors want (visibility, easy access, impact, protection of intellectual property, and a healthy publishing environment), what readers want (easy and affordable access, trust in accessed material, sustainability of a research archive, and added value in terms of reference and citation links) and what society wants (ease of access and the use of public resources).

What’s a reasonable open access model? Ozsu dismissed Gold Open Access and Author Pays due to the lack of "a reasonable business model." He also said Author Pays is not scalable. Likewise, he noted that the push for the Gold Open Access and Author Pays models stems largely from North America and Europe, where they are supported by available research funding.

Ozsu concluded the best model is what he called Fair Access, with professional societies being the preferred publisher, in which production costs and subscription rates are kept low, and liberal access rights are supported. Ozsu also suggested that publication models move toward a license system that replaces the copyright system, with the author holding the copyright but granting a license with certain exclusive rights to the publisher.

Fred B. Schneider, Cornell University
The last panelist, Fred B. Schneider, spoke about conferences versus journals as a publication venue, and advocated conducting a meta-discussion about the connection between the publication model and the resulting science as the only way to make progress on the debate about expectations for scientific publication in CS. "We need to understand the connection between the publication model and the nature of the resulting science and the structure and costs to the CS community," he said.

Schneider noted that what we study in computer science has changed over the history of the discipline, so our publication model exists in a changing context. Among the visible changes in publishing, he cited the ratio of conference to journal publications, the number of conferences, and the number of published papers expected at key career transitions, such as getting a Ph.D. or a promotion.

Schneider mentioned that most papers are now subject to the length restrictions of conference proceedings and questioned whether this has "resulted in incremental and not revolutionary work." He also lamented that shorter papers often do not include important details, such as performance data and proofs.

He discussed how the number of published papers expected of a Ph.D. has increased and suggested "we might want to rethink how many papers are necessary." Another question he said the field needs to discuss is "What is the meaning of a Ph.D. in our field?" And he wondered whether some "Ph.D.s are capable of independently undertaking a significant piece of research or are they just producing a staple gun thesis by working on a large set of smaller problems?"

In his conclusion, Schneider argued that we need more facts about computer science’s publication culture, not anecdotes, and urged the computer science community "to conduct a full-fledged study on the evolution of our literature and its impact on the nature of science and on our community practices and structure."

The Audience Responds
After Schneider’s talk, the audience was encouraged to ask questions and offer comments. One attendee advocated that the community adopt a gold model of open access that used a Creative Commons-type license. Another said conference journals are increasing the quality of their reviews and published papers, and advocated the partial open access model with the publisher being a professional association. A third person reminded the audience that talks and posters are also ways to present research and network at conferences. Another attendee said the main problem is that too many researchers are concerned with tenure, being promoted, or becoming rich and famous. (In response, one panelist suggested that professors who are up for tenure should be asked to present just five papers, but they must be the professor’s five worst papers.) Juan E. Gilbert spoke of his positive publishing experiences with AERA, and blogged about them for Communications. One attendee said the National Institutes of Health (NIH) doesn’t count papers in conference journals, and asked whether NIH should change its policy. Another person advocated that the CS community conduct experiments with the different publishing models to discover which ones work best. Lastly, an attendee lamented the lack of reproducible research results in many CS papers.

At the conclusion of the panel, chair Moshe Y. Vardi conducted an informal (and admittedly unscientific) poll of the audience. His first question was "Do publication models need to change?" With a show of hands, the majority of the audience indicated that it thought a change is needed. Vardi next asked, "Is the community ready for such a change?" This time, though, the audience’s response was, as Vardi later described it, "weakly positive."

Jack Rosenberger is senior editor, news, of Communications.