
Communications of the ACM

Contributed articles

Confronting the Myth of Rapid Obsolescence in Computing Research



Illustration by Gary Neill

Computing technologies are changing everyone's social, political, economic, and cultural worlds.12 Meanwhile, scientists commonly believe that research in computing is advancing more quickly, and becoming obsolete more quickly, than research in other scientific disciplines. A notable indicator of obsolescence is how quickly research stops being cited in the literature. Common measures of this phenomenon are "cited half-life," "citing half-life," and the Price Index (see the sidebar "Definitions and Measures of Obsolescence"). These measures show that research in computing does not cease being cited any more quickly than research in other disciplines, indicating (contrary to popular belief) that research in computing does not become obsolete more quickly than research in other disciplines. The extent to which this is the case is important for several reasons:

Demand for innovation. Though computing has made great strides, society continues to demand more complex, reliable, robust, usable hardware and software systems. The advances in computing technology needed to meet this demand depend on long-term funding of fundamental research.11 However, it can be difficult to convince funding bodies to support long-term fundamental research programs in computing. One reason may be the already quick pace of development of computing applications, perhaps suggesting that the research is not as difficult as in other disciplines and that progress can be made with less funding. Hence, as has been reported in the context of U.S. National Science Foundation research-funding policy, when competing for research money, computer scientists argue both that society has a compelling need for the results of their research and that CS, as a basic research discipline, must maintain its standing within the scientific community.19 When competing for funding with researchers from other sciences in a university setting, CS researchers must counter the argument that research funding in computing should not be prioritized because everything useful is already being done both faster and better by the IT industry anyway.

Aging research. Though relevant, the CS literature may still be considered obsolete and thereby ignored due to its age. As a researcher and journal editor, I find that reviewers frequently mention "old references," and, as a supervisor, I find Ph.D. students are often reluctant to read older literature.

Publication delay. Researchers in computing sometimes claim the relatively long lag between submission and publication of a journal article renders the research outdated before publication, arguing for submitting their manuscripts to conferences rather than to journals.

Library ROI. Due to the ever-increasing volume of research literature, libraries must make cost-effective decisions, identifying the core journals within a discipline, canceling their subscriptions to less-accessed journals, and archiving less-accessed material to save shelf space. To maximize return on their investment, libraries must collect statistics on the use of their materials.9 Research literature on computing being accessed less often or quickly becoming obsolete may affect decisions about the archiving and retention of computing journals.


Results

Table 1 places CS among the various disciplines with respect to average aggregated cited half-life, citing half-life, and Price Index. This result is in striking contrast to the only other work I found on obsolescence of the computing literature, by Cunningham and Bocock,3 which found a citing half-life of four years (I found 7.5), concluding that their study supported "...a commonly held belief about computer science, that it is a rapidly changing field with a relatively high obsolescence rate for its documents. This hypothesis is confirmed for the field of computer operating systems and network management..." They also reported a half-life of five years for the field of "Information Systems." The main reason for the discrepancy between their results and mine is likely that they based their analysis on a small sample: only two journals (one of which no longer exists) and four issues of the proceedings of one conference, the International Conference on Information Systems. By contrast, ISI Journal Citation Report (JCR) provided me with values for 382 computing journals.

The extent to which the cited and citing half-life measures are equivalent or complementary has been covered in the literature.17 For individual journals, the values may be quite different (such as the respective values >10 and 5.5 for Communications). Nevertheless, as in Table 1, there is strong correlation among the three measures of obsolescence at the level of overall disciplines: r(cited, citing) = 0.90, r(cited, Price Index) = -0.83, and r(citing, Price Index) = -0.89. Literature with a long lifetime has high cited and citing half-life values but a low Price Index, and vice versa, giving negative correlations between the cited/citing half-lives and the Price Index.
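The inverse relationship between half-life and Price Index can be illustrated with a minimal Python sketch. It assumes the common definitions (citing half-life as the median age of the references in a publication year, Price Index as the percentage of references no more than five years old); the function names, the "at most five years" boundary, and the two reference lists are hypothetical, and JCR's exact calculation procedure may differ from this simple median.

```python
from statistics import median


def citing_half_life(citing_year, reference_years):
    """Median age (in years) of the cited references.

    Assumption: a plain median is used; this is a simplification,
    not necessarily JCR's exact procedure.
    """
    ages = [citing_year - year for year in reference_years]
    return median(ages)


def price_index(citing_year, reference_years, window=5):
    """Percentage of references that are at most `window` years old.

    Assumption: the boundary is inclusive (age <= window).
    """
    ages = [citing_year - year for year in reference_years]
    recent = sum(1 for age in ages if age <= window)
    return 100.0 * recent / len(ages)


# Hypothetical reference lists for two articles published in 2008.
long_lived = [2006, 2003, 2000, 1998, 1995, 1990, 1985, 1978]
short_lived = [2007, 2007, 2006, 2006, 2005, 2004, 2003, 2001]

for name, refs in (("long-lived", long_lived), ("short-lived", short_lived)):
    print(name, citing_half_life(2008, refs), price_index(2008, refs))
```

With these made-up lists, the literature with older references has the higher half-life and the lower Price Index, which is the pattern behind the negative correlations reported above.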

Looking at the subdisciplines within computing, one finds the citation lifespan of the literature is shortest within Information Systems and longest within Theory & Methods (see Table 2). The variations among the sub-disciplines of the various disciplines are generally small; on average, the cited half-life stddev = 0.8. The variation among the journals within a discipline is much greater; on average, the cited half-life stddev = 2.2. For computing journals, stddev = 2.0, with the extremes varying from cited half-life = 1.7 and citing half-life = 3.7 to >10 for both half-life measures.

Based on the assumption that everything is changing quickly throughout society, it is easy to believe that the scientific literature is becoming obsolete more quickly than it used to. However, a comprehensive study shows that the median citation age (citing half-life) of scientific publications has increased steadily since the 1970s.10 One likely reason for this increase is the availability of online bibliographic databases and the Internet, making it easier to access older references. A 2008 study reported, "The Internet appears to have lengthened the average life of academic citations by six to eight months."1 Another reason may be the significant increase in the number of references per article.10 Having space for more references allows for increasing the time period for included references.

The reported study10 focused on medical fields, natural sciences, and engineering. To study the evolution of the aging distribution of the computing literature compared to all other disciplines, I investigated cited half-lives from 2003 to 2007 (see the sidebar "How the Study Was Done"); JCR did not provide such information earlier than 2003. I found the cited half-life of computing literature increased from 7.1 years in 2003 to 7.4 years in 2007 (4.7%), the fifth highest increase among the 22 disciplines. Geosciences had the largest increase, from 8.2 years to 8.8 years (7.7%). The disciplines with the largest decreases in cited half-life were Environment/Ecology and Engineering, with declines of 2.4% and 1.8%, respectively. The average increase among all disciplines was 0.1 year (1.9%). Hence, there seems to be a trend that the age of useful computing literature is increasing, not decreasing, relative to other disciplines.
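The relative changes above are simple percentage differences between two half-life values. The following Python sketch shows the arithmetic using the rounded half-lives quoted in the text; the function name is hypothetical. Percentages computed from rounded inputs need not match the reported figures exactly, presumably because the reported ones were derived from unrounded JCR values (an assumption).

```python
def relative_change_percent(old, new):
    """Relative change from `old` to `new`, expressed as a percentage."""
    return 100.0 * (new - old) / old


# Rounded cited half-lives (2003 -> 2007) as quoted in the text.
computing = relative_change_percent(7.1, 7.4)
geosciences = relative_change_percent(8.2, 8.8)

print(f"Computing:   {computing:.1f}%")
print(f"Geosciences: {geosciences:.1f}%")
```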

The increasing interest in research related to environment and ecology may have contributed to less old work being cited in more recent issues of the related journals. Moreover, if my study is replicated in, say, five years, we may observe different trends; for example, the financial crisis at the time of this writing (2009) may contribute to more research being done in economics and business, with more recent work being cited, or shorter half-lives.

Journals vs. conferences. I found an average of 5.9 years for the citing half-life of the 307 conference and workshop proceedings available in the ACM Digital Library. Their citing half-lives are shorter than for computing journals (7.5 years). The two main explanations for why conferences have shorter half-lives are shorter publication delay and fewer references per article.

Publication delay means the cited references grow older due to the publication process per se; that is, the references were younger when the article was submitted than when the article was published. A list of publication delays for computing journals, conferences, and other venues shows a clear tendency for journals to have longer delays than conferences (http://citeseer.ist.psu.edu/pubdelay.html). The average publication delay of journals common to the CiteSeer list and JCR was 20 months. The average publication delay of the conferences common to the CiteSeer list and the ACM Digital Library was eight months. About one-third of the JCR journals and one-quarter of the ACM Digital Library conferences were included. It is unlikely these samples were biased with respect to publication delay. Hence, we can infer that the average difference in publication delay between computing journals and conferences is approximately one year, even though the increasing use of Web-based support tools in the review process of many journals may have contributed to slightly shorter publication delays today than when the list was assembled in 2003.
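The effect of publication delay on reference age can be sketched in Python as follows. The sketch assumes that the delay ages every reference by the same amount, so the median age at publication is the median age at submission plus the delay; the function name and the list of reference ages are hypothetical.

```python
from statistics import median


def median_age_at_publication(ages_at_submission, delay_months):
    """Median reference age (years) once the article is published.

    Assumption: the delay shifts the age of every reference equally.
    """
    delay_years = delay_months / 12
    return median([age + delay_years for age in ages_at_submission])


# Hypothetical reference ages (years) at the time of submission.
ages = [1, 3, 4, 6, 9]

journal = median_age_at_publication(ages, delay_months=20)
conference = median_age_at_publication(ages, delay_months=8)

# The 12-month difference in delay shows up as a one-year gap.
print(f"Journal: {journal:.2f}, conference: {conference:.2f}, "
      f"gap: {journal - conference:.2f} years")
```

Under this assumption, identical reference lists submitted to a journal and to a conference would differ by about one year in median reference age at publication, purely because of the longer journal delay.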

The 11,719 articles in the ACM conferences (as of 2008) include, on average, 16.1 references, while the 36,004 articles in the JCR computing journals include, on average, 27.1 (26.2 if review articles are excluded); that is, journals include 70% more references than conferences. Journal articles are also generally longer than conference articles; thus, more space is available for related work. Consequently, the citing half-lives of journals may be higher than the citing half-lives of conference proceedings due in part to journals citing more references.




When calculating the half-lives of the conference proceedings, I excluded references to URLs because their year of "publication" was rarely indicated in their citations; moreover, for those URL references with the year indicated, it is likely that the content of the actual Web site has changed, meaning we cannot necessarily use the indicated year to calculate the age of the content of a given Web site (unlike printed publications). However, another study16 investigated how long URLs are accessible by inspecting the URLs referenced in articles in IEEE Computer and Communications from 1995 to 1999, reporting, "A noteworthy parallel can be observed between the four years we calculated as the half-life of referenced URLs and five years given as the median citation age for computer science." One may reasonably question the extent to which one is able to compare the accessibility of URLs with the inclusion of references in articles.

Nevertheless, the claim that the half-life in CS is five years is based on four issues of the Proceedings of the International Conference on Information Systems.3 Due to the differences between journals and conferences, it would be more correct to compare the four-year half-life of URLs with the citing half-life of 7.5 years in Table 1, as both figures result from analyzing journals. In this case, referenced articles would have a useful life approximately twice as long as the URLs. However, given that I found large variations in the citing half-lives between journals and conferences with respect to printed publications, one may find large variations in the half-lives of referenced URLs as well. Therefore, one should analyze much larger samples than only two journals to make a general statement.
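The exclusion of URL references when calculating citing half-lives can be sketched in Python as follows. This is only an illustration of the idea, not the procedure actually used in the study: the URL-detection pattern, the function name, the input format (citation text paired with an optional year), and the sample references are all assumptions.

```python
import re
from statistics import median

# Assumption: a reference counts as a "URL reference" if its text
# contains something that looks like a Web address.
URL_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)


def citing_half_life_excluding_urls(citing_year, references):
    """Median age of references, skipping URL references and
    references without a known publication year.

    `references` is a list of (citation_text, year_or_None) pairs.
    Returns None if no usable reference remains.
    """
    ages = [
        citing_year - year
        for text, year in references
        if year is not None and not URL_PATTERN.search(text)
    ]
    return median(ages) if ages else None


# Hypothetical references from an article published in 2008.
references = [
    ("Hypothetical journal article", 2001),
    ("Project page, http://example.org/tool", None),
    ("Hypothetical conference paper", 2005),
    ("Online manual, http://example.org/manual", 2004),
    ("Hypothetical book", 1996),
]

print(citing_half_life_excluding_urls(2008, references))
```

Note that the sketch drops the URL reference even when a year is given, reflecting the concern that the content behind a URL may have changed since the indicated year.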


Conclusion

My investigation found that the aging of the computing literature is not atypical compared with other scientific research disciplines, indicating that the research front in computing does not move more quickly than its counterpart in other disciplines. It is also a sign that computing is an established research discipline with long-lasting challenges and complex research problems taking years to solve. For example, developing software systems that are reliable, efficient, user-friendly, and maintainable has been, and probably always will be, a grand challenge in computing research. Moreover, it typically takes 10 to 20 years for a technology to mature from being a good research idea to being widely used in practice.14,15 This fundamental aspect of computing, combined with the importance of software in modern society, means there is no reason funding for computing research should not be at a level comparable to that found in other scientific disciplines, including physics and clinical medicine.

These results have further consequences. First, half of the citations in the computing literature are more than seven years old. Publications older than seven years may be viewed as old but still considered relevant by the authors citing them. Therefore, one should take care criticizing or ignoring literature just because it is "old"; other criteria must be used to judge quality and relevance.

The relatively long cited half-life of computing literature also indicates that the time lag between submitting a paper to a journal and it being published in that journal should not be a major concern; such work is rarely obsolete before publication. In any case, the delay may be significantly shorter in the future, as an increasing number of journals publish their articles online shortly after accepting them for publication.

My results also indicate that computing journals are no more likely than journals of other scientific disciplines to have their subscriptions cancelled or to be stored for a shorter time. There are significant variations among journals, so decisions regarding particular journals must be based on more detailed information about those journals.

Here, I've discussed obsolescence at a coarse level (disciplines and sub-disciplines). It would be interesting to study obsolescence within categories of computing topics and research. For example, how does obsolescence vary between research that aims to solve (minor) practical problems and research that aims to develop comprehensive theories? However, this would require substantial effort, given there is no database that easily provides relevant data similar to what JCR provided for the study I've reported here.


Acknowledgments

I thank Chris Wright for help clarifying basic concepts and stimulating comments; Gilles Brassard and anonymous referees for valuable comments; and Alexander Ottesen, Birgitte Refsland, and Bjørnar Snoksrud for help collecting the reported data.


References

1. Barnett, G.A. and Fink, E.L. Impact of the Internet and scholar age distribution on academic citation age. Journal of the American Society for Information Science and Technology 59, 4 (Feb. 2008), 526–534.

2. Burrell, Q. Stochastic modelling of the first-citation distribution. Scientometrics 52, 1 (Sept. 2001), 3–12.

3. Cunningham, S.J. and Bocock, D. Obsolescence of computing literature. Scientometrics 34, 2 (Oct. 1995), 255–262.

4. De Solla Price, D.J. Citation measures of hard science, soft science, technology and nonscience. In Communication Among Scientists and Engineers, C.E. Nelson and D.K. Pollack, Eds. D.C. Heath and Company, Lexington, MA, 1970, 3–22.

5. De Solla Price, D.J. Networks of scientific papers: The pattern of bibliographic references indicates the nature of the scientific research front. Science 149, 3683 (July 1965), 510–515.

6. Glänzel, W. Towards a model for diachronous and synchronous citation analyses. Scientometrics 60, 3 (Dec. 2004), 511–522.

7. Goodrum, A.A., McCain, K.W., Lawrence, S., and Giles, C.L. Scholarly publishing in the Internet age: A citation analysis of computer science literature. Information Processing and Management 37, 5 (Sept. 2001), 661–675.

8. ISI Web of Knowledge. Journal Citation Reports on the Web 4.2. The Thomson Corporation, 2008; http://www.isiknowledge.com/JCR

9. Ladwig, J.P. and Sommese, A.J. Using cited half-life to adjust download statistics. College & Research Libraries 66, 6 (Nov. 2005), 527–542.

10. Larivière, V., Archambault, É., and Gingras, Y. Long-term variations in the aging of scientific literature: From exponential growth to steady-state science (1900–2004). Journal of the American Society for Information Science and Technology 59, 2 (Jan. 2008), 288–296.

11. Lazowska, E.D. and Patterson, D.A. An endless frontier postponed. Science 308, 5723 (May 2005), 757.

12. Misa, T.J. Understanding 'how computing has changed the world.' IEEE Annals of the History of Computing 29, 4 (Oct.-Dec. 2007), 52–63.

13. Moed, H.F. Citation Analysis in Research Evaluation. Springer, Dordrecht, The Netherlands, 2005.

14. Osterweil, L.J., Ghezzi, C., Kramer, J., and Wolf, A.L. Determining the impact of software engineering research on practice. IEEE Computer 41, 3 (Mar. 2008), 39–49.

15. Redwine Jr., S.T. and Riddle, W.E. Software technology maturation. In Proceedings of the Eighth International Conference on Software Engineering (London, Aug. 28–30). IEEE Computer Society Press, Los Alamitos, CA, 1985, 189–200.

16. Spinellis, D. The decay and failures of Web references. Commun. ACM 46, 1 (Jan. 2003), 71–77.

17. Stinson, R. and Lancaster, F.W. Synchronous versus diachronous methods in the measurement of obsolescence by citation studies. Journal of Information Science 13, 2 (Apr. 1987), 65–74.

18. Száva-Kováts, E. Unfounded attribution of the 'half-life' index-number of literature obsolescence to Burton and Kebler: A literature science study. Journal of the American Society for Information Science and Technology 53, 13 (Nov. 2002), 1098–1105.

19. Weingarten, F. Government funding and computing research priorities. ACM Computing Surveys 27, 1 (Mar. 1995), 49–54.


Author

Dag I.K. Sjøberg (dagsj@ifi.uio.no) is a professor of software engineering in the Department of Informatics at the University of Oslo, Norway.


Footnotes

DOI: http://doi.acm.org/10.1145/1810891.1810911


Tables

Table 1. Half-lives and Price Index for all scientific disciplines.

Table 2. Half-lives and Price Index for computing.


Figure. Citation half-lives.



©2010 ACM  0001-0782/10/0900  $10.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2010 ACM, Inc.


 
