Such assessments usually take the form of rankings. Two of the best-known rankings, the U.S. News and World Report ranking [10] and the 1993 National Research Council ranking of U.S. doctoral programs [8], use a comprehensive methodology that combines objective indicators with subjective polls. The National Research Council is conducting a new assessment of research-doctorate programs, and several important changes to the methodology have been recommended to improve it [7]. One recommendation is to represent the assessment of an institution as a probable range rather than a single point, addressing the “spurious precision” issue. Another is to adopt more quantitative measures, such as research output and student output, to address the “soft criteria” issue.
One such quantitative measure of research output is publications. Typically, a publication-based ranking chooses a research field; selects a group of publication venues considered prestigious, representative, and influential for that field; assigns a score to each paper an institution or an author has published in those venues; and ranks institutions and authors by the sums of their scores. In computer science, the most recent publication-based ranking of institutions was completed in 1996 by Geist et al. [4]. They selected 17 archival research journals published by ACM or IEEE and gave one point to each paper appearing in those journals from January 1990 to May 1995. In systems and software engineering, the Journal of Systems and Software (JSS) has published an annual publication-based assessment of scholars and institutions since 1994 [9] (henceforth, the JSS ranking). Each year's JSS ranking is based on papers published in the previous five years in six journals selected by a 1991 survey of the JSS editorial board.
Assessing research institutions and scholars is a complex social and scientific process. While publication-based ranking can be used alone, it should probably serve as one quantitative indicator in a more comprehensive methodology because an assessment of institutions solely based on publications does not effectively reflect other important factors such as student quality, research funding, or impact.
Existing publication-based rankings have several limitations. One major limitation is that they are usually performed manually. As a result, both the number of journals considered and the time span over which papers are assessed are limited, reducing the scope of such rankings. Manual ranking may also explain why journals are considered exclusively while other important channels of academic communication, such as conference proceedings, are neglected. A second limitation is that reported rankings are confined to specific fields; each new research field requires constructing a new ranking that manually repeats the same basic procedure. These two limitations yield a third: inflexible criteria. For example, the two rankings noted here made different decisions about which journals to include and how to score each paper. While those decisions were made for legitimate reasons, the criteria cannot be altered without repeating the entire labor-intensive process. Together, these limitations hinder the applicability of publication-based ranking.
Accommodating Flexible Policies
To overcome these limitations, we developed a framework that facilitates automatic and versatile publication-based ranking. It utilizes electronic bibliographic data to process a broader range of journals and conferences spanning periods longer than those previously used. This framework can accommodate many policy choices.
The contribution of this framework is not to provide yet another ranking result or methodology. Instead, we enhance traditional publication-based ranking by supplying a policy-neutral automatic mechanism that can be utilized with various choices. When combined with well-designed criteria, this framework can provide results comparable to those produced by manual processes, with reduced cost and wider applicability. Such results can be used as an additional data point in a more comprehensive assessment. However, it is the evaluator who decides whether to adopt a publication-based ranking scheme and, if so, how to conduct such a ranking with the framework.
The general steps in a publication-based ranking are as follows (a code sketch illustrating them appears after this list):
- Choose a field;
- Select representative publication venues for the field and, optionally, assign a weight to each venue;
- Set the time range for consideration;
- Assign a score to each published paper, possibly biased by the venue’s weight;
- Divide the score among the authors if the paper has more than one;
- Sum the scores for each scholar and each institution; and finally,
- Rank the scholars and institutions based on sums of their scores.
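To make these steps concrete, the following minimal Python sketch walks through the pipeline under hypothetical policy choices; the venue names, weights, year range, and the equal split among co-authors are illustrative assumptions, not the framework's actual implementation.

```python
from collections import defaultdict

# Hypothetical policy choices an evaluator might make (illustrative only).
VENUE_WEIGHTS = {"Journal A": 1.0, "Journal B": 1.0, "Conference C": 0.8}
YEARS = range(2000, 2005)  # 2000-2004 inclusive

# Each record: (venue, year, [(author, institution), ...])
papers = [
    ("Journal A", 2002, [("Alice", "Univ. X"), ("Bob", "Univ. Y")]),
    ("Conference C", 2003, [("Carol", "Univ. X")]),
]

scholar_scores = defaultdict(float)
institution_scores = defaultdict(float)

for venue, year, authors in papers:
    if venue not in VENUE_WEIGHTS or year not in YEARS:
        continue                          # outside the selected venues or years
    score = VENUE_WEIGHTS[venue]          # one (possibly weighted) point per paper
    share = score / len(authors)          # equal split among co-authors
    for author, institution in authors:
        scholar_scores[author] += share
        institution_scores[institution] += share

# Rank by total score, highest first.
ranked_scholars = sorted(scholar_scores.items(), key=lambda kv: kv[1], reverse=True)
ranked_institutions = sorted(institution_scores.items(), key=lambda kv: kv[1], reverse=True)
```

Each of the policy decisions discussed next corresponds to one of the constants or formulas in this sketch.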
The most important policy decisions involved in this process are the following:
What field to rank? The field can be an entire discipline, such as all of computer science, or a subfield, such as systems and software engineering, as in [4] and [9], respectively. Our framework supports both choices. Any science and engineering discipline can be ranked with the framework, as long as journal and conference publications are considered an effective measure of scholarship in that field and bibliographic data for the field is available. The framework does not apply well to the humanities and social sciences, where books and reviews are the prominent forms of publication.
What journals and conferences are considered important in the field? This is the key decision in the ranking process because selecting different journals and conferences may produce significantly different results. Neither of the previous rankings included proceedings of conferences or workshops [4, 9]. We believe such proceedings are important channels of academic communication, especially in a rapidly developing field such as computer science, so the framework provides support for them. However, the framework does not impose any restriction on conference or journal selection; evaluators can make any decision based on their own criteria.
What weight should papers from different journals or conferences receive? In the previous rankings [4, 9], papers from different journals always received the same weight. Since those evaluators selected only the most prestigious refereed journals in their respective fields, this decision was rational. However, different evaluators might disagree about which publication venues are the most prestigious. The framework gives evaluators wide freedom in assigning weights to journals and conferences; they can treat their selections equally or differently.
Different answers to these questions can produce very different rankings, even for the same field, as will be illustrated in our second validation effort described here. The purpose of this framework is to provide mechanisms that support flexible policies. Those choices are made by an evaluator when conducting a ranking, not by the framework. When viewing results of rankings facilitated by this framework, users should be aware of the choices and carefully inspect them.
Other noteworthy policy choices are:
What entities to rank? The framework supports ranking a wide range of entities. It can rank both scholars and institutions, handle both academic and industrial institutions, and cover scholars and institutions from not only the U.S. but also from other geographical regions.
How many years of publications should be included in the ranking? The rankings in [4, 9] selected publications from the previous five years. While this is a reasonable range for assessing the current quality of a scholar or an institution, our framework allows an evaluator to use any preferred range of years.
How should the score be distributed among co-authors for a multi-author paper? After a paper's score has been assigned based on its venue, the ranking in [4] apportioned the score equally among the authors, while the ranking in [9] gave each author slightly more than an equal share so that multi-author papers are not penalized. Our framework supports both schemes, along with others.
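For illustration, the sketch below contrasts an equal split with a hypothetical "bonus" split in which each co-author receives slightly more than an equal share; the bonus formula is invented for this example and is not the exact scheme used in [9].

```python
def equal_split(score: float, n_authors: int) -> float:
    """Each author receives an equal fraction of the paper's score, as in [4]."""
    return score / n_authors

def bonus_split(score: float, n_authors: int, bonus: float = 0.2) -> float:
    """Hypothetical scheme: each author receives slightly more than an equal
    share, so multi-author papers are penalized less; a sole author still
    receives the full score. (Illustrative only; not the exact JSS formula.)"""
    return score * (1 + bonus) / (n_authors + bonus)

# For a one-point paper with three authors:
#   equal_split(1.0, 3)  -> 0.3333...
#   bonus_split(1.0, 3)  -> 0.375
```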
The biggest difficulty in developing this framework is the limited availability of bibliographic data that record the institution with which an author was affiliated when a paper was published. While there are several digital bibliographic services, such as DBLP [3], the Computer Science Bibliography [2], and the ACM Digital Library [1], only INSPEC [6], which is also used by the IEEE Xplore digital library [5], consistently provides author affiliation information. A limitation of INSPEC is that it records the affiliation of the first author only, so manual editing is needed to obtain the affiliations of all authors. We chose INSPEC as the data source when designing the framework and conducting the experiments.
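The record layout below is a hypothetical illustration of this constraint: in the raw INSPEC-style data only the first author carries an affiliation, so any policy that credits every author's institution requires the missing affiliations to be filled in by hand.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Author:
    name: str
    affiliation: Optional[str]  # None for all but the first author in raw data

@dataclass
class PaperRecord:
    venue: str
    year: int
    authors: List[Author]

raw_record = PaperRecord(
    venue="Journal A",
    year=2003,
    authors=[
        Author("Alice", "Univ. X"),  # affiliation provided by the source
        Author("Bob", None),         # must be supplied by manual editing
    ],
)
```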
Validation
To validate our framework, we used it to perform two rankings. The first ranking assessed U.S. computing graduate programs. We adopted similar criteria as used in [4] and reached comparable results. This validated the hypothesis that our automatic framework can produce results comparable to those from manual processes. The second ranking evaluated institutions and scholars in software engineering. We adopted similar criteria as used in the JSS ranking [9], but focused on different publication venues. Our results were different, illustrating that different policies can produce disparate results. We will discuss possible reasons for the differences.
Ranking of U.S. Computing Graduate Programs. We used our framework to repeat the ranking of [4], based on publication data from 1995 to 2003. Other than the different time range, the only other difference in criteria was that we chose a scoring scheme that gives credit only to the first author, whereas in [4] the score was distributed equally among the authors. We adopted this policy because INSPEC records the affiliation of the first author only and we decided not to perform manual editing for this ranking. The resulting top 50 U.S. computing graduate programs are listed in Table 1. The first column is the rank from our ranking; the second column is the rank reported in [4]. Overall, the two rankings largely agree with each other: the difference between the two ranks is within five for 21 programs. This confirms the plausibility of our framework, with the key difference that our ranking was performed automatically.
Ranking in Software Engineering. We also ranked the software engineering field. We chose two journals and two conferences that are generally considered the most prestigious in the field: ACM Transactions on Software Engineering and Methodology, IEEE Transactions on Software Engineering, the International Conference on Software Engineering, and the ACM SIGSOFT International Symposium on the Foundations of Software Engineering. We gave each paper the same score of one point. To compare with the JSS ranking [9], we adopted the score distribution scheme used in that ranking. To support this scheme, we manually edited the bibliographic data to include affiliation information for multiple authors. Based on data from 2000 to 2004, the resulting top 50 institutions and scholars are listed in Table 2 and Table 3, respectively. The first column in each table is the rank from our ranking. The second column in each table is the rank reported in the JSS ranking, if it can be found in [9].
As can be seen from Table 2 and Table 3, most of the top scholars and institutions are U.S.-based, but a significant number of them come from Europe. Thus, we believe the ranking is representative of the entire field, not just U.S.-centric. This is expected given the international nature of the conferences and journals.
Reasons for the difference. Our ranking is significantly different from the JSS ranking. The second column in Table 2 shows that only two of the top 15 institutions from the JSS ranking are among the top 15 of our ranking. The second column in Table 3 shows that only two of the top 15 scholars from the JSS ranking are among the top 15 of our ranking.
Two policy disparities probably contribute to the difference. First, we included two conferences that the JSS ranking did not consider. Second, our ranking and the JSS ranking selected different journals, and those journals contributed scores differently. The JSS ranking relies heavily on papers published in JSS itself and in the journal Information and Software Technology, and it also includes a magazine, IEEE Software, while it is influenced very little by ACM Transactions on Software Engineering and Methodology. This study illustrates that the framework can produce dramatically different results when used with different policies, even for the same field.
Conclusion
Rankings based on publications can supply useful data in a comprehensive assessment process [4, 8]. We have provided an automatic and versatile framework to support such rankings of research institutions and scholars. While producing results comparable to those from manual processes, the framework saves labor for evaluators and allows more flexible policy choices. However, the results must be viewed within the context of the adopted policy choices.
The current ranking framework has some limitations, such as not differentially weighting papers from the same venue and relying on English bibliographic data. Additional improvements are also possible, such as using citations as additional quality assessments and incorporating complete author affiliation information automatically.
The framework and data used in this article can be downloaded from www.isr.uci.edu/projects/ranking/.