David A. Patterson’s article "Latency Lags Bandwidth" (Oct. 2004) cut straight to the heart of an issue I’ve experienced daily for the past seven years writing operational support software in the telecom satellite industry. I applaud him and Communications for presenting a topic that is too often ignored in our bandwidth-centric world. Whether we’re developing satellites or the most up-to-date computer cluster, we must never forget that computers exist in the physical world and obey physical laws—no matter how much we may try to abstract away from reality.
David A. Patterson’s hypothesis that "Latency Lags Bandwidth" (Oct. 2004) may have some unexpected implications. In the late 1960s, the designers of the original ARPANET/Internet devised packet switching as a way to conserve bandwidth. Packet switching enables a large number of lightly loaded "virtual circuits" to share the bandwidth provided by a network’s physical links. This benefit is offset by an increase in latency due to packet resequencing delays, as well as to queuing delays at congestion points within the network.
With Internet bandwidth increasingly abundant and latency emerging as a critical concern for certain classes of applications, the decision to favor bandwidth over latency may have to be reevaluated. If Patterson’s hypothesis is indeed correct, the motivation for rethinking the rationale for packet switching is likely to be increasingly compelling over time.
Another point is that physical latency, which depends primarily on speed and distance, must be distinguished from congestion-based latency, which is associated with load-dependent queuing delays. Patterson’s hypothesis applies primarily to physical latency, though the two are sometimes confused. Content-delivery services (such as Akamai) advertise that they reduce physical latency by employing proprietary algorithms that cache Web content at edge servers physically close to end users. Although this observation is certainly correct, it isn’t always the most important part of the story.
Under heavy loads, a content-delivery service can distribute download requests across thousands of edge caches, dramatically reducing queuing delays (congestion-based latency) at the original Web servers and at the array of edge caches. This powerful but nonproprietary benefit received little attention during the high-profile launch of these services several years ago. The distinction between reducing physical latency and reducing congestion-based latency remains poorly understood for content-delivery services.
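The distinction between the two kinds of latency can be made concrete with a rough model. The sketch below is illustrative only: it assumes a signal speed in fiber of about 200,000 km/s, treats each server as a simple M/M/1 queue, and uses made-up distances and request rates. None of these figures come from any content-delivery service.

```python
# Illustrative model only: propagation delay plus M/M/1 congestion delay.
# All constants and workloads below are assumptions chosen for the example.
SPEED_IN_FIBER_KM_PER_S = 200_000.0  # assumed, roughly two-thirds of c


def physical_latency_s(distance_km):
    """One-way propagation delay: depends only on distance and speed."""
    return distance_km / SPEED_IN_FIBER_KM_PER_S


def congestion_latency_s(arrival_rate, service_rate):
    """Mean time in an M/M/1 system (waiting plus service): 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("server is saturated; the queue grows without bound")
    return 1.0 / (service_rate - arrival_rate)


def total_latency_s(distance_km, arrival_rate, service_rate, servers=1):
    """Total latency when the load is split evenly across `servers`."""
    per_server_rate = arrival_rate / servers
    return physical_latency_s(distance_km) + congestion_latency_s(
        per_server_rate, service_rate
    )


# A distant origin server handling 950 requests/s at a capacity of 1,000/s.
origin = total_latency_s(4000, 950, 1000)

# The same load spread over 10 nearby edge caches of equal capacity.
edge = total_latency_s(50, 950, 1000, servers=10)

print(f"origin: {origin * 1000:.2f} ms, edge: {edge * 1000:.2f} ms")
```

In this hypothetical case the origin's 40 ms splits evenly between propagation and congestion, so moving content closer and spreading the load each remove a comparable share of the delay.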
If total latency includes a significant amount of congestion-based latency, end-to-end measurement might also raise subtle issues, especially for distributed applications where latency at geographically dispersed locations must be associated with a single underlying transaction and then summed appropriately. In contrast, measuring bandwidth is much more straightforward. Measurements are generally made at a single point, and since bandwidth is a function of raw bit rate, there is no need to identify individual transactions and then trace them through multiple servers and physical links.
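The asymmetry between the two measurements can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the record layout, site names, and values are hypothetical, and summing is appropriate only if the hops of a transaction run one after another rather than in parallel.

```python
from collections import defaultdict

# Hypothetical latency records gathered at different sites:
# (transaction_id, site, latency_ms)
records = [
    ("tx1", "frontend", 12.0),
    ("tx2", "frontend", 9.0),
    ("tx1", "database", 30.0),
    ("tx2", "database", 41.0),
    ("tx1", "payment", 55.0),
]


def end_to_end_latency_ms(records):
    """Associate each record with its transaction, then sum per transaction.

    Assumes the hops are sequential, so their latencies add.
    """
    totals = defaultdict(float)
    for tx_id, _site, latency_ms in records:
        totals[tx_id] += latency_ms
    return dict(totals)


def bandwidth_bits_per_s(bits_observed, interval_s):
    """Bandwidth needs only a bit count at one point over a time interval."""
    return bits_observed / interval_s


latency_by_tx = end_to_end_latency_ms(records)
link_rate = bandwidth_bits_per_s(8_000_000, 2.0)
```

The latency calculation depends on every site tagging its records with a shared transaction identifier; the bandwidth calculation needs no such correlation.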
The widespread acceptance of Web services and service-oriented architecture will increase the use of latency-based service level agreements. This situation will reinforce the importance of measuring latency in complex environments and may produce further evidence supporting Patterson’s hypothesis.
Jeffrey P. Buzen
As Buzen is a pioneer in performance measurement, it would be wise to agree with his observations. My article concentrated on the underlying technology, ignoring issues like congestion. If the workload is so great that congestion occurs, then latency increases; in this circumstance, more bandwidth would thus help latency.
The article ignored the greater bandwidth that comes with redundant components and parallel transmission. Hence, the increase in bandwidth over latency is understated in the field. At a time when we have bountiful bandwidth (due to the large investment in optical fiber), it is no longer clear that packet switching is the best approach.
I also want to acknowledge a number of colleagues who helped me with these ideas but whose names were not included in the article: David Anderson, Gordon Bell, Doug Clark, John Fowler, Armando Fox, Jim Gray, Mark Hill, John Hennessy, John Ousterhout, and John Wawrzynek.
David A. Patterson
Share Disturbing Information on E-Voting
The articles on e-voting (Oct. 2004) were shocking and disturbing. I implore ACM to release more of this information to the public by making the articles publicly available at www.acm.org and by issuing more press releases to get the word out. This information is simply too important not to be distributed as widely as possible.
What a timely and wide-ranging issue (Oct. 2004). Bravo. The one puzzling article was "Implementing Voting Systems: The Georgia Method" by Brit J. Williams and Merle S. King. On the one hand, it provided a view of what some election officials think. On the other, it would appear to fail the standard tests for bias and factuality I expect from Communications.
The subtitle "Ensuring the Integrity of Elections in Georgia" seems almost like an oxymoron after the 2002 election in which Senator Max Cleland (a triple-amputee Vietnam veteran) was defeated following disputed votes and slanderous personal attacks.
The factuality of the statement "due to the requirements of a secret ballot it is impossible to conduct an accurate study of voting patterns" is disputed just pages later in another article—"Auditing Elections" by Douglas W. Jones—as well as by other methods (such as exit polling).
I fully support the ACM Statement on Voting Systems and hope our profession and government take it to heart.
Bobby Kahn was then-Governor Roy Barnes’s chief of staff and is currently the chairman of the Georgia Democratic Party. Kahn has said, "I would love to believe that Roy Barnes and Max Cleland really won on Election Day but lost because of some voting conspiracy. That just didn’t happen." He also said fears about e-voting are being fanned "by a combination of computer people who don’t know anything about politics and political people who don’t know anything about computers" [Gwinnett Daily Post, 2002].
Auditing methods can indirectly reveal certain behaviors of voters, voting technologies, and tabulation methods, but they cannot reveal a voter’s intent. Undervoting the top race on a ballot can occur for many reasons. If your audit methods reveal voter intent, then you have violated the secrecy of the ballot. In Georgia, that’s against the law.
Brit J. Williams and Merle S. King
Ensure Quality Assurance for Bioinformatics Applications
Thank you for the special section on bioinformatics (Nov. 2004).
As the science enabled by bioinformatics moves from the research laboratory to the clinic, the related software applications move with it. However, the quality criteria for research tools are very different from the quality criteria for clinical applications. One helps generate hypotheses for investigation, the other guides decisions about whether and how to treat patients. This situation places special responsibilities on software professionals.
I propose that publication of any bioinformatics or computational biology application must reflect a professional obligation to include a related quality-assurance policy, with an explicit statement on whether or not the application is considered clinical grade.
Follow the Money
James Y.L. Thong et al. left out of their article "What Leads to User Acceptance of Digital Libraries?" (Nov. 2004) what I think is the main reason people don’t use digital libraries: They cost too much.
Although I am a member of both ACM and IEEE, it would cost me several hundred dollars extra to access each of their libraries; neither is complete, so one needs both. I have no doubt many others are also needed, depending on a particular consumer’s specialty.
Until the cost of such libraries is made much more reasonable than it is today, many people will simply consider them not cost effective.
David H. Jameson
Chappaqua, New York
Share Open Source Sources
Science requires repeatable experiments. One of the problems with studying software development scientifically is that so many projects are proprietary. In order to get permission to study proprietary projects, researchers often promise not to disclose project details—making it impossible for others to check their work.
One of the big advantages of open source software (OSS) is that it is open; the source code is available for everyone to examine. I was disappointed to find that the article "Open Source Software Development Should Strive for Even Greater Code Maintainability" by Ioannis Samoladas et al. (Oct. 2004) on measuring open source did not tell us what projects were used. The authors cited a prior article about ethics and open source and said it is unethical to name the projects. I have not read that article, but I strongly believe that any article that draws that conclusion is junk.
Because Samoladas et al. did not report the sources of their data, no one can check their work. In fact, for their project to be repeatable, they must publish the software they used to gather metrics, the versions of the OSS they used, and anything else necessary for people to repeat their experiments.
Johnson states that keeping OSS project names secret is bad (in his blog, he calls it "pernicious") and that people who say such a thing are irrelevant when it comes to free software and open source. In order to defend our opinion—that OSS project names should be kept secret—we recommend reading a 2001 article "Ethics and Open Source" by Khaled El Emam in Empirical Software Engineering. Please note that El Emam was ranked first among software engineering scientists for the period 1999–2003 in a 2002 article by Robert L. Glass and T.Y. Chen "An Assessment of Systems and Software Engineering Scholars and Institutions (1999–2001)" in the Journal of Systems and Software. In the same article, Aristotle University was ranked 15th among the top institutions in the field of systems and software engineering; for more on our work on free software and open source, see sweng.csd.auth.gr/foss_pubs.html.
Scientific research is based on samples and the measurements on them. Since in our research we used a sample of OSS projects, the names were not important for obtaining the published results. Anyone who wishes to perform a similar study will readily find another sample and be able to compare the results.
Our decision to not publish OSS project names in our work is correct and fully supported by El Emam’s article.
I. Samoladas, I. Stamelos, and L. Angelis