Bistro: a Scalable and Secure Data Transfer Service For Digital Government Applications

Government at all levels is a major collector and provider of data.

Communications of the ACM
Volume 46, Number 1 (2003), Pages 50-51
Leana Golubchik, William C. Cheng, Cheng-Fu Chou, Samir Khuller, Hanan Samet, C. Justin Wan


Our project focuses on the collection of data over wide-area networks (WANs) and addresses the scalability issues that arise in Internet-based massive data collection applications. Security is also a central issue for data collection applications that use a public infrastructure such as the Internet, because the privacy and integrity of the data must be protected. Numerous digital government applications require data collection over WANs [5].

One compelling example of such an application is the electronic submission of income tax forms to the Internal Revenue Service. Other digital government applications include collecting census data, federal statistics, and surveys; gathering and tallying electronic votes; collecting crime data for the U.S. Department of Justice; collecting data from sensors for disaster response; collecting data from geological surveys; collecting electronic filings of patents, permits, and securities applications (for the SEC); and accepting submissions of grant proposals and contract bids. All these applications share scalability and security needs.

The poor performance that current digital government users may experience with existing technology (as in Figure 1a) is largely due to how independent data transfers using TCP/IP work over the Internet. TCP/IP is good at sharing bandwidth equally among data streams, so in large-scale applications each client receives only a very small share of the bandwidth and sees poor performance. Given that TCP/IP is here to stay for the foreseeable future, what is needed is a scalable yet cost-effective solution that can be easily deployed over existing Internet technology.
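To make the equal-sharing argument concrete, the following minimal sketch computes how long one upload takes when a bottleneck link is divided evenly among concurrent clients. It is an idealized model only: it assumes a perfectly equal split and ignores TCP dynamics such as slow start, loss, and timeouts. The function name, link capacity, and file size are hypothetical values chosen for illustration, not figures from the project.

```python
def fair_share_transfer_time(link_bps: float, n_clients: int, file_bytes: int) -> float:
    """Seconds needed for one upload when n_clients split link_bps equally.

    Idealized assumption: every stream gets exactly link_bps / n_clients.
    """
    if n_clients < 1:
        raise ValueError("n_clients must be at least 1")
    per_client_bps = link_bps / n_clients
    return file_bytes * 8 / per_client_bps


if __name__ == "__main__":
    LINK_BPS = 100e6        # hypothetical 100 Mbit/s bottleneck at the server
    FILE_BYTES = 1_000_000  # hypothetical 1 MB submission
    for n in (1, 1_000, 100_000):
        seconds = fair_share_transfer_time(LINK_BPS, n, FILE_BYTES)
        print(f"{n:>7} concurrent clients: {seconds:.2f} s per upload")
```

Under these assumed numbers, a transfer that takes a fraction of a second for a single client stretches to hours when a hundred thousand clients upload at once, which is the per-client effect the paragraph above describes.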

We are designing and developing a system called Bistro, which addresses the scalability needs of digital government data collection applications while allowing them to share the same infrastructure and resources efficiently, cost-effectively, and securely [1]. Bistro’s basic approach is to introduce intermediate hosts, called bistros, which allow the traditional “synchronized client push” approach to be replaced with a “nonsynchronized combination of client-push and server-pull” approach (as depicted in Figure 1b). This in turn spreads the workload on the destination server and the network over time, which eliminates hot spots and significantly improves performance for both clients and servers. Our ongoing research [2, 4] indicates that orders-of-magnitude improvements can be achieved with the Bistro architecture and the corresponding data collection algorithms it affords.
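The difference between the two approaches can be illustrated with a small simulation of the load seen by the destination server. This is a sketch under stated assumptions, not Bistro's actual scheduling algorithm: clients are assigned to bistros round-robin, the destination pulls a fixed number of uploads per time slot, and all names (`direct_push_load`, `bistro_pull_load`, `pull_per_slot`) are hypothetical.

```python
from collections import Counter


def direct_push_load(arrival_slots):
    """Uploads hitting the destination in each time slot when clients push directly."""
    return Counter(arrival_slots)


def bistro_pull_load(arrival_slots, n_bistros, pull_per_slot):
    """Clients push to bistros on arrival; the destination pulls later at its own pace.

    Returns (uploads held per bistro, uploads pulled by the destination per slot).
    Assumptions: round-robin client-to-bistro assignment and a fixed pull rate.
    """
    if n_bistros < 1 or pull_per_slot < 1:
        raise ValueError("n_bistros and pull_per_slot must be at least 1")
    if not arrival_slots:
        return Counter(), {}
    arrivals = Counter(arrival_slots)
    bistro_load = Counter(i % n_bistros for i in range(len(arrival_slots)))
    dest_load = {}
    pending = 0                      # uploads waiting at bistros
    slot = min(arrivals)
    last_arrival = max(arrivals)
    while slot <= last_arrival or pending > 0:
        pending += arrivals.get(slot, 0)
        pulled = min(pending, pull_per_slot)
        dest_load[slot] = pulled     # destination never exceeds its chosen rate
        pending -= pulled
        slot += 1
    return bistro_load, dest_load


if __name__ == "__main__":
    # Hypothetical deadline rush: 1,000 clients all submit in the same slot.
    arrivals = [0] * 1000
    print("direct peak:", max(direct_push_load(arrivals).values()))
    per_bistro, per_slot = bistro_pull_load(arrivals, n_bistros=10, pull_per_slot=50)
    print("peak per bistro:", max(per_bistro.values()))
    print("destination peak:", max(per_slot.values()), "over", len(per_slot), "slots")
```

In this toy model the destination's peak load falls from the full client population to whatever pull rate it chooses, at the cost of finishing the collection over more time slots; the clients themselves are done as soon as a bistro has accepted their data.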

Bistro’s design allows gradual deployment and experimentation over the Internet (by simply downloading the Bistro server software and installing it on public servers). Bistro’s security protocol and trust structure [3] are designed so that only encrypted data travels through bistros, which need not be trusted. This means a government agency does not need to trust bistros installed by other agencies or commercial institutions, yet these untrusted bistros can still significantly improve the agency’s data collection performance. Each application (within each agency) can have its own scalability, security, fault tolerance, and other data collection needs, and these applications and agencies can still share available resources, if so desired, across all Bistro servers.

We believe an appropriately designed single infrastructure such as Bistro can address all digital government wide-area data collection needs in a scalable, secure, and cost-effective manner. (For more information, see bourbon.usc.edu/iml/bistro/.)


Figures

Figure 1. Data collection for digital government applications.

References

    1. Bhattacharjee, S., Cheng, W.C., Chou, C-F., Golubchik, L., and Khuller, S. Bistro: A platform for building scalable wide-area upload applications. ACM SIGMETRICS Performance Evaluation Review 28, 2 (Sept. 2000), 29–35. (Also presented at the Workshop on Performance and Architecture of Web Servers, June 2000.)

    2. Cheng, W.C., Chou, C-F., and Golubchik, L. Performance of online batch-based digital signatures. Submitted for publication.

    3. Cheng, W.C., Chou, C-F., Golubchik, L., and Khuller, S. A secure and scalable wide-area upload service. In Proceedings of the 2nd International Conference on Internet Computing 2 (June 2001), 733–739.

    4. Cheng, W.C., Chou, C-F., Golubchik, L., Khuller, S., and Wan, Y.C. On a graph-theoretic approach to scheduling large-scale data transfers. Submitted for publication.

    5. Cheng, W.C., Chou, C-F., Golubchik, L., Khuller, S., and Samet, H. Scalable data collection for Internet-based digital government applications. In Proceedings of the 1st National Conference on Digital Government Research (Los Angeles, CA, May 2001), 108–113.

    This work is supported in part by the NSF Digital Government Grant #0091474.
