Advertisement

Author Archives

Research and Advances

Bistro: a Scalable and Secure Data Transfer Service For Digital Government Applications

Government at all levels is a major collector and provider of data.Our project focuses on the collection of data over wide-area networks (WANs) and addresses the scalability issues that arise in the context of Internet-based massive data collection applications. Furthermore, security, due to the need for privacy and integrity of the data, is a central issue for data collection applications that use a public infrastructure such as the Internet. Numerous digital government applications require data collection over WANs [5].One compelling example of such an application is the Internal Revenue Service's electronic submission of income tax forms. Other digital government applications include collecting census data, federal statistics, and surveys; gathering and tallying of electronic votes; collecting crime data for the U.S. Justice department; collecting data from sensors for disaster response applications; collecting data from geological surveys; collecting electronic filings of patents, permits, and securities (for SEC) applications; grant proposals and contract bids submissions; and so on. All these applications have scalability and security needs in common.The poor performance that may be experienced by current digital government users, given the existing state of technology (as in Figure 1a), is largely due to how (independent) data transfers using TCP/IP work over the Internet. TCP/IP is good at equally sharing bandwidth between data streams, which in large-scale applications can lead to poor performance for individual clients (as they receive only a very small share of this bandwidth). Given that TCP/IP is here to stay for the foreseeable future, what is needed is a scalable yet cost- effective solution that can be easily deployed over the existing Internet technology.We are designing and developing a system called Bistro, which addresses the scalability needs of digital government data collection applications while allowing them to share the same infrastructure and resources efficiently, cost-effectively, and securely [1]. Bistro's basic approach is to introduce intermediate hosts---bistros---which allow replacement of a traditionally "synchronized client push" approach with a "nonsynchronized combination of client-push and server-pull" approach (as depicted in Figure 1b). This in turn allows spreading of the workload on the destination server and the network over time, with subsequent elimination of hot spots as well as significant improvements in performance for both clients and servers. Our ongoing research [2, 4] indicates that orders of magnitude of improvement can be achieved with the Bistro architecture and the corresponding data collection algorithms it affords.Bistro's design allows for a gradual deployment and experimentation over the Internet (by simply downloading Bistro server software and installing it on public servers). Bistro's security protocol and trust structure [3] are designed such that only encrypted data travels through (not necessarily trusted) bistros. This means a government agency does not need to trust bistros installed by other agencies or commercial institutions. At the same time, these (untrusted) bistros can significantly improve the agency's data collection performance. Each application (within each agency) can have its own scalability, security, fault tolerance, and other data collection needs, and these applications and agencies can still share available resources, if so desired, across all Bistro servers.We believe an appropriately designed single infrastructure such as Bistro can address all digital government wide-area data collection needs in a scalable, secure, and cost-effective manner. (For more information, see bourbon.usc.edu/iml/bistro/.
Research and Advances

-se of the Sand Spatial Browser For Digital Government Applications

Numerous federal agencies produce official statistics made accessible to ordinary citizens for searching and data retrieval. This is frequently done via the Internet through a Web browser interface. If this data is presented in textual format, it can often be searched and retrieved by such attributes as topic, responsible agency, keywords, or press release. However, if the data is of spatial nature, for example, in the form of a map, then using text-based queries is often too cumbersome for the intended audience. We describe the use of the SAND Spatial Browser to provide more power to users of these databases by enabling them to define and explore the specific spatial region of interest graphically. The SAND Spatial Browser allows users to form either purely spatial or mixed spatial/nonspatial queries intuitively, which can present information to users that might have been missed if only a textual interface was available.
Research and Advances

Introduction

Information technologies are being applied vigorously by governmental units at national, regional, and local levels around the world. The application of IT to government service is often termed "e-government" and the larger concept of government that depends upon IT to achieve basic missions is termed "digital government." This distinction is, of course, lexically arbitrary, but serves to distinguish R&D specifically aimed at creating techniques for applying IT to government operations. Such R&D efforts also consider the long-term impact of these applications on citizens and government itself.
Research and Advances

Data structures for quadtree approximation and compression

A number of data structures for representing images by quadtrees without pointers are discussed. The image is treated as a collection of leaf nodes. Each leaf node is represented by use of a locational code corresponding to a sequence of directional codes that locate the leaf along a path from the root of the tree. Somewhat related is the concept of a forest which is a representation that consists of a collection of maximal blocks. It is reviewed and refined to enable the representation of a quadtree as a sequence of approximations. In essence, all BLACK and WHITE nodes are said to be of type GB and GW, respectively. GRAY nodes are of type GB if at least two of their sons are of type GB; otherwise, they are of type GW. Sequences of approximations using various combinations of locational codes of GB and GW nodes are proposed and shown to be superior to approximation methods based on truncation of nodes below a certain level in the tree. These approximations have two important properties. First, they are progressive in the sense that as more of the image is transmitted, the receiving device can construct a better approximation (contrast with facsimile methods which transmit the image one line at a time). Second, they are proved to lead to compression in the sense that they never require more than MIN(B, W) nodes where B and W correspond to the number of BLACK and WHITE nodes in the original quadtree. Algorithms are given for constructing the approximation sequences as well as decoding them to rebuild the original quadtree.
Research and Advances

A quadtree medial axis transform

As printed Quadtree skeletons are exact representations of the image and are used because they are observed to yield space efficiently and a decreased sensitivity to shifts in contrast with the quadtree. The QMAT can be used as the underlying representation when solving most problems that can be solved by using a quadtree. An algorithm is presented for the computation of the QMAT of a given quadtree by only examining each BLACK node's adjacent and abutting neighbors. Corrected Abstract (published as corrigendum in CACM 27, 2 (February 1984) p. 151) The skeletal and medial axis transform concepts used in traditional image processing representations are adapted to the quadtree representation. The result is the definition of of a new data structure termed the Quadtree Medial Axis Transform (QMAT). A QMAT results in a partition of the image into a set of nondisjoint squares having sides whose lengths are sums of powers of 2 rather than, as is the case with quadtrees, a set of disjoint squares having sides of lengths which are powers of 2. The motivation is not to study skeletons for the usual purpose of obtainings approximations of the image. Instead, quadtree skeletons are exact representations of the image and are used because they are observed to yield space efficiency and a decreased sensitvity to shifts in contrast with the quadtree. The QMAT can be used as the underlying representation when solving most problems that can be solved by using a quadtree. An algorithm is presented for the computation of the QMAT of a given quadtree by only examining each BLACK node's adjacent and abutting neighbors. Analysis of the algorithm reveals an average execution time proportional to the complexity of the image, i.e., the number of BLACK blocks.

Shape the Future of Computing

ACM encourages its members to take a direct hand in shaping the future of the association. There are more ways than ever to get involved.

Get Involved