Spatial Data Integration in a Collaborative Design Framework

Building an extended architecture to eliminate boundaries to accessing and sharing data.

The need to share and integrate data from several sources has been a fundamental motivator throughout the evolution of computer technology. Sharing spatial data is necessary due to the growing complexity of environmental analyses and due to the high cost of data acquisition, since large volumes of spatial data are a main characteristic of Geographical Information Systems (GISs). During the last decade, the data interchange between organizations via the Internet became not only possible, but also essential. The largest barrier encountered has been the heterogeneous data structures utilized by the different data models to represent the graphic and non-graphic components of the spatial data. In most cases, GISs use proprietary structures designed to maximize spatial data manipulation performance, but often at the expense of the needs of information sharing. Therefore, the integration of these data sources is a very complex task due to the diversity of existing applications and the fact that spatial data incorporates different semantics, formats, and representations.

Mediation is one of the database integration techniques that provides data access for distributed and heterogeneous data sources. The mediation technique uses one global component (a mediator) and multiple local components (wrappers). The mediator accesses the information shared by the wrappers, which in turn manage the access to the local data sources [3]. Given this situation, we have developed an extended architecture known as X-Arc [2], a data integration architecture that uses metadata, heterogeneous data access, and mediation concepts in its components to provide services to locate, share, and publish data sources through the Web. The metadata managed by X-Arc aids the execution of these services and can be used as a catalog of available data sources.

However, considering that GIS users belong to specific application areas representing different competencies, political agendas, or social interests and often lack the knowledge about other areas with which they may have to work, we propose the use of SPeCS [1]. SPeCS supplies a collaborative work environment that increases synergy and user cooperation during the data interchange between GIS data sources. The SPeCS architecture is composed of three main functional modules: Decision Tools, Knowledge Tools, and the Integration Layer. X-Arc belongs to the last layer. We took advantage of SPeCS’ support for collaboration in X-Arc’s integration, publication, and access tasks. Here we describe the X-Arc Spatial Data Integration in the SPeCS Collaborative Design Framework.

The SPeCS Architecture

The SPeCS architecture offers a cooperative, common, flexible, and easy-to-use work environment in which the members of a group may be geographically distributed in heterogeneous environments and must interact during a decision-making process.

Interface. The prototype presented here incorporates an interface with a main screen containing the components necessary for the work group (see Figure 1). It shows the areas for the discussion and allows private or shared notes, the latter visible to the rest of the team. Besides these graphic interface areas, there is a menu with the operations for creating and modifying the model entities. For example, a user may introduce a new climatic characteristic, not foreseen up to that moment, or subdivide an area because it contains important details that must be studied by the group.

The system lets users write textual notes in their preferred word processor and share them with other members of the work group, optionally indicating the region or subregion of the cartographic base to which the argumentation refers. Members of the group may present proposals at any time, and a historical list may aid in the argumentation.

As in all argumentation systems, the SPeCS prototype allows users to state their positive or negative positions, along with related explanations and the thoughts brought up as the proposal is discussed. The system also allows open proposals by anonymous authors, since this may be useful for institutional or political reasons. This feature may be turned off to accommodate more rigid administrative structures, or in cases where it is misused. Geographic expression must be easy and access to all kinds of heterogeneous data sources must be feasible, since this is the main characteristic of this environment.

Functional Modules. The main functional modules of the SPeCS architecture listed earlier are shown in Figure 2. The Decision Tools aid the activities involved in a decision-making process, from the problem definition phase to the specification and documentation of all solutions created by the discussion. The tools also help track the responsibilities assumed by the group members at the time the decisions were made, all of which can be referenced to geographic elements. These tools offer an environment where members of a group can prioritize issues, perform different types of voting (yes/no, multiple choice, and weighted choice), evaluate different criteria, draft documents as a group, brainstorm, analyze projects, rank solutions, and so forth.
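
The three voting types named above can be illustrated with a minimal sketch. The function names and ballot layouts below are assumptions made for illustration only; the article does not describe how the SPeCS Decision Tools represent or count votes.

```python
from collections import Counter


def tally_yes_no(ballots):
    """Count yes/no ballots; each ballot is True (yes) or False (no)."""
    yes = sum(1 for ballot in ballots if ballot)
    return {"yes": yes, "no": len(ballots) - yes}


def tally_multiple_choice(ballots):
    """Each ballot names exactly one option; return the votes per option."""
    return dict(Counter(ballots))


def tally_weighted_choice(ballots):
    """Each ballot maps options to weights; return the summed weight per option."""
    totals = Counter()
    for ballot in ballots:
        for option, weight in ballot.items():
            totals[option] += weight
    return dict(totals)


def rank(totals):
    """Order options from highest to lowest score; ties are broken by name."""
    return sorted(totals, key=lambda option: (-totals[option], option))
```

A group could, for example, pass the result of tally_weighted_choice to rank to obtain an ordered list of candidate solutions for discussion.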

The Knowledge Tools include mechanisms for decision planning, decision metering, visual chatting (V-Chat), knowledge management, and dynamic surveys. The Decision Planning tool allows the customization of the workflow decision process, disabling and introducing decision steps in accordance with the problem-solving features. The V-Chat tool (see www.weidorz.org) encapsulates chat, forum, and email facilities, as shown earlier in Figure 1. The basis of the conversation is a semantic framework, allowing the reuse of decision argumentation.

The knowledge management facilities include: a search tool for locating relevant information from the integrated databases or from the Internet; a knowledge capturing tool to input new or provide access to existing rules, models, and data in participating data sources; and a knowledge generation engine with an inference machine that provides the integration of rules, models, and data for the generation of new knowledge based on previously validated data. A knowledge validation tool helps the users, through automatic and manual mechanisms, retrieve, analyze, and transfer the validated knowledge from the Knowledge Generated Repository to the appropriate Data, Rule, and/or Model repositories.

A Dynamic Survey tool was implemented to help researchers cope with the constant changes in the GIS requirements, enabling easy changes to survey structure and attributes. This tool can also be used to capture feedback from local actors for comparison with feedback from other regions where some kind of inference has been made in the past by group decisions, allowing a measure of decision efficiency. The Decision Meter tool can compare solutions proposed in the past with the actual results in order to improve the conceptual basis for future decisions. This allows success and failure factors to be pinpointed, thus helping researchers to make better decisions. A Workflow Engine is also being developed, which will allow a smoother interaction with the GIS environment.

The Integration Layer, implemented by X-Arc, permits the integration and sharing of heterogeneous data sources using mediation techniques. This module provides basic data services to the upper-level modules; X-Arc will be discussed in more detail later in this article.

When dealing with the knowledge repositories, it is important that all group components share the same taxonomy and enjoy the same standard of knowledge, thus encouraging consensus based on common premises. The absence of this type of perspective may lead to the failure of all efforts put into the assembly of this argumentation structure, since the lack of common knowledge concerning the problem and the solution makes it difficult for the group ideas and opinions to converge.

X-Arc

There are several solutions for database integration, which are presented through different names: multi-database systems (MDBS), mediators, federated databases, and interoperable systems. These solutions seek to preserve the existing systems, maintaining their autonomy through a system with facilities for the access and sharing of the data stored in local databases. The mediation architecture emerges as an interesting integration proposal and is formed by one global component (mediator) and multiple local components (wrappers) [3].
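
The mediator and wrapper roles can be sketched as follows. This is a minimal illustration of the general mediation pattern, not the X-Arc implementation: the class names, the record fields (station, rainfall_mm), and the two local formats are all hypothetical.

```python
import csv
import io


class DictWrapper:
    """Hypothetical wrapper over a local source held as a list of dicts."""

    def __init__(self, rows):
        self._rows = rows

    def records(self):
        # Translate the local field names into the shared schema.
        for row in self._rows:
            yield {"station": row["name"], "rainfall_mm": float(row["rain"])}


class CsvWrapper:
    """Hypothetical wrapper over a local source stored as CSV text."""

    def __init__(self, text):
        self._text = text

    def records(self):
        # A different local format, mapped to the same shared schema.
        for row in csv.DictReader(io.StringIO(self._text)):
            yield {"station": row["station_id"], "rainfall_mm": float(row["mm"])}


class Mediator:
    """Global component: queries every wrapper and merges the results."""

    def __init__(self, wrappers):
        self._wrappers = list(wrappers)

    def query(self, predicate):
        return [
            record
            for wrapper in self._wrappers
            for record in wrapper.records()
            if predicate(record)
        ]
```

Each local source keeps its own format and remains autonomous; only the wrapper knows how to translate it, so the mediator can treat all sources uniformly.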

The X-Arc extended architecture is intended to integrate data sources using mediation technology, providing interoperability among the data repositories spread over the Web while maintaining their autonomy. It encapsulates mediation systems in its components to provide integration, publishing, and searching services to its users. Furthermore, the components and their services exploit the availability and computing capabilities of the data sources and minimize the heterogeneity of their data (structure, format, type, and metadata) in order to provide uniform access to the data.

This architecture is based on mediators and ontologies to provide the binding of heterogeneous data to its domain applications. To facilitate the identification of related data sources and their appropriate domain application, each data source is associated with metadata, which provides the domain ontology. X-Arc provides a uniform view of the available data repositories, organized according to the domain taxonomy. Domain ontologies support the search for related environmental data by representing the semantic concepts of the domain. This approach therefore promotes data integration and provides mechanisms to translate data requests across ontologies.

Figure 3 shows a basic configuration of the X-Arc architecture with its main components: Manager and Partner. For each main component and published data source, there is associated metadata. In X-Arc, the Manager is an agent essentially responsible for the management of the whole architecture, while the Partner is an agent responsible for data publishing.

The Manager component is responsible for the metadata maintenance of all data sources integrated by the architecture. It has a metadata database that is constructed during the architecture startup and updated at each new Partner initialization. Through its Query Processor service, this component is able to rewrite a query and send the resulting sub-queries to the appropriate Partners for execution. The Manager does not have to be consulted for a Partner to have its data accessed, but the metadata it provides can help users locate the shared data spread through the architecture.
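
The Manager's role as a metadata catalog that routes sub-queries can be sketched as follows. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, and the catalog here records only which Partner publishes which dataset, whereas the metadata described in the article is richer.

```python
class Manager:
    """Hypothetical catalog of which Partner publishes which dataset."""

    def __init__(self):
        # dataset name -> set of partner ids that publish it
        self._catalog = {}

    def register_partner(self, partner_id, datasets):
        """Update the catalog when a Partner initializes."""
        for dataset in datasets:
            self._catalog.setdefault(dataset, set()).add(partner_id)

    def locate(self, dataset):
        """Return the partners that publish a dataset, in a stable order."""
        return sorted(self._catalog.get(dataset, set()))

    def plan(self, datasets):
        """Split a request into one sub-query (list of datasets) per partner."""
        sub_queries = {}
        for dataset in datasets:
            for partner_id in self.locate(dataset):
                sub_queries.setdefault(partner_id, []).append(dataset)
        return sub_queries
```

Because plan only mentions Partners found in the catalog, a Partner that does not publish any requested dataset receives no sub-query.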

The Partner component is mainly responsible for the data integration and publication services. A data repository must be controlled by one Partner, which can publish and integrate more than one repository. Each Partner is a mediator, and for each data source there is an associated wrapper and metadata describing the location, platform details, data format, structure, and representation, as well as the access type available. The data sources can be scalar or spatial and can be stored in tables or files (containing structured or semi-structured data). Therefore, the Partners can be:

  • Scalar: in which the mediation system built into the data integration services supplies descriptive data stored in tables, text, or files, for example.
  • Spatial: in which the mediation system built into the data integration services supplies spatial data through maps and images.
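
The per-source metadata kept by a Partner might be modeled as in the sketch below. The field names loosely follow the attributes listed above (location, platform, data format, access type), but the exact schema, the class names, and the example values used with them are assumptions, not the X-Arc metadata model.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    SCALAR = "scalar"
    SPATIAL = "spatial"


@dataclass(frozen=True)
class SourceMetadata:
    """Hypothetical description of one published data source."""
    name: str
    kind: Kind
    location: str
    platform: str
    data_format: str
    access_type: str


class Partner:
    """Holds the metadata of every data source this Partner publishes."""

    def __init__(self, partner_id):
        self.partner_id = partner_id
        self._sources = {}

    def publish(self, metadata):
        # A source name identifies one repository within this Partner.
        if metadata.name in self._sources:
            raise ValueError(f"source already published: {metadata.name}")
        self._sources[metadata.name] = metadata

    def describe(self, kind=None):
        """List published sources, optionally only scalar or only spatial."""
        return [m for m in self._sources.values() if kind is None or m.kind is kind]
```

A Manager could collect the output of describe from each Partner to build its catalog.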

The metadata manipulated by the architecture consists of information collected by the Manager from the Partners and from each data source to be published by the architecture. The metadata helps the user locate relevant data and helps the Manager during query execution. Both Manager and Partner are responsible for metadata maintenance and data source integration (scalar or spatial). With X-Arc, environmental users gain the autonomy and agility of mediation-based integration to share their data with each other, and achieve cooperation and data sharing between heterogeneous and distributed databases.

X-Arc Services

The X-Arc services are responsible for the integration, publishing, and searching facilities of the architecture. They provide the basic data services needed in the SPeCS architecture as shown in Figure 2, and are discussed in more detail here:

  • Meta Schema Manager: responsible for the collection, management, and publication of metadata associated with each data source. It provides information to the Query Processor during the query rewriting and the sub-query distribution process;
  • Non-Spatial Data Publication: responsible for the scalar data publication. All scalar data retrieved by the Query Processor or by the Data Access service is published in XML format. Through the Meta Schema Manager, this service can convert the content of the data sources from their native format to XML;
  • Spatial Data Publication: similar to the Non-Spatial Data Publication service, except that the data published in XML format is spatial;
  • Data Access: responsible for the direct access to the published data in the Partners and the passing of the data to the scalar and spatial publication services. This service encapsulates scalar and spatial mediation systems, which access the data sources through wrappers;
  • Query Processor: responsible for the manipulation of each query sent to the architecture. The main operations executed by the Query Processor are query optimization, rewriting and distribution, and query result aggregation. The sub-queries are sent only to the Partners that publish the data requested in the queries; and
  • Security: responsible for the data security, providing three data access levels—complete, read only, and denied. Each data request to the Data Access or Query Processor is validated by the Security service before the execution.
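
Two of the services above, Security validation before data access and publication in XML, can be illustrated together in a short sketch. The three access levels come from the text; the function names, the mapping of levels to read and write operations, and the XML element layout (dataset, record) are assumptions, since the article does not specify the XML schema X-Arc uses.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class Access(Enum):
    COMPLETE = "complete"
    READ_ONLY = "read only"
    DENIED = "denied"


def is_allowed(level, operation):
    """Check an operation ('read' or 'write') against an access level."""
    if level is Access.COMPLETE:
        return operation in ("read", "write")
    if level is Access.READ_ONLY:
        return operation == "read"
    return False


def publish_as_xml(source_name, rows):
    """Serialize rows (dicts) to XML; keys must be valid XML tag names."""
    root = ET.Element("dataset", name=source_name)
    for row in rows:
        record = ET.SubElement(root, "record")
        for field, value in row.items():
            ET.SubElement(record, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


def data_access(level, source_name, rows):
    """Validate the request first, then hand the data to publication."""
    if not is_allowed(level, "read"):
        raise PermissionError(f"access to {source_name} is denied")
    return publish_as_xml(source_name, rows)
```

Placing the check at the start of data_access mirrors the rule that every request is validated by the Security service before execution.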

Conclusion

The requirements and specifications of a Spatial Decision Support Collaborative System in environmental projects helped to define the complexity of spatial decision making and the need for coordination among the researchers, thus introducing new interaction perspectives. Through the mediation-based architecture of the X-Arc SPeCS component, users can share scalar and spatial data through the Web, using XML as a standard for information interchange. The metadata managed by this architecture helps in locating, accessing, and sharing spatial data. The SPeCS tools supply the collaborative work environment with several facilities for remote GIS users and, through X-Arc services, a user can locate, access, and retrieve all data available in the architecture in an easy, flexible, and agile way.

The X-Arc architecture introduces new perspectives in studying the collaborative and cooperative interaction among users in a GIS environment to share and integrate data sources during a decision-making process.

Figures

Figure 1. SPeCS interface prototype.

Figure 2. SPeCS functional model.

Figure 3. X-Arc main components.

References

    1. Medeiros, S. et al. SPeCS—A spatial decision support collaborative system for environment design. In Proceedings of the Fifth International Workshop in CSCW in Design. (Hong Kong, Nov. 2000).

    2. Pinto, G. et al. X-Arc spatial data integration in the SPeCS collaborative design framework. In Proceedings of the Sixth International Workshop in CSCW in Design. (London, CA, July 2001).

    3. Wiederhold, G. Mediation in information systems. ACM Computing Surveys 27, 2 (1995).

    This work was partially financed by CNPq and CAPES agencies.
