A variety of application areas involve spatial and temporal data. Spatial data is used in GIS, logistics, CAD/CAM, robotics, and medical imaging, to name a few. Systems in financial markets, inventory management, professional sports, consumer research, and payroll regularly use historical or temporal data. Within these kinds of systems, spatial and temporal integrity constraints are of key importance.1,2 Modeling business rules in traditional applications has been strongly advocated.7,11
Yet, in the development of spatial and temporal applications, we see that few constraints are typically modeled during conceptual database design. We feel that part of the reason for this may be the lack of standardization in capturing the semantics of spatial and temporal constraints. Our work proposes the notions of windows of evaluation and constraint applicability bounds to address part of the problem.
We now discuss some spatio-temporal data constraints, using a simplified Customer Relationship Management (CRM) scenario to illustrate the kinds of constraints we are referring to and to provide some motivating examples.
Example: A CRM application tracks information about customers and promotions (simplified conceptual model in Borges et al.,1 with place-holders for links to other data themes).
The constraints for the “Enroll In” relationship include the following set of business rules with spatio-temporal components. The rules are numbered for ease of reference.
(1) During 2008, a corporate customer may be enrolled in only two promotions at any point in time; but since the promotions vary by U.S. state, we restrict this to two promotion enrollments in each state. (2) If the same promotion is run more than once in a month, the customer can enroll in it a maximum of 3 times in that month. (3) Over the course of a month (in 2008/2009), the customer may choose to enroll in 5 different promotions per state in the West or Midwest regions; (4) and cumulatively avail of no more than $5000 in credits per sales region over the course of a quarter. (5) Large customers (purchases over 250K per annum) can enroll in 25 promotions across a single business region and receive up to $50000 in credits over the year. (6) The Thanksgiving (run between November 25th and 28th) and year-end promotions (run between December 16th and 31st) for 2008 are limited to the first 1000 customers who enroll.
Examining these constraints, we see they cannot typically be modeled during conceptual database design. Designers either omit them or document them in free-text form and then hope they eventually get implemented in PL/SQL, Java, C#, or some other procedural application language. A number of advantages accrue7 from modeling constraint semantics during conceptual design, including reduced cost of correcting errors, consistent application enforcement, improved ability to validate requirements with users, and visibility of rules (which in turn allows for easier modification in a dynamic environment). Therefore, we feel it is advantageous to capture these rules during the conceptual design stage itself. To do so, a standard and concise mechanism for explicitly modeling the temporal and spatial semantics of rules should exist. We propose the notions of evaluation windows and applicability bounds over time and space to succinctly capture a very important aspect of spatial and temporal business rule semantics.
Understanding the Semantics of Spatio-Temporal Constraints
Examining the constraints in the CRM application described earlier, we begin to see a few common themes. Let us examine the first rule from our CRM example.
Given that we are dealing with cardinality, there are, naturally, the numeric limits, i.e., two (promotions). These are commonly expressed in most conceptual models. However, let us consider the impact of temporal and spatial semantics on these rules. These cardinality limits are specified with respect to specific time frames and spatial extents, i.e., at each point in time and in each U.S. state. We term these the evaluation windows, since they denote the time and space window over which the constraint is evaluated. Finally, we see a third component, which bounds the extent of the evaluation windows, and hence we term it the applicability bounds: i.e., the policy is valid during 2008 and in the U.S.A. The constraint applicability bounds refer to when and where the constraint is to be enforced. The window of evaluation specifies the size of the time or space window over which the constraint is evaluated. We analyze the semantics of evaluation windows and applicability bounds, as well as the relationships between the two concepts, in more detail in the rest of the article.
Windows of Evaluation. The time window over which a constraint is evaluated classifies it into either sequenced or non-sequenced. Similarly, the space window over which a constraint is evaluated classifies it into either unit-spaced or collective-spaced.a
Sequenced. A temporal constraint is sequenced with respect to a similar snapshot constraint in the conceptual model, when it extends the semantics of a snapshot constraint to an evaluation valid at each point in time. Given a snapshot constraint, “A customer can enroll in up to two promotions,” the corresponding sequenced constraint is: “A customer can enroll in up to two promotions at each point in time.” A point in time is dependent on the granularity of the underlying data. Thus if the granularity of the enroll_in relationship is day, the customer can be enrolled in up to two promotions in any given day.
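To make the sequenced reading concrete, the following minimal sketch checks the "up to two promotions at each point in time" rule at day granularity. The record layout, the sample data, and the function name `sequenced_violations` are assumptions made for illustration only; they are not part of the proposed approach.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical enroll_in records: (customer, promotion, first_day, last_day),
# with both end days inclusive and day as the data granularity.
ENROLLMENTS = [
    ("acme", "spring_sale", date(2008, 3, 1), date(2008, 3, 10)),
    ("acme", "loyalty", date(2008, 3, 5), date(2008, 3, 20)),
    ("acme", "clearance", date(2008, 3, 8), date(2008, 3, 12)),
]

def sequenced_violations(enrollments, customer, max_promotions=2):
    """Return the days on which `customer` holds more than `max_promotions`."""
    active = defaultdict(set)  # day -> distinct promotions held on that day
    for cust, promo, start, end in enrollments:
        if cust != customer:
            continue
        day = start
        while day <= end:  # visit every granule (day) the enrollment covers
            active[day].add(promo)
            day += timedelta(days=1)
    return sorted(d for d, promos in active.items() if len(promos) > max_promotions)

violations = sequenced_violations(ENROLLMENTS, "acme")
# The three enrollments overlap only from 2008-03-08 through 2008-03-10.
```

The check is repeated for every day covered by the data, which is what distinguishes a sequenced constraint from a snapshot constraint evaluated against a single state.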
A current constraint can be defined as a special kind of sequenced constraint that is evaluated against the data at the current time. The time instant corresponding to current time is not fixed, instead it is continuously changing. The temporal variable now can capture the current time in the database system.9 The granularity for the current time depends on the underlying data.
Non-sequenced. A constraint is non-sequenced if it is applicable over a set of granularity-instants or a time interval (including the lifetime of the data entity). Given a snapshot constraint, “A customer can participate in up to two promotions,” a non-sequenced constraint could state that, “Over the course of a month, a customer may choose to enroll in five different promotions.” For a non-sequenced constraint to apply to a construct (relationship/entity class), that construct must be time varying.
The nature of the time frame of evaluation classifies non-sequenced constraints into two types: fixed-window or sliding-window.
Fixed-window constraints are evaluated once over the duration specified by the constraint. An example of fixed-window constraint is: “No more than 1000 different customers can enroll in a promotion between 2008-12-16 and 2008-12-31.” The temporal element specifying the window size and the temporal element specifying the applicability bounds are identical.
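A fixed-window constraint of this kind can be sketched as a single evaluation over the whole interval. The event layout, the sample data, and the helper names below are hypothetical, chosen only to illustrate the idea.

```python
from datetime import date

# Hypothetical enrollment events: (customer, promotion, enrollment_day).
EVENTS = [
    ("acme", "year_end", date(2008, 12, 16)),
    ("bolt", "year_end", date(2008, 12, 20)),
    ("acme", "year_end", date(2008, 12, 28)),  # repeat customer, counted once
    ("cargo", "year_end", date(2009, 1, 2)),   # outside the window, ignored
]

def customers_in_window(events, promotion, start, end):
    """Distinct customers who enrolled in `promotion` from start to end, inclusive."""
    return {c for c, p, day in events if p == promotion and start <= day <= end}

def fixed_window_ok(events, promotion, start, end, max_customers):
    """Evaluate the constraint once over the whole window."""
    return len(customers_in_window(events, promotion, start, end)) <= max_customers

# Window size and applicability bounds coincide for a fixed-window constraint.
window = (date(2008, 12, 16), date(2008, 12, 31))
within_limit = fixed_window_ok(EVENTS, "year_end", *window, max_customers=1000)
over_tiny_limit = fixed_window_ok(EVENTS, "year_end", *window, max_customers=1)
```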
Sliding-window constraints are evaluated multiple times within the constraint applicability bounds. The length of the temporal element specifying the window size is strictly smaller than that of the constraint applicability bounds. If the lengths are equal, the constraint is equivalent to a fixed-window constraint. Associated with a sliding-window constraint is the size (or granularity) of each slide. We have the following relationship between the constraint aspects: granularity of the data ≤ granularity (size) of the slide ≤ size of the evaluation window ≤ size of the applicability bounds.
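The ordering among data granularity, slide size, window size, and applicability bounds can be illustrated with a small generator that enumerates the evaluation windows inside the bounds. This is only a sketch: it measures every size in days, and the function name `evaluation_windows` and its parameters are assumptions, not notation from this article.

```python
from datetime import date, timedelta

def evaluation_windows(bounds_start, bounds_end, window_days, slide_days, data_days=1):
    """Yield inclusive (start, end) windows that lie fully inside the bounds."""
    bounds_days = (bounds_end - bounds_start).days + 1
    # data granularity <= slide size <= window size <= applicability bounds
    if not data_days <= slide_days <= window_days <= bounds_days:
        raise ValueError("expected data <= slide <= window <= bounds")
    start = bounds_start
    while start + timedelta(days=window_days - 1) <= bounds_end:
        yield start, start + timedelta(days=window_days - 1)
        start += timedelta(days=slide_days)

# A 7-day window sliding one day at a time within a 10-day applicability period.
sliding = list(evaluation_windows(date(2008, 1, 1), date(2008, 1, 10), 7, 1))

# When the window is as long as the bounds, only one evaluation remains,
# which corresponds to the fixed-window case.
fixed = list(evaluation_windows(date(2008, 1, 1), date(2008, 1, 10), 10, 10))
```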
An example of a sliding-window constraint is: “Over the course of a month, a customer may choose to enroll in five different promotions; the constraint should be evaluated each calendar month, and is applicable between 2008-01-01 and 2008-12-31.” We indicated previously that the enroll_in relationship had a granularity of day. So, we have the information as noted in Table 1.
We observe: the granularity of enroll_in (day) ≤ the granularity of the slide (month) ≤ the size of the window (month) ≤ the constraint applicability period (2008-01-01 to 2008-12-31).
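The monthly example can be sketched as follows, with the window and the slide both equal to one calendar month and the applicability bounds set to 2008. The event layout, the sample data, and the function name `monthly_violations` are hypothetical.

```python
from collections import defaultdict
from datetime import date

def monthly_violations(events, bounds_start, bounds_end, max_promotions=5):
    """events: (customer, promotion, enrollment_day) at day granularity.

    Returns the (customer, year, month) windows that exceed the limit,
    considering only events inside the applicability bounds.
    """
    seen = defaultdict(set)  # (customer, year, month) -> distinct promotions
    for customer, promotion, day in events:
        if bounds_start <= day <= bounds_end:
            seen[(customer, day.year, day.month)].add(promotion)
    return sorted(k for k, promos in seen.items() if len(promos) > max_promotions)

# Hypothetical data: six promotions in March 2008, two in April 2008,
# and six in January 2009 (outside the applicability bounds).
EVENTS = (
    [("acme", f"promo_{i}", date(2008, 3, 1 + i)) for i in range(6)]
    + [("acme", f"promo_{i}", date(2008, 4, 1 + i)) for i in range(2)]
    + [("acme", f"promo_{i}", date(2009, 1, 1 + i)) for i in range(6)]
)

violations = monthly_violations(EVENTS, date(2008, 1, 1), date(2008, 12, 31))
```

January 2009 also exceeds the limit, but it is not reported because it falls outside the applicability bounds.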
Sliding-window and sequenced constraints are similar (likewise, a parallel can be drawn between current constraints and fixed-window non-sequenced constraints) in that each is evaluated multiple times within the time interval specified by the applicability bounds. The difference lies in the granularity of the window of evaluation. The granularity for sequenced constraints is always the smallest possible granularity that is meaningful for the data.
Unit-spaced. A unit-spaced spatial constraint is evaluated at each point in space. Given a non-spatial constraint, “Customers may enroll in up to twenty different promotions,” a corresponding unit-spaced constraint may be, “Customers may enroll in up to two different promotions in each state.”b A unit in space is dependent on the granularity of the underlying data, in this case the spatial granularity associated with the enroll_in relationship (i.e., a U.S. state).
Collective-spaced. A constraint is collective-spaced if it is applied over a set of spatial granularity units. Given a non-spatial constraint, “Customers may enroll in up to twenty different promotions,” a corresponding collective-spaced constraint is, “Customers may enroll in up to twenty different promotions across the whole country.” Or say, “Customers may enroll in up to 10 different promotions in each region (West, Southwest, Midwest, Southeast, and so on).”
Collective-spaced constraints may be classified into two types, those to be evaluated over a fixed-window (area) of space, or those to be evaluated over a sliding-window (area) of space.
Fixed-window constraints are evaluated once for the window specified by the constraint. The spatial element specifying the window size and the spatial element specifying the constraint applicability are identical for this type of constraint. An example of a fixed-window constraint is: “Customers may enroll in up to twenty different promotions across the United States.”
Sliding-window constraints are evaluated multiple times over the constraint applicability bounds. The size of the spatial element specifying the window size should be strictly smaller than the size of the constraint applicability bounds. Associated with the sliding-window constraint is the granularity of each slide. We have the following relationship between the constraint aspects: granularity of the data ≤ size of the slide ≤ size of the evaluation window ≤ size of the constraint applicability.
An example of a sliding-window constraint is: “Customers may enroll in up to 10 different promotions in each region.” Earlier, we indicated that the enroll_in relationship had a spatial granularity of U.S. state. So, we come up with the spatial information for this constraint as noted in Table 2.
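The spatial windows can be sketched in the same style: each state-level record is assigned to an evaluation window, and the limit is checked per window. The state-to-region lookup below is an invented placeholder rather than a reproduction of Table 2, and the limits are kept small so the sample data stays short.

```python
from collections import defaultdict

# Hypothetical lookup from the data granularity (state) to a region.
STATE_TO_REGION = {"CA": "West", "OR": "West", "WA": "West", "IL": "Midwest", "OH": "Midwest"}

def spatial_violations(enrollments, window_of, max_promotions):
    """enrollments: (customer, promotion, state); window_of maps a state to its window."""
    seen = defaultdict(set)  # (customer, window) -> distinct promotions
    for customer, promotion, state in enrollments:
        seen[(customer, window_of(state))].add(promotion)
    return sorted(k for k, promos in seen.items() if len(promos) > max_promotions)

ENROLLMENTS = [
    ("acme", "p1", "CA"),
    ("acme", "p2", "CA"),
    ("acme", "p3", "OR"),
    ("acme", "p1", "IL"),
]

# Unit-spaced: the window is the state itself.
per_state = spatial_violations(ENROLLMENTS, lambda s: s, max_promotions=2)

# Collective-spaced with a sliding window: the window is the enclosing region.
per_region = spatial_violations(ENROLLMENTS, STATE_TO_REGION.get, max_promotions=2)
```

With the same data, no single state exceeds two promotions, yet the West region does, which shows how the choice of evaluation window changes the outcome of an otherwise identical cardinality rule.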
Sliding-window and unit-spaced constraints are similar in that each is evaluated multiple times within the spatial region specified by the applicability bounds. The difference lies in the granularity of the window of evaluation. The granularity for unit-spaced constraints is always the smallest possible granularity meaningful for the data, i.e., the spatial granularity of the underlying data.
Relationship between Components of the Evaluation Window
A window of evaluation can be defined as a 3-tuple consisting of: Data Granularity (DG); Window size (WS); and Slide size (SS). The relationships between the elements of the 3-tuple highlight differences among the windows of evaluation (see Table 3).
Applicability bounds define when and where the constraint should be checked against the data. They bound the earliest begin and latest end points of evaluation windows. The applicability bounds are not the time at which the constraint was incorporated into the schema or database (that relates to the transaction-time aspect, while our focus is on valid time), nor the time at which it is actually fired (as a trigger in the DBMS). A constraint may be implemented today to check past and future periods. The time at which the database runs the constraint code (e.g., on June 24, 2008) is also different from the valid time of the data it is checked against. We might choose to check on June 24, 2008 against past data (i.e., data with a valid time in the past), or against data with a valid time in the future. Of course, we cannot check against data that does not yet exist in the system.
If not specified, constraint applicability bounds can logically assume the defaults noted in Table 4, which depend on the corresponding window of evaluation.
Incorporating Spatio-Temporal Semantics into the Conceptual Schema
We propose an annotation approach (borrowing from the ideas presented for ST-USM7) to incorporate windows of evaluation and applicability bounds into the modeled database schema. The annotations can be placed in the data dictionary. The basic syntax is:
- <Core-Constraint> OVER
- <Evaluation Window> WITHIN
- <Applicability Bounds>.
The evaluation window is further described as: <Temporal Window> // <Spatial Window>. Applicability bounds are similarly defined. Taking our example rule 1, we have a core constraint of a customer co-occurring with [min=0, max=2] promotions OVER month // state WITHIN (2008-01-01, 2008-12-31) // USA. For economy of article length, we do not provide more examples, but the extension is straightforward. The full syntax in Backus-Naur Form (BNF) and additional examples with greater complexity are also available.3
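As a rough illustration of how such an annotation could be split into its parts, the sketch below uses a regular expression. It is not the BNF grammar referenced above, it only handles the basic single-line form, and the names `ConstraintAnnotation` and `parse_annotation` are hypothetical.

```python
import re
from dataclasses import dataclass

# <Core-Constraint> OVER <Temporal Window> // <Spatial Window>
#                   WITHIN <Temporal Bounds> // <Spatial Bounds>
ANNOTATION = re.compile(
    r"^(?P<core>.+?)\s+OVER\s+(?P<t_window>.+?)\s*//\s*(?P<s_window>.+?)"
    r"\s+WITHIN\s+(?P<t_bounds>.+?)\s*//\s*(?P<s_bounds>.+?)\.?$"
)

@dataclass
class ConstraintAnnotation:
    core: str
    t_window: str   # temporal evaluation window
    s_window: str   # spatial evaluation window
    t_bounds: str   # temporal applicability bounds
    s_bounds: str   # spatial applicability bounds

def parse_annotation(text):
    """Split an annotation string into its five components."""
    match = ANNOTATION.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized annotation: {text!r}")
    return ConstraintAnnotation(**{k: v.strip() for k, v in match.groupdict().items()})

rule_1 = parse_annotation(
    "customer co-occurring with [min=0, max=2] promotions "
    "OVER month // state WITHIN (2008-01-01, 2008-12-31) // USA"
)
```

Keeping the windows and bounds as separate fields mirrors the decomposition in the text, so that a later step could map each field to a trigger or a check in application code.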
Comparison with Previous Approaches
Most spatio-temporal conceptual data models usually consider two temporal windows of evaluation: sequenced and lifetime, and a single spatial window of evaluation: all of space. Examples of such models include GeoOOA4 and STER.10 These models can only capture a part of the spatio-temporal semantics for rules 1 and 6. More expressive models like MADS5 and ST-USM7 also model unit-spaced constraints. So they can in addition capture part of the semantics for rules 3 and 4. None of the models consider applicability bounds, and that aspect of the semantics is completely new for conceptual modeling.
Thus, while powerful languages exist at the logical level to implement constraints, there has been little work done so far in analyzing the spatio-temporal semantics of constraints, and providing a means to specify them at the conceptual design phase.
Relational Design Considerations
The existence of a non-sequenced constraint implies the need to maintain the history of the affected data. Similarly, a unit-spaced or a collective-spaced constraint (with the exception of a constraint that applies “over all space relevant to the database”) requires the storage of associated spatial information. A DBMS with spatial support is highly recommended when creating triggers or program modules that enforce collective-spaced constraints, though if there are limited kinds of spatial evaluation windows and applicability bounds, the associated spatial data can be stored as attributes (e.g., a “Midwest region” label can be attached to relevant customers). Much temporal data can be managed using a conventional DBMS and generating history tables;8 however, the support is improved with a temporal database. For sequenced, unit-spaced, and sliding-window constraints, temporal and spatial aggregation needs to be performed when the temporal and spatial extent of the data being entered spans more than one evaluation window.
For ease of understanding, we illustrated the proposed concepts of evaluation windows and applicability bounds over time and space by means of a simplified example. There may be other more complex constraints, but the same semantic principles of decomposing based on evaluation windows and applicability bounds apply. To test whether our concepts were practical and expressive enough for a real-world application, we evaluated them using a case study at an Institutional Review Board (IRB) of a major public university. The IRB was chosen due to its having a large number of rules and constraints, the satisfaction of which was critical for regulatory enforcement and legal compliance. The results of our study showed that the concepts we proposed were robust enough to manage the semantics of the organization, and further enabled more than doubling the number of business constraints captured with a conceptual model.3 At the same time, an experiment conducted (with graduate MIS students who were familiar with database design and development) showed that there was an improvement in performance in eliciting rules when they were aware of the proposed spatio-temporal constraint semantics.
We feel that modeling constraints at the conceptual design stage allows for more accurate representation of semantics than current methods. The consequent visibility of modeled rules can lead to better validation of user requirements and reduced cost in error correction. Of course, many opportunities remain for future work. With the increasing adoption of a model-driven architecture (MDA) approach, the possibility of automating the translation of constraints into database triggers and application code is an exciting prospect that could boost IT productivity and facilitate easier change management. The prerequisites for this are an analysis of both the constraint-type semantics (for example, as done for set-based constraints6) and spatio-temporal semantics (as presented in this work). Combining the two, we can generate mapping rules that translate modeled constraints into application code. Also, since the ideas of evaluation windows and applicability bounds (and the accompanying annotation syntax) are easily extensible, additional dimensions of time (for example, transaction time) and space (including user-defined dimensions) can be added as applications require.