
An Integrated Approach to Enterprise Computing Architectures

Synthesizing an organization's computing infrastructure to support the spectrum of tasks performed by users in a geographically distributed organization.

The alignment of an organization’s information technology to its business strategies is a recurrent theme in IS research [1], and has featured prominently in recent surveys of critical issues for IS management [2]. Current corporate downsizing trends have had the effect of flattening organizations’ structures. A transformation of information systems has accompanied this organizational flattening. Several different architectures have evolved during the transition from the monolithic centralized systems of the past to the decentralized, distributed, client/server, and network-based computing architectures of the present. Despite their differences, many of these architectures share an important property—allocation of processing tasks and/or data across multiple computing platforms. In simple cases this might involve storing data or applications on a LAN server and retrieving them using a PC. In more complex scenarios, we encounter partitioning of databases and application programs, data migration, multiphase database updates, and so forth. The common thread in these scenarios is the use of cooperative computing to accomplish a single task.

The rapid growth of cooperative computing throughout the 1990s has transformed the IS function and its management in many organizations. The characteristics of this transformation frequently include a downsizing of systems away from mainframe environments to smaller hardware platforms, coupled with network-based approaches to information management. In other cases, there has been a growth in the size and sophistication of end-user-developed systems, or the “upscaling” of departmental or LAN-based computing, with LANs becoming the repositories for mission-critical corporate data. Computing problems that once were assigned to mainframe computers are now routinely assigned to desktop computing platforms. Price-performance ratios continue to improve dramatically over relatively short periods of time. The emergence of the Internet and the Web offers unprecedented opportunities as well as challenging management problems. In the midst of an ever-increasing set of technology choices, IS managers must still confront fundamental questions regarding the nature of underlying technology infrastructures and the application of rapidly shifting technologies to business decision-making.

The term “enterprise computing architecture” describes the set of computing platforms and data networking facilities that support an organization’s information needs. Once rather stable in nature, architectures are now subject to frequent revision as organizations seek to attain the best technological “fit.” This is no longer a simple task, given the increasing set of technological options, and it becomes an important issue for IS managers as reliance on information technology increases. Despite this, effective strategies for specifying an enterprise computing architecture are still lacking.

Architectures are the manifestation of an organization’s overall IS strategy. Technical integration is increasingly seen as a means to leverage the overall strategic objectives of a business. Appropriate architectures allow organizations to meet current as well as projected information needs, and to successfully adopt new information processing paradigms in a cost-effective manner. The benefits of coordinated architectures include: reduction of undesirable redundancy of system components, appropriate allocation of information processing functions to platforms, meaningful allocation of computing resources to organization locations, and the ability to share information resources across organizational entities at manageable cost.


The Need for an Integrated Approach

The specification of an appropriate enterprise computing architecture is a broad, multifaceted problem. Prior research in the area of enterprise computing architectures has generally addressed individual pieces of this problem separately. Upon determination of the requirements for a distributed application, a number of separate techniques can be employed to design an architecture to meet these requirements, including file allocation [7], network design [6], processor selection [3], process allocation [8], distributed system and database design [9], and so forth. Most of these techniques consider one facet of architecture design at a time and assume that the others are given—the traditional data distribution environment, for example, assumes that modules have been assigned to processors, that processors have been assigned to sites, and that the network design, in terms of physical links between sites, is known. This piecemeal approach is illustrated in Figure 1a. However, these problems are really interdependent, and freezing several dimensions in an effort to tackle one does not effectively address enterprise computing architecture needs. Furthermore, the techniques developed in earlier research were intended for environments in which application systems can be neatly partitioned and allocated to individual processors.

Unfortunately, these assumptions are proving untenable, particularly in the case of client/server computing, and an integrated methodology with real-world applicability is needed. Client/server and network computing paradigms no longer assume a predefined partitioning of modules and allocation to individual processors. Such changes mandate new and different approaches to designing an appropriate enterprise computing architecture. In these new environments, computing architectures must reflect and be driven by an organization’s information requirements. These requirements include specification of the enterprise’s geographical and organizational structures, assignment of business process responsibilities to individuals, and the placement of individuals at specific locations. The resulting process structure is then superimposed on the existing organization structure to derive the global enterprise requirements. This in turn drives the decisions about what software and data components are needed at each location, which can then be used to derive the hardware needed at the desktop and server levels. Knowledge about hardware and software decisions is in turn necessary for network design decisions. This approach, illustrated in Figure 1b, is more sequential in nature and considerably more tractable. While much more realistic, this integrated approach also presents a more complex problem. Integrated architecture specification problems tend to be NP-complete and are susceptible to computational intractability as the number of design variables increases. The proliferation of technology options for clients and servers, as well as the networks for linking them together, only exacerbates this. It would be practically impossible to include all conceivable permutations of computing hardware in any consideration of architectures. Consequently, complete enumeration techniques such as optimization are not applicable. As an alternative, we adopt an approach based on heuristic classification [5].
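To make the idea of heuristic classification concrete, the sketch below gives one possible reading of its three classic steps (data abstraction, heuristic match, and solution refinement) applied to architecture requirements. The feature names, rules, and solution classes are invented for illustration and are not taken from this article or from [5].

```python
def abstract(requirements):
    """Data abstraction: reduce raw requirements to a few qualitative features."""
    return {
        "dispersion": "high" if requirements["sites"] > 3 else "low",
        "workload": "query-heavy" if requirements["query_share"] > 0.6 else "update-heavy",
    }

def heuristic_match(features):
    """Heuristic match: associate the abstracted problem with a broad solution class."""
    if features["dispersion"] == "high" and features["workload"] == "update-heavy":
        return "localized data, thin-client applications"
    return "minimized data, fat-client applications"

def refine(solution_class, requirements):
    """Refinement: expand the class into concrete candidates for further modeling."""
    return [f"{solution_class}, {n} server site(s)" for n in (1, requirements["sites"])]

reqs = {"sites": 5, "query_share": 0.3}   # hypothetical enterprise profile
print(refine(heuristic_match(abstract(reqs)), reqs))
```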


Methodology

The details of our approach appear in Figure 2. Our technique allows the enterprise computing architecture designer to generate a consistent, technology-independent model of the architecture in the context of the requirements that it must satisfy. It is important to note that no underlying assumptions about existing technologies need to be made. A brief description of the essential properties of this approach follows.

Information requirements are provided by the designer in the form of applications and data resources needed to support current and future business processes. The notion of an application or data resource is sufficiently broad to encompass a multi-level scheme for systems, subsystems, or individual software modules, as well as databases, individual tables, collections of data objects, or application-specific data formats such as spreadsheet files. Consistent with the notion of a distributed multi-tiered architecture, end users may be considered individually or in a variety of organizational structures including groups (accounting, marketing) or sites (an office in one city).

Requirements provided by users are likely to be imprecise and specified at inconsistent levels of detail. For example, a requirement such as “every user should be able to use Microsoft Excel whenever they need to” is not given at the same level of detail as “Sue in the marketing department at the Milwaukee sales office needs to run stored query procedure ABC every Friday.” Consequently, the initial requirements must be put into a common format, which we term atomic-level requirements. Atomic-level requirements factor each system or subsystem down to the individual module level. In addition, consolidated data stores are factored down to the individual table or object level, and each site-level or group-level end-user requirement is factored down to the individual level. The goal is not to alter the requirements, but to standardize them unambiguously.
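As a hypothetical illustration of this standardization step, the sketch below factors coarse, group-level statements down to atomic-level requirements at the individual user and component level. The data model and sample requirements are invented; the article does not prescribe a particular representation.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class AtomicRequirement:
    user: str        # an individual end user
    component: str   # an individual module, table, or data file
    frequency: str   # e.g., "daily", "weekly"

# Coarse statements as they might be collected (names are hypothetical).
raw_requirements = [
    {"users": ["Sue", "Raj", "Lee"], "components": ["spreadsheet_pkg"], "frequency": "daily"},
    {"users": ["Sue"], "components": ["stored_query_ABC"], "frequency": "weekly"},
]

def to_atomic(raw):
    """Factor each group- or site-level statement down to one requirement per
    (individual user, individual component) pair, without altering its content."""
    atomic = set()
    for req in raw:
        for user, comp in product(req["users"], req["components"]):
            atomic.add(AtomicRequirement(user, comp, req["frequency"]))
    return atomic

for r in sorted(to_atomic(raw_requirements), key=lambda a: (a.user, a.component)):
    print(r)
```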

These standardized requirements are then considered in the context of various alternatives for resource allocation. The approach is general enough to accommodate traditional applications (including legacy and 3GL systems) as well as more contemporary applications (including 4GL or OO languages, packaged software, and so forth), running against a wide variety of data stores (including flat files, RDBMSs, and OODBMSs).

As an initial strategy, two basic alternative approaches to component allocation are considered—minimization and localization. In the minimization approach, program or data components are allocated to the fewest locations practical, with a view toward reducing redundancy and improving integrity. In the localization approach, program and data components are allocated as close as practical to their potential users, so as to improve performance. These approaches generate considerably different architectures. For example, if users at two sites require access to a component, the minimization approach would allocate a copy to only one of these sites, while the localization approach would in all probability assign it to both. The modeling technique allows for designer-supplied values, forced replication, sharing restrictions, and so forth, and the procedures supply default values if data is not provided for at least one instance of a component.
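A minimal sketch of the two baseline strategies follows, assuming only a mapping from components to the sites whose users need them. Designer-supplied overrides such as forced replication and sharing restrictions are omitted, and the single site chosen under minimization is picked arbitrarily here rather than by the procedure's actual criteria.

```python
# 'needs' maps each component to the set of sites whose users require it (hypothetical data).
needs = {
    "orders_db": {"Milwaukee", "Chicago"},
    "stored_query_ABC": {"Milwaukee"},
    "spreadsheet_pkg": {"Milwaukee", "Chicago", "Madison"},
}

def minimize(needs):
    """Minimization: allocate each component to the fewest locations practical
    (here, arbitrarily, a single site chosen alphabetically)."""
    return {comp: {min(sites)} for comp, sites in needs.items()}

def localize(needs):
    """Localization: allocate a copy of each component to every site whose
    users need it, trading redundancy for proximity."""
    return {comp: set(sites) for comp, sites in needs.items()}

print("minimized:", minimize(needs))
print("localized:", localize(needs))
```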

Based on the results from the two approaches, the effects of various application partitioning strategies are then considered. These are based on common, standardized definitions of centralized, thin client, fat client, and decentralized processing, as well as offering options for network-based applications and data. The interaction of component distribution and application partitioning alternatives results in several dozen options for overall component allocation, which define the initial set of modeling alternatives. These alternatives represent various points in the overall problem space, in the context of the designer’s requirements.
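The cross-product of allocation and partitioning choices can be sketched as follows. The option names follow the text, but the enumeration itself is illustrative; in the full procedure, designer preferences and per-component overrides expand this small set into the several dozen alternatives mentioned above.

```python
from itertools import product

allocation_strategies = ["minimization", "localization"]
partitioning_options = ["centralized", "thin client", "fat client",
                        "decentralized", "network-based"]

# Each (allocation, partitioning) pair defines one initial modeling alternative.
alternatives = [{"allocation": a, "partitioning": p}
                for a, p in product(allocation_strategies, partitioning_options)]

print(len(alternatives), "initial modeling alternatives")
```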

Each of these alternatives forms the basis for input to a simulation-based capacity planning tool. The capacity planning tool models the processing, memory, storage and data transmission requirements for an enterprise computing architecture under a frequency weighted set of user demands. We refer to the process as generic modeling, since there are no assumptions made about the actual technology components that will be used to provide the required capabilities. Each generic model is based on a potential allocation of resources and a set of user demands, which are derived from designer-supplied data; an example of a typical user profile is shown in Table 1.
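The following toy Monte Carlo sketch conveys the flavor of generic modeling: a frequency-weighted user profile drives demand on a single server node, with no assumptions about the hardware that will eventually supply the capacity. The task mix, weights, and capacity units are invented and are not those of Table 1; the final line contrasts the simulated peak with a naive worst-case estimate, a point taken up in the next paragraph.

```python
import random

# (activity, relative frequency, server work units per occurrence) -- invented values
USER_PROFILE = [
    ("idle",          0.50,  0),
    ("order_update",  0.25,  3),
    ("query_ABC",     0.20,  8),
    ("report_run",    0.05, 20),
]
N_USERS, N_TICKS = 40, 10_000

def simulate(seed=1):
    rng = random.Random(seed)
    _activities, weights, work = zip(*USER_PROFILE)
    loads = []
    for _ in range(N_TICKS):
        # in each time slice, every user performs one frequency-weighted activity
        demand = sum(rng.choices(work, weights=weights)[0] for _ in range(N_USERS))
        loads.append(demand)
    loads.sort()
    return loads[-1], loads[int(0.95 * N_TICKS)]   # peak and 95th-percentile load

peak, p95 = simulate()
print(f"worst case: {N_USERS * max(w for _, _, w in USER_PROFILE)}, "
      f"simulated peak: {peak}, 95th percentile: {p95}")
```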

Examination of individual, end-user-level needs is relatively straightforward once the data is collected and standardized. Typically, worst-case values are selected. This is consistent with the notion of an inevitable but nonetheless desirable excess client-level capacity in distributed environments [11]. The effects of the interactions of users, activities, and idle times at the server and network infrastructure levels pose a more elaborate problem. In practice there is often a temptation to yield to estimates based on worst cases, but this can result in server and network capacity estimates that are overly pessimistic. The simulation-based generic models provided in this approach offer a much more realistic alternative. While it can be argued that unit costs for excess capacity continue to spiral downward, cost is not likely to be the only criterion employed in an enterprise computing architecture decision. The simulation modeling process considers each profile of user demands in the context of the previously generated resource allocation options.

The simulation-based generic models may be compared on the basis of several criteria. These include the redundancy of components, the capacities required at client- and server-level nodes, and the intrasite and intersite communications bandwidths required for a particular generic model. Ideally, one of the models would be sufficiently appealing, but it seems equally probable that the designer might choose to explore the solution space around a given alternative or set of alternatives. In this case, the requirements or designer preference data may be modified as desired, and the process may be repeated. For an organization that has already implemented a particular architecture, this would serve as an opportunity to explore alternatives for system reengineering, or the addition of new locations or functions to an existing architecture.
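Since no single figure of merit exists, a comparison might simply tabulate each generic model against the criteria named above and leave the choice to the designer. The alternatives and figures below are invented for illustration; the capacity units are abstract.

```python
# Invented summary figures for two generic models.
generic_models = {
    "centralized / minimized": {"redundancy": 0.2, "server_units": 900,
                                "client_units": 300, "intersite_kbps": 512},
    "fat client / localized":  {"redundancy": 1.6, "server_units": 400,
                                "client_units": 700, "intersite_kbps": 128},
}

# Present the criteria side by side rather than ranking on a single score.
print(f"{'alternative':26} {'redund.':>8} {'server':>7} {'client':>7} {'intersite':>10}")
for name, m in generic_models.items():
    print(f"{name:26} {m['redundancy']:8.1f} {m['server_units']:7d} "
          f"{m['client_units']:7d} {m['intersite_kbps']:10d}")
```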

Promising alternatives are next mapped to generic architectures, which capture the architectural requirements in a technology- and implementation-independent format. One or more candidate architectures are selected for further analysis, and mapped to specific sets of technology components. This mapping produces a “clean-slate” set of technology components that would satisfy the requirements and deliver the capacities described in a particular generic architecture. These may then be evaluated on criteria such as acquisition and operating costs, total and excess capacities delivered, ease of migration from an existing infrastructure, and so forth. While the prototype discussed in the next section was implemented using a single technology base, alternative technology bases would allow organizations to make meaningful comparisons across a wide range of product categories or solution providers.
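A "clean-slate" mapping could be sketched as choosing, for each node in a generic architecture, the least expensive catalog entry that meets its required capacities. The catalog entries, capacity units, and prices below are hypothetical, not drawn from the prototype's technology base.

```python
# Hypothetical catalog: (model, cpu units, storage GB, acquisition cost in dollars).
CATALOG = [
    ("desktop-basic",   5,    4,  1_200),
    ("server-entry",   40,   50,  9_000),
    ("server-mid",    120,  200, 24_000),
    ("server-high",   400,  800, 70_000),
]

# Capacities required by two nodes of a generic architecture (invented figures).
generic_nodes = {
    "site server A":  {"cpu": 95, "storage": 120},
    "group server B": {"cpu": 30, "storage": 40},
}

def map_to_technology(nodes, catalog):
    """Pick the least expensive catalog entry that meets each node's capacities."""
    plan, total_cost = {}, 0
    for name, need in nodes.items():
        adequate = [c for c in catalog
                    if c[1] >= need["cpu"] and c[2] >= need["storage"]]
        model, _, _, cost = min(adequate, key=lambda c: c[3])
        plan[name] = model
        total_cost += cost
    return plan, total_cost

print(map_to_technology(generic_nodes, CATALOG))
```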

There is no single quantitative measure of the overall appeal of an architecture. Therefore, the designer of an enterprise computing architecture must ultimately decide which, if any, of the candidates is appropriate. Once again, it seems probable that any recommended solution might not be exactly what the designer had in mind, but would represent a good starting point in the overall solution space. The solution space around this point can then be explored by modification of requirements data, alternative resource allocations, extended simulation study, or substitution of alternate technology base components to yield a preferred solution.


Implementation

The methodology described in this article has been implemented in a prototype decision support system called The Information Architect. The prototype is implemented in Visual Basic, and is described in detail in [10]. A few representative examples are included in this section for illustration. Figure 3 shows a typical requirements data capture screen. Currently, relative complexities of the interface, processing, and data manipulation are specified using an ordinal scale. This can be adapted to a ratio scale (for example, 0–100) to provide greater flexibility in designer input, if warranted.

In this example, the procedure for describing a software component is shown. The designer can select from predefined profiles for most components and modify the profile data as necessary. For an organization incorporating existing systems and databases as well as proposed components, much of the data would be readily available from existing system documentation or standard utility packages.

The next example shows the designer’s view of the simulation modeling facility (see Figure 4). From this point the designer can choose which alternatives to model and the extent of the simulation study, or visually inspect the input data reflecting the combinations of strategic preferences, allocation options, and partitioning options embodied in the requirements data.

Finally, the designer may compare results from the simulation modeling process both before and after mapping the generic architectures. Figure 5 shows a typical comparison of alternatives before considering the technologies needed to implement them.


Application and Results

The methodology and the prototype implementation were tested by applying them to two real-world problem scenarios. The first case involved a medium-sized manufacturing and distribution company in the midwestern U.S. The second case dealt with a regional processing facility for a large nonprofit organization, also located in the midwestern U.S. The two organizations are of similar size in terms of the total number of users their information architectures must support. Beyond that, there are numerous differences that were ultimately reflected in the Information Architect’s recommendations for appropriate architecture configurations. Table 2 presents a comparison of some of the relevant attributes of the organizations.

Case I considered an organization that is widely dispersed geographically, whereas Case II considered one that is highly centralized, both physically and procedurally. Case I also exhibited a somewhat uniform distribution of many functions among multiple sites. In both cases, data collection was accomplished by a combination of reviewing current system documentation and interviewing IT management and potential solution providers as appropriate. An anticipated outcome of using this technique was the discipline and structure imposed on the set of formal system specifications; this was observed in both cases. At the request of both participating organizations, the initial technology base was constrained to IBM PC, RS/6000, or AS/400 servers, and generic PC-class desktop systems. Intersite communications costs were based on nondiscounted rates obtained from a regional telephone service provider. Incidental overhead factors such as internal wiring were not included, as they would be essentially common to any of the alternatives being considered.

For Case I, end-user-level system costs are invariable across all alternatives, largely due to the overhead required to support personal productivity applications (spreadsheets, word processing, and so forth). This is reflected in the redundancy coefficients1 for program components, which range from approximately 22.6 in a centralized (minimized) scenario to 33.0 in a fat-client (localized) scenario. Redundancy for data components, including forced replication of data, ranged from approximately 1.5 in the minimized scenarios to 1.7 in the localized scenarios.
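Using the definition in the footnote, the redundancy coefficient is the average number of allocated instances per distinct component, minus one. A minimal sketch, with invented component counts:

```python
def redundancy_coefficient(instance_counts):
    """instance_counts: number of allocated copies of each distinct component."""
    return sum(instance_counts) / len(instance_counts) - 1

print(redundancy_coefficient([1, 1, 1]))     # 0.0  -- no duplication
print(redundancy_coefficient([2, 2, 2]))     # 1.0  -- two instances of each element
print(redundancy_coefficient([24, 24, 23]))  # ~22.7 -- heavily replicated components
```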

Table 3 illustrates the projected acquisition and operating costs for various modeling scenarios in Case I. Though the total projected cost of acquisition and three-year operations for any of these proposed architectures varies by approximately 20%, the differences are observed exclusively in the server platforms and communications costs. The effects of component allocation strategies are much greater than the effects of application partitioning. This reflects distance-based costs of data transmission required for concurrency of multiple instances of data components.

In Case I, there was little variation in the recommended configurations as a function of software partitioning; in Case II, the effects were more noticeable. An important difference between the cases is the type of applications each organization’s information architecture must support. The typical application for Case I might best be described as a low data volume update transaction; for Case II, a more appropriate description would be a high data volume query.

Case II exhibited more variation in the recommended architectures across the modeling scenarios considered. Again, there was little difference in the user-level recommendations, for the same reason as in Case I: the desire to support personal productivity applications on local machines for each end user. However, the close geographical proximity of the sites reduced or eliminated much of the distance-related cost observed in Case I. The cost differences across the recommendations are not dramatic, but the configurations show much more variation. In architectures appropriate for decentralized or “fat client” processing, several workgroup-level servers were not required (data components were located at the site rather than the group level in these scenarios). In these cases, the recommended architecture took advantage of excess processor capacity at the user level by allocating some application components there.

In Case I, several recommended allocations of data resources differed considerably from the current implementation. These differences were prominent in the minimal allocation models. Closer inspection of the modeling data revealed that there were more potential users of these data resources at one of the remote sites than at the home office, and the recommended allocations were entirely logical. Interestingly, neither case study exhibited significant differences between architectures that would support the classic “thin” and “fat” client partitioning strategies. It is unclear whether this reflects the nature of the underlying requirements for these particular organizations, or whether the consequences of application partitioning are more closely tied to the size of the organization. At the very least, there is a suggestion that the traditional cost-savings justification for cooperative processing models may be weaker than we had anticipated [4]. Larger organizations may well exhibit greater differences of scale in outcomes across these strategies.

The issue of granularity of the proposed solution options also needs to be considered. As an example, processor requirements for a given server node might vary by 100% to 200% across different modeling scenarios, yet all outcomes might be supported by the same server. The Information Architect was able to illustrate the consequences of alternative design and allocation strategies, as well as their interactions, in a manner that was useful to the organizations as they considered implementing new information architectures. IS managers in both organizations reported that the process produced useful insights into their architectural decision-making. The organization in Case I is actively implementing server-level portions of one of the recommended architectures, but it is still too soon to report results other than a subjective finding of satisfaction with the outcomes.


Conclusion

Procedures for determining appropriate enterprise computing architectures continue to feature prominently on lists of IS management concerns. The techniques presented in this article allow an organization to capture, formalize, and translate information requirements into viable alternative configurations for an enterprise computing architecture. In two separate case studies, the methodology was effective in helping organizations assess the consequences of alternative resource allocation and application design strategies on the architectures needed to support them. The techniques presented here allow an organization to evaluate multiple alternatives for an enterprise computing architecture simultaneously. The evaluation framework presented in this study employed criteria of hardware cost, operating cost, and redundancy, but the methodology is easily extended to include other criteria. Prior approaches have typically produced single designs and offered little if any basis for evaluation or comparison.

The procedure also permits the derivation of enterprise computing architectures that reflect specific design objectives, such as the distribution of data stores and application programs within and among sites, the extent of application partitioning, etc. This offers the potential to effectively manage the architecture design by putting the designer in control of specifying preferences and objectives. While this poses a risk that some alternatives may be excluded when specifying preferences, the benefits of allowing the designer to specify and manipulate preferences clearly outweigh that risk.

This research makes an important methodological contribution by demonstrating that an integrated approach combining multiple techniques is possible, useful, and practical. Simulation modeling, rule-based reasoning, and heuristic classification have all been used to reduce otherwise intractable design problems to manageable proportions. This research combines the strengths of each of these techniques to facilitate consideration of problems of broader scope than before. The use of simulation-based procedures for estimating server capacities is also appealing. While more exact techniques such as queueing theory are available, they tend to exhibit problems of tractability when applied to real-world problems. A simulation-based approach provides more realistic and cost-effective estimation of server capacity requirements. Heuristic-based approaches have also enjoyed considerable success in the literature, but most often in the context of narrow subsets of larger problem domains. We have suggested, and offered evidence to support, the notion that these approaches are also applicable to larger domains. The methodology is flexible and adaptable enough to permit consideration of a variety of problem scenarios beyond the scope of the cases presented.

The growth of Internet and intranet applications raises considerable interest in techniques to plan for their successful adoption and exploitation. The techniques developed and presented here are capable of accommodating these applications. The modular design of interchangeable components in the Information Architect also facilitates meaningful comparison of alternatives across multiple sets of technology offerings. Given the nature of the problem addressed, this work offers unique opportunities for collaboration among organizations, researchers, and technology providers.


Figures

F1 Figure 1. Traditional vs. integrated architecture design methodologies.

F2 Figure 2. Integrated design methodology.

F3 Figure 3. Requirements data entry example.

F4 Figure 4. Simulation modeling interface.

F5 Figure 5. Generic modeling summary statistics.


Tables

T1 Table 1. Sample end-user task and idle-time weighting.

T2 Table 2. Overview of case studies.

T3 Table 3. Cost comparisons by scenario: Case one.



References

    1. Boynton, A.C., Zmud, R.W., and Jacobs, G.C. The influence of IT management practice on IT use in large organizations. MIS Q. 18, 3 (Sept. 1994), 299–318.

    2. Brancheau, J., Janz, B., and Wetherbe, J. Key issues in information systems management: 1994–95 SIM Delphi Results. MIS Q. 20, 2 (June 1996), 225–242.

    3. Burd, S. and Kassicieh, S. Decision support for supercomputer acquisition. Operations Research 39, 3 (May–June 1991), 366–377.

    4. Burris, P. and Christiansen, C. Cost-to-Use of Midrange and PC LAN Systems in the Networked Enterprise. IDC#7439, International Data Corp., Framingham, MA, 1993.

    5. Clancey, W.J. Heuristic classification. Artificial Intelligence 27, 3 (Dec. 1985), 289–350.

    6. Dupuy, A., Schwartz, J., Yemini, Y., and Bacon, D. NEST: A network simulation and prototyping testbed. Commun. ACM 33, 10 (Oct. 1990), 63–74.

    7. Gavish, B. and Pirkul, H. Computer and database location in distributed computer systems. IEEE Trans. Computers C-35, 7 (July 1986), 583–590.

    8. Houstis, C.E. Module allocation of real-time applications to distributed systems. IEEE Trans. Software Engineering SE-16, 7 (July 1990), 699–709.

    9. Jain, H.K. A comprehensive model for the design of distributed computer systems. IEEE Trans. Software Engineering SE-13, 10 (Oct. 1987), 1092–1104.

    10. Nezlek, G.S. Architectures for cooperative computing: A knowledge-based approach. Ph.D. dissertation, University of Wisconsin-Milwaukee, June 1997.

    11. Satyanarayanan, M. The influence of scale on distributed file system design. IEEE Trans. Software Engineering SE-18, 1 (Jan. 1992), 1–8.


    1Redundancy coefficients are scaled so that a coefficient of zero results if there are no duplicated components, a coefficient of 1.00 would reflect an average of two instances of each element, and so forth.
