On the afternoon of May 6, 2010, the U.S. equity markets experienced an extraordinary upheaval. Over approximately 10 minutes, the Dow Jones Industrial Average dropped more than 600 points, representing the disappearance of approximately $800 billion of market value. The share price of several blue-chip multinational companies fluctuated dramatically; shares that had been at tens of dollars plummeted to a penny in some cases and rocketed to values over $100,000 per share in others. As suddenly as this market downturn occurred, it reversed, so over the next few minutes most of the loss was recovered and share prices returned to levels close to what they had been before the crash.
This event came to be known as the "Flash Crash," and, in the inquiry report published six months later,7 the trigger event was identified as a single block sale of $4.1 billion of futures contracts executed with uncommon urgency on behalf of a fund-management company. That sale began a complex pattern of interactions between the high-frequency algorithmic trading systems (algos) that buy and sell blocks of financial instruments on incredibly short timescales.
A software bug did not cause the Flash Crash; rather, the interactions of independently managed software systems created conditions unforeseen (probably unforeseeable) by the owners and developers of the trading systems. Within seconds, the result was a failure in the broader socio-technical markets that increasingly rely on the algos (see the sidebar "Socio-Technical Systems").
Society depends on complex IT systems created by integrating and orchestrating independently managed systems. The incredible increase in scale and complexity in them over the past decade means new software-engineering techniques are needed to help us cope with their inherent complexity. Here, we explain the principal reasons today's software-engineering methods and tools do not scale, proposing a research and education agenda to help address the inherent problems of large-scale complex IT systems, or LSCITS, engineering.
The key characteristic of these systems is that they are assembled from other systems that are independently controlled and managed. While there is increasing awareness in the software-engineering community of related issues,10 the most relevant background work comes from systems engineering. Systems engineering focuses on developing systems as a whole, as defined by the International Council on Systems Engineering (http://www.incose.org/): "Systems engineering integrates all the disciplines and specialty groups into a team effort forming a structured development process that proceeds from concept to production to operation. Systems engineering considers both the business and the technical needs of all customers with the goal of providing a quality product that meets the user needs."
Systems engineering emerged to take a systemwide perspective on complex engineered systems involving structures and electrical and mechanical systems. Almost all systems today are software-intensive, and systems engineers address the challenge of constructing ultra-large-scale software systems.17 The most relevant aspect of systems engineering is work on "system of systems," or SoS,12 about which Maier said the distinction between SoS and complex monolithic systems is that SoS elements are operationally and managerially independent. Characterizing SoS, he covered a range of systems, from directed (developed for a particular purpose) to virtual (lacking a central management authority or centrally agreed purpose). LSCITS is a type of SoS in which the elements are owned and managed by different organizations. In this classification, the collection of systems that led to the Flash Crash (an LSCITS) would be called a "virtual system of systems." However, since Maier's article was published in 1998, the word "virtual" has generally taken on a different meaning, as in virtual machines; consequently, we propose an alternative term that we find more descriptive: "coalition of systems."
The systems in a coalition of systems work together, sometimes reluctantly, as doing so is in their mutual interest. Coalitions of systems are not explicitly designed but come into existence as different member systems interact according to agreed-upon protocols. Like political coalitions, there might even be hostility between various members, and members enter and leave according to their interpretation of their own best interests.
The interacting algos that led to the Flash Crash represent an example of a coalition of systems, serving the purposes of their owners and cooperating only because they have to. The owners of the individual systems were competing finance companies that were often mutually hostile. Each system jealously guarded its own information and could change without consulting any other system.
Dynamic coalitions of software-intensive systems are a challenge for software engineering. Designing dependability into the coalition is not possible, as there is no overall design authority, nor is it possible to centrally control the behavior of individual systems. The systems in the coalition can change unpredictably or be completely replaced, and the organizations running them might themselves cease to exist. Coalition "design" involves the protocols for communications, and each organization using the coalition orchestrates the constituent systems its own way. However, the designers and managers of each individual system must consider how to make it robust enough to ensure their organizations are not threatened by failure or undesirable behavior elsewhere in the coalition.
Complexity stems from the number and type of relationships between the system's components and between the system and its environment. If a relatively small number of relationships exist between system components and they change relatively slowly over time, then engineers can develop deterministic models of the system and make predictions concerning its properties.
However, when the elements in a system involve many dynamic relationships, complexity is inevitable. Complex systems are nondeterministic, and system characteristics cannot be predicted by analyzing the system's constituents. Such characteristics emerge when the whole system is put to use and change over time, depending on how it is used and on the state of its external environment.
Dynamic relationships include those between system elements and the system's environment that change. For example, a trust relationship is a dynamic relationship; initially, component A might not trust component B, so, following some interchange, A checks that B has performed as expected. Over time, these checks may be reduced in scope as A's trust in B increases. However, some failure in B may profoundly influence that trust, and, after the failure, even more stringent checks might be introduced.
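The trust example can be made concrete with a small sketch. It is an illustration only, not a mechanism taken from the article: the class name `TrustTracker`, the decay factor, the minimum checking rate, and the "probation" period after a failure are all assumptions chosen to show how a relationship between two components can change with their interaction history.

```python
import random
from dataclasses import dataclass


@dataclass
class TrustTracker:
    """Hypothetical model of how often component A verifies component B."""

    check_rate: float = 1.0    # probability that A checks a given response from B
    min_rate: float = 0.1      # assumed floor: A never stops checking entirely
    decay: float = 0.8         # assumed relaxation factor after each passed check
    probation: int = 0         # passed checks still required before relaxing again
    probation_length: int = 5  # assumed length of the stricter period after a failure

    def should_check(self, rng: random.Random) -> bool:
        """Decide whether A verifies B's next response."""
        return rng.random() < self.check_rate

    def record_check(self, passed: bool) -> None:
        """Update A's trust in B after a check has been carried out."""
        if not passed:
            # A failure restores full checking and starts a probation period.
            self.check_rate = 1.0
            self.probation = self.probation_length
        elif self.probation > 0:
            # During probation, passed checks do not yet reduce scrutiny.
            self.probation -= 1
        else:
            # Trust grows: checks become less frequent, down to the floor.
            self.check_rate = max(self.min_rate, self.check_rate * self.decay)
```

Even in this toy form, the checking behavior of A at any moment depends on the whole history of its interactions with B, which is why such relationships are hard to analyze at design time.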
Complexity stemming from the dynamic relationships between elements in a system depends on the existence and nature of these relationships. Engineers cannot analyze this inherent complexity during system development, as it depends on the system's dynamic operating environment. Coalitions of systems in which elements are large software systems are always inherently complex. The relationships between the elements of the coalition change because they are not independent of how the systems are used or of the nature of their operating environments. Consequently, the nonfunctional (often even the functional) behavior of coalitions of systems is emergent and impossible to predict completely.
Even when the relationships between system elements are simpler, relatively static, and, in principle, understandable, there may be so many elements and relationships that understanding them is practically impossible. Such complexity is called "epistemic complexity" due to our lack of knowledge of the system rather than some inherent system characteristics.16 For example, it may be possible in principle to deduce the traceability relationships between requirements and design, but, if appropriate tools are not available, doing so may be practically impossible.
If you do not know enough about a system's components and their relationships, you cannot make predictions about overall behavior, even if the system lacks dynamic relationships between its elements. Epistemic complexity increases with system size; as ever-larger systems are built, they are inevitably more difficult to understand and their behavior and properties more difficult to predict. This distinction between inherent and epistemic complexity is important. As discussed in the following section, it is the primary reason new approaches to software engineering are needed.
In some respects, software engineering has been incredibly successful. Compared to the systems built in the 1970s and 1980s, modern software is much larger, more complex, more reliable, and often developed more quickly. Software products deliver astonishing functionality at relatively low cost.
Software engineering has focused on reducing and managing epistemic complexity, so, where inherent complexity is relatively limited and a single organization controls all system elements, software engineering is highly effective. However, for coalitions of systems with a high degree of inherent complexity, today's software engineering techniques are inadequate.
This is reflected in the failure that is all too common in large government-funded projects. The software may be delivered late, be more expensive to develop than anticipated, and be inadequate for the needs of its users. An example of such a project was the attempt, from 2000 to 2010, to automate U.K. health records; the project was ultimately abandoned at a cost estimated at $5 billion to $10 billion.19
The fundamental reason today's software engineering cannot effectively manage inherent complexity is that its basis is in developing individual programs rather than in interacting systems. The consequence is that software-engineering methods are unsuitable for building LSCITS. To appreciate why, we need to examine the essential divide-and-conquer reductionist assumption that is the basis of all modern engineering.
Reductionism is a philosophical position that a complex system is no more than the sum of its parts, and that an account of the overall system can be reduced to accounts of individual constituents. From an engineering perspective, this means systems engineers must be able to design a system so it is composed of discrete smaller parts and interfaces allowing the parts to work together. A systems engineer then builds the system elements and integrates them to create the desired overall system.
Researchers generally adopt this reductionist assumption, and their work concerns finding better ways to decompose problems or systems (such as software architecture), better ways to create the parts of the system (such as object-oriented techniques), or better ways to do system integration (such as test-first development). Underlying all software-engineering methods and techniques (see Figure 1) are three reductionist assumptions:
System owners control system development. A reductionist perspective takes the view that an ultimate controller has the authority to take decisions about a system and is therefore able to enforce decisions on, say, how components interact. However, when systems consist of independently owned and managed elements, there is no such owner or controller and no central authority to take or enforce design decisions;
Decisions are rational, driven by technical criteria. Decision making in organizations is profoundly influenced by political considerations, with actors striving to maintain or improve their current positions to avoid losing face. Technical considerations are rarely the most significant factor in large-system decision making; and
The problem is definable, and system boundaries are clear. The nature of "wicked problems"15 is that the "problem" is constantly changing, depending on the perceptions and status of stakeholders. As stakeholder positions change, the boundaries are likewise redefined.
However, for coalitions of systems, these assumptions never hold true, and many software project "failures," where software is delivered late and/or over budget, are a consequence of adherence to the reductionist view. To help address inherent complexity, software engineering must look toward the systems, people, and organizations that make up a software system's environment. We need to represent, analyze, model, and simulate potential operational environments for coalitions of systems to help us understand and manage, so far as possible, the complex relationships in the coalition.
Since 2006, initiatives in the U.S. and in Europe have sought to address engineering large coalitions of systems. In the U.S., a report by the influential Software Engineering Institute at Carnegie Mellon University (http://www.sei.cmu.edu/) on ultra-large-scale systems (ULSS)13 triggered research leading to creation of the Center for Ultra-Large Scale Software-Intensive Systems, or ULSSIS (http://ulssis.cs.virginia.edu/ULSSIS), a research consortium involving the University of Virginia, Michigan State University, Vanderbilt University, and the University of California, San Diego. In the U.K., the comparable LSCITS Initiative addresses problems of inherent and epistemic complexity in LSCITS, while Hillary Sillitto, a senior systems architect at Thales Land & Joint Systems U.K., has proposed ULSS design principles.17
Northrop et al.13 made the point that developing ultra-large-scale systems needs to go beyond incremental improvements to current methods, identifying seven important research areas: human interaction, computational emergence, design, computational engineering, adaptive system infrastructure, adaptable and predictable system quality and policy, and acquisition and management. The SEI ULSS report suggested it is essential to deploy expertise from a range of disciplines to address these challenges.
We agree the research required is interdisciplinary and that incremental improvement in existing techniques is unable to address the long-term software-engineering challenges of ultra-large-scale systems engineering. However, a weakness of the SEI report was its failure to set out a roadmap outlining how large-scale systems engineering can get from where it is today to the research it proposed.
Software engineers worldwide creating large complex software systems require more immediate, perhaps more incremental, research, driven by the practical problems of complex IT systems engineering. The pragmatic proposals we outline here begin to address some of them, aiming for medium-term, as well as a longer-term, impact on LSCITS engineering.
The research topics we propose here might be viewed as part of the roadmap that could lead us from current practice to LSCITS engineering. We see them as a bridge between the short- and medium-term imperative to improve our ability to create coalitions of systems and the longer-term vision set out in the SEI ULSS report.
Developing coalitions of systems involves engineering individual systems to work within a coalition, as well as orchestrating and configuring the coalition to meet organizational needs. Based on the ideas in the SEI ULSS report and on our own experience in the U.K. LSCITS Initiative, we have identified 10 questions that can help define a research agenda for future LSCITS software engineering:
How can interactions between independent systems be modeled and simulated? To help understand and manage coalitions of systems, LSCITS engineers need dynamic models that are updated in real time with information from the system itself. These models are needed to help make what-if assessments of the consequences of system-change options. This requires new performance- and failure-modeling techniques in which the models adapt automatically in response to system-monitoring data. We do not suggest simulations can be complete or predict all possible problems. However, other engineering disciplines (such as civil and aeronautical engineering) have benefited enormously from simulation, and comparable benefits could be achieved for software engineering.
How can coalitions of systems be monitored? And what are the warning signs problems produce? In the run-up to the Flash Crash, no warning signs indicated the market was tending toward an unstable state. To help avoid transition to an unstable system state, systems engineers need to know which indicators provide information about the state of the coalition of systems and how those indicators may be used both to provide early warnings of system problems and, if necessary, to switch to safe-mode operating conditions that prevent the possibility of damage.
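To illustrate what an early-warning indicator with a safe-mode switch might look like, the following is a minimal sketch under stated assumptions. It watches a single numeric indicator and compares each new reading with a rolling window of recent readings. The class name `CoalitionMonitor`, the window size, and the warning and trip thresholds are hypothetical; identifying which indicators are actually meaningful for a coalition of systems is precisely the open research question.

```python
from collections import deque
from statistics import mean, pstdev


class CoalitionMonitor:
    """Hypothetical early-warning monitor for one numeric health indicator."""

    def __init__(self, window: int = 20, min_samples: int = 5,
                 warn_sigma: float = 2.0, trip_sigma: float = 4.0):
        self.history = deque(maxlen=window)  # recent readings accepted as normal
        self.min_samples = min_samples       # readings needed before judging
        self.warn_sigma = warn_sigma         # assumed early-warning threshold
        self.trip_sigma = trip_sigma         # assumed safe-mode threshold
        self.safe_mode = False

    def observe(self, value: float) -> str:
        """Classify a new reading as 'ok', 'warning', or 'safe_mode'."""
        if self.safe_mode:
            # Stay in safe mode until an operator explicitly resets.
            return "safe_mode"
        status = "ok"
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            # Guard against a zero deviation when all readings are identical.
            sigma = pstdev(self.history) or 1e-9
            deviation = abs(value - mu) / sigma
            if deviation >= self.trip_sigma:
                self.safe_mode = True
                return "safe_mode"
            if deviation >= self.warn_sigma:
                status = "warning"
        self.history.append(value)
        return status

    def reset(self) -> None:
        """Operator action: leave safe mode after investigating the cause."""
        self.safe_mode = False
```

Leaving safe mode is deliberately a manual step in this sketch, on the assumption that a human operator decides when normal operation can resume.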
How can systems be designed to recover from failure? A fundamental principle of software engineering is that systems should be built so they do not fail, leading to development of methods and tools based on fault avoidance, fault detection, and fault tolerance. However, as coalitions of systems are constructed with independently managed elements and negotiated requirements, avoiding failure is increasingly impractical. Indeed, what seems to be a failure for some users may have no effect on others. Because some failures are ambiguous, automated systems cannot cope on their own. Human operators must use information from the system, intervening to enable it to recover from failure. This means understanding the socio-technical processes of failure recovery, the support the operators need, and how to design coalition members to be "good citizens" able to support failure recovery.
How can socio-technical factors be integrated into systems and software-engineering methods? Software- and systems-engineering methods support development of technical systems and generally consider human, social, and organizational issues to be outside the system's boundary. However, such nontechnical factors significantly affect development, integration, and operation of coalitions of systems. Though a considerable body of work covers socio-technical systems, it has not been industrialized or made accessible to practitioners. Baxter and Sommerville2 surveyed this work and proposed a route to industrial-scale use of socio-technical methods. However, much more research and experience is required before socio-technical analyses are used routinely for complex systems engineering.
To what extent can coalitions of systems be self-managing? Needed is research into self-management so systems are able to detect changes in both their operation and operational environment and dynamically reconfigure themselves to cope with the changes. The danger is that reconfiguration will create further complexity, so a key requirement is for the techniques to operate in a safe, predictable, auditable way ensuring self-management does not conflict with "design for recovery."
How can organizations manage complex, dynamically changing system configurations? Coalitions of systems will be constructed through orchestration and configuration, and desired system configurations will change dynamically in response to load, indicators of system health, unavailability of components, and system-health warnings. Ways are needed to support construction by configuration, managing configuration changes and recording changes, including automated changes from the self-management system, in real time, so an audit trail includes the configuration of the coalition at any point in time.
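One simple way to picture such an audit trail is an append-only log of configuration changes from which the configuration at any past moment can be replayed. The sketch below is illustrative only: the names `ConfigChange` and `ConfigAuditTrail`, the flat key/value view of a configuration, and the convention that a value of `None` means "removed" are all assumptions, not part of any existing tool.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class ConfigChange:
    """One recorded change; `source` separates human from automated changes."""

    timestamp: float
    key: str
    value: Any   # assumption: None means the key was removed
    source: str  # e.g. "operator" or "self-management"


@dataclass
class ConfigAuditTrail:
    """Hypothetical append-only log of configuration changes."""

    changes: List[ConfigChange] = field(default_factory=list)

    def record(self, timestamp: float, key: str, value: Any, source: str) -> None:
        """Append a change; the log must stay in time order."""
        if self.changes and timestamp < self.changes[-1].timestamp:
            raise ValueError("changes must be recorded in time order")
        self.changes.append(ConfigChange(timestamp, key, value, source))

    def config_at(self, timestamp: float) -> Dict[str, Any]:
        """Replay the log to rebuild the configuration at a point in time."""
        snapshot: Dict[str, Any] = {}
        for change in self.changes:
            if change.timestamp > timestamp:
                break
            if change.value is None:
                snapshot.pop(change.key, None)
            else:
                snapshot[change.key] = change.value
        return snapshot
```

Because automated changes are recorded with their own `source`, an investigation after a failure could distinguish what operators changed from what a self-management component changed.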
How should the agile engineering of coalitions of systems be supported? The business environment changes quickly in response to economic circumstances, competition, and business reorganization. Likewise, coalitions of systems must be able to change quickly to reflect current business needs. A model of system change that relies on lengthy processes of requirements analysis and approval does not work. Agile methods of programming have been successful for small- to medium-size systems where the dominant activity is systems development. For large complex systems, development processes are often dominated by coordination activities involving multiple stakeholders and engineers who are not colocated. How can agile approaches be effective for "systems development in the large" to support multi-organization global systems development?
How should coalitions of systems be regulated and certified? Many such coalitions represent critical systems, failure of which could threaten individuals, organizations, and national economies. They may have to be certified by regulators checking that, as far as possible, they do not pose a threat to their operators or to the wider systems environment. But certification is expensive. For some safety-critical systems, the cost of certification can exceed the cost of development, and certification costs will increase as systems become larger and more complex. Though certification as practiced today is almost certainly impossible for coalitions of systems, research is urgently needed into incremental and evolutionary certification so our ability to deploy critical complex systems is not curtailed by certification requirements. This issue is social, as well as technical, as societies decide what level of certification is socially and legally acceptable.
How can systems undergo "probabilistic verification"? Today's techniques of system testing and more formal analysis are based on the assumption that a system involves a definitive specification and that behavior deviating from it is recognized. Coalitions of systems have no such specification nor is system behavior guaranteed to be deterministic. The key verification issue will not be whether the system is correct but the probability that it satisfies essential properties (such as safety) that take into account its probabilistic, real-time, nondeterministic behavior.8,11
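The shift from "is it correct?" to "how probable is it that a property holds?" can be illustrated with a Monte Carlo estimate over simulated runs. This is a deliberately simple stand-in, not the probabilistic model checking used in the cited work. The toy model (`toy_coalition_run`), its failure probability, and the tolerance of one failed member are assumptions made purely for illustration.

```python
import math
import random
from typing import Callable


def estimate_property_probability(
    simulate_run: Callable[[random.Random], bool],
    runs: int = 10_000,
    seed: int = 0,
) -> float:
    """Estimate P(property holds) as the fraction of simulated runs that satisfy it."""
    rng = random.Random(seed)
    satisfied = sum(simulate_run(rng) for _ in range(runs))
    return satisfied / runs


def runs_needed(epsilon: float, delta: float) -> int:
    """Runs sufficient, by Hoeffding's inequality, for the estimate to be
    within `epsilon` of the true probability with confidence 1 - delta."""
    return math.ceil(math.log(2 / delta) / (2 * epsilon ** 2))


def toy_coalition_run(rng: random.Random, members: int = 5,
                      failure_prob: float = 0.02, tolerated: int = 1) -> bool:
    """Toy model (an assumption): each member fails independently, and the
    property holds if at most `tolerated` members fail during a run."""
    failures = sum(rng.random() < failure_prob for _ in range(members))
    return failures <= tolerated


if __name__ == "__main__":
    estimate = estimate_property_probability(toy_coalition_run)
    print(f"estimated probability that the property holds: {estimate:.4f}")
    print(f"runs for +/-0.01 at 95% confidence: {runs_needed(0.01, 0.05)}")
```

For this toy model the exact answer can be computed by hand (about 0.996), which makes it a useful sanity check on the estimator. For a real coalition no such closed form exists, and building a trustworthy simulation model is the hard part.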
How should shared knowledge in a coalition of systems be represented? We assume the systems in a coalition interact through service interfaces so the system has no overarching controller. Information is encoded in a standards-based representation. The key problem will not be compatibility but understanding what the information exchange really means. This is addressed today on a system-by-system basis through negotiation between system owners to clarify the meaning of shared information. However, if dynamic coalitions are allowed, with systems entering and leaving the coalition, negotiation is not practical. The key is developing a means of sharing the meaning of information, perhaps through ontologies like those proposed by Antoniou and van Harmelen1 involving the semantic Web.
A major problem researchers must address is lack of knowledge of what happens in real systems. High-profile failures (such as the Flash Crash) lead to inquiries, but more is needed about the practical difficulties faced by developers and operators of coalitions of systems and how to address them as they arise. New ideas, tools, and methods must be supported by long-term empirical studies of the systems and their development processes to provide a solid information base for research and innovation.
The U.K. LSCITS Initiative5 addresses some of them, working with partners from the computer, financial services, and health-care industries to develop an understanding of the fundamental systems engineering problems they face. Key to this work is a long-term engagement with the National Health Information Center to create coalitions of systems to provide external access to and analysis of vast amounts of health and patient data.
The project is developing practical techniques of socio-technical systems engineering2 and exploring design for failure.18 It has so far developed practical, predictable techniques for autonomic system management3,4 and is investigating the scaling up of agile methods,14 incremental system certification,9 and the development of techniques for system simulation and modeling.
Education. To address the practical issues of creating, managing, and operating LSCITS, engineers need knowledge and understanding of the systems and of techniques outside a "normal" software- or systems-engineering education. In the U.K., the LSCITS Initiative provides a new kind of doctoral degree, comparable in standard to a Ph.D. in computer science or engineering. Students get an engineering doctorate, or EngD, in LSCITS,20 with the following key differences between EngD and Ph.D.:
Industrial problems. Students must work on and spend significant time on an industrial problem. Universities cannot simply replicate the complexity of modern software-intensive systems, with few faculty members having experience and understanding of the systems;
Range of courses. Students must take a range of courses focusing on complexity and systems engineering (such as LSCITS, socio-technical systems, high-integrity systems engineering, empirical methods, and technology innovation); and
Portfolio of work. Students do not have to deliver a conventional thesis, a book on a single topic, but can deliver a portfolio of work around their selected area; it is a better reflection of work in industry and makes it easier for the topic to evolve as systems change and new research emerges.
However, graduating a few advanced doctoral students is not enough. Universities and industry must also create master's courses that educate complex-systems engineers for the coming decades; our thoughts on what might be covered are outlined in Figure 2. The courses must be multidisciplinary, combining engineering and business disciplines. It is not only the knowledge the disciplines bring that is important but also that students be sensitized to the perspectives of a variety of disciplines and so move beyond the silo of single-discipline thinking.
Since the emergence of widespread networking in the 1990s, all societies have grown increasingly dependent on complex software-intensive systems, with failure having profound social and economic consequences. Industrial organizations and government agencies alike build these systems without understanding how to analyze their behavior and without appropriate engineering principles to support their construction.
The SEI ULSS report13 argued that current engineering methods are inadequate, saying: "For 40 years, we have embraced the traditional engineering perspective. The basic premise underlying the research agenda presented in this document is that beyond certain complexity thresholds, a traditional centralized engineering perspective is no longer adequate nor can it be the primary means by which ultra-complex systems are made real." A key contribution of our work in LSCITS is articulating the fundamental reasons this assertion is true. Examining how engineering has a basis in the philosophical notion of reductionism, and how reductionism breaks down in the face of complexity, shows why traditional software-engineering methods will inevitably fail when used to develop LSCITS. Current software engineering is simply not good enough. We need to think differently to address the urgent need for new engineering approaches to help construct large-scale complex coalitions of systems we can trust.
We would like to thank our colleagues Gordon Baxter and John Rooksby of St. Andrews University in Scotland and Hillary Sillitto of Thales Land & Joint Systems U.K. for their constructive comments on drafts of this article. The work reported here was partially funded by the U.K. Engineering and Physical Sciences Research Council (www.epsrc.ac.uk) grant EP/F001096/1.
3. Calinescu, R., Grunske, L., Kwiatkowska, M., Mirandola, R., and Tamburrelli, G. Dynamic QoS management and optimisation in service-based systems. IEEE Transactions on Software Engineering 37, 3 (Mar. 2011), 387-409.
4. Calinescu, R. and Kwiatkowska, M. Using quantitative analysis to implement autonomic IT systems. In Proceedings of the 31st International Conference on Software Engineering (Vancouver, May). IEEE Computer Society Press, Los Alamitos, CA, 2009, 100-110.
5. Cliff, D., Calinescu, R., Keen, J., Kelly, T., Kwiatkowska, M., McDermid, J., Paige, R., and Sommerville, I. The U.K. Large-Scale Complex IT Systems Initiative 2010; http://lscits.cs.bris.ac.uk/docs/lscits_overview_2010.pdf
6. Cliff, D. and Northrop, L. The Global Financial Markets: An Ultra-Large-Scale Systems Perspective. Briefing paper for the U.K. Government Office for Science Foresight Project on the Future of Computer Trading in the Financial Markets, 2011; http://www.bis.gov.uk/assets/bispartners/foresight/docs/computer-trading/11-1223-dr4-global-financial-markets-systems-perspective.pdf
7. Commodity Futures Trading Commission and Securities and Exchange Commission (U.S.). Findings Regarding the Market Events of May 6th, 2010. Report of the CFTC and SEC to the Joint Advisory Committee on Emerging Regulatory Issues, 2010; http://www.sec.gov/news/studies/2010/marketevents-report.pdf
8. Ge, X., Paige, R.F., and McDermid, J.A. Analyzing system failure behaviors with PRISM. In Proceedings of the Fourth IEEE International Conference on Secure Software Integration and Reliability Improvement Companion (Singapore, June). IEEE Computer Society Press, Los Alamitos, CA, 2010, 130-136.
9. Ge, X., Paige, R.F., and McDermid, J.A. An iterative approach for development of safety-critical software and safety arguments. In Proceedings of Agile 2010 (Orlando, FL, Aug.). IEEE Computer Society Press, Los Alamitos, CA, 2010, 35-43.
13. Northrop, L. et al. Ultra-Large-Scale Systems: The Software Challenge of the Future. Technical Report. Carnegie Mellon University Software Engineering Institute, Pittsburgh, PA, 2006; http://www.sei.cmu.edu/library/abstracts/books/0978695607.cfm
14. Paige, R.F., Charalambous, R., Ge, X., and Brooke, P.J. Towards agile development of high-integrity systems. In Proceedings of the 27th International Conference on Computer Safety, Reliability, and Security (Newcastle, U.K., Sept.). Springer-Verlag, Heidelberg, 2008, 30-43.
16. Rushby, J. Software verification and system assurance. In Proceedings of the Seventh IEEE International Conference on Software Engineering and Formal Methods (Hanoi, Nov.). IEEE Computer Society Press, Los Alamitos, CA, 2009, 1-9.
18. Sommerville, I. Designing for Recovery: New Challenges for Large-scale Complex IT Systems. Keynote address, Eighth IEEE Conference on Composition-Based Software Systems (Madrid, Feb. 2008); http://sites.google.com/site/iansommerville/keynote-talks/DesigningForRecovery.pdf
19. U.K. Cabinet Office. Programme Assessment Review of the National Programme for IT. Major Projects Authority, London, 2011; http://www.cabinetoffice.gov.uk/resource-library/review-department-healthnational-programme-it
20. University of York. The LSCITS Engineering Doctorate Centre, York, England, 2009; http://www.cs.york.ac.uk/EngD/
©2012 ACM 0001-0782/12/0700 $10.00
Glad to see someone exploring the issues from the technical side. There are multiple threads of technology-enabled (and dependent) operational processes that are being shaped by changes in the underlying toolsets. It's important that we as technologists are willing to acknowledge the "cross-disciplinary" issues, and the fact that current engineering disciplines don't permit deterministic application of the technical capabilities, but possibly more important now is the lack of awareness or understanding on the part of non-technical players, especially in the regulatory and business arenas. I'm currently working on regulatory and industry responses to the lack of standardized identifiers in financial services -- a contributing factor to why neither business actors nor policy makers can get good enough information fast enough to even understand pieces of the "big picture". One of the things that stands out most clearly is the insistence of "those who matter" that they do not have to understand or consider technology issues, especially as to how they apply to implementing policies or procedures. Absolutely crazy, but taken as received wisdom that cannot be challenged. We are quite likely to see some proposed solutions that ignore technology aspects and guarantee further problems (shortly) down the road. Very frustrating, and I encourage all to think about how exactly to convey that understanding *how* something is done requires as much thought as *why* it is to be done -- at least if you care about *whether or not* it actually gets done.
The following letter was published in the Letters to the Editor in the October 2012 CACM (http://cacm.acm.org/magazines/2012/10/155547).
In "Large-Scale Complex IT Systems" (July 2012) Ian Sommerville et al. reached unwarranted conclusions, blaming project failures on modular programming: "Current software engineering is simply not good enough." Moreover, they did so largely because they missed something about large-scale systems. Their term, "coalition," implies alliance and joint action that does not exist among real-world competitors. They said large-scale systems "coalitions" have different owners with possibly divergent interests (such as in the 2010 Flash Crash mentioned in the article) and then expect the software "coalition" used by the owners to work cooperatively and well, which makes no sense to me. Even if the owners, along with their best minds and sophisticated software, did cooperate to some extent, they would in fact be attempting to deal with some of the most difficult problems on earth (such as earning zillions of dollars in competitive global markets). Expecting software to solve these problems in economics makes no sense when even the most expert humans lack solutions.
The following letter was published in the Letters to the Editor in the October 2012 CACM (http://cacm.acm.org/magazines/2012/10/155547).
Reading Ian Sommerville et al. (July 2012), I could not help but wonder whether new initiatives and institutions are really needed to study and create ultra/large-scale complex artificial systems. We should instead ponder how the behavior and consequences of such systems might be beyond our control and so should not exist in the first place. I am not referring to grand-challenge projects in science and engineering like space exploration and genomics with clear goals and benefits but the ill-conceived, arbitrary, self-interest-driven monstrosities that risk unpredictable behavior and harmful consequences. Wishful thinking, hubris, irresponsible tinkering, greed, and the quest for power drive them, so they should be seen not as a grand challenge but as a grand warning.
Why invent new, ultimately wasteful/destructive "interesting" problems when we could instead focus on the chronic "boring" deadly ones, from war, polluting transportation, and preventable disease to lack of clean water and air? These are real, not contrived, with unglamorous solutions that are infinitely more beneficial for all.