The high expectations of AI have triggered worldwide interest and concern, generating 400+ policy documents on responsible AI. Intense discussions over the ethical issues lay a helpful foundation, preparing researchers, managers, policy makers, and educators for constructive discussions that will lead to clear recommendations for building the reliable, safe, and trustworthy systems6 that will be commercial successes. This Viewpoint focuses on four themes that lead to 15 recommendations for moving forward. The four themes combine AI thinking with human-centered User Experience Design (UXD).
Ethics and Design. Ethical discussions are a vital foundation, but raising the edifice of responsible AI requires design decisions to guide software engineering teams, business managers, industry leaders, and government policymakers. Ethical concerns are catalogued in the Berkman Klein Center report3 that offers ethical principles in eight categories: privacy, accountability, safety and security, transparency and explainability, fairness and non-discrimination, human control of technology, professional responsibility, and promotion of human values. These important ethical foundations can be strengthened with actionable design guidelines.
Autonomous Algorithms and Human Control. The recent CRA report2 on "Assured Autonomy" and the IEEE's influential report4 on "Ethically Aligned Design" are strongly devoted to "Autonomous and Intelligent Systems." The reports emphasize machine autonomy, which becomes safer when human control can be exercised to prevent damage. I share the desire for autonomy by way of elegant and efficient algorithms, while adding well-designed control panels for users and supervisors to ensure safer outcomes. Autonomous aerial drones become more effective as remotely piloted aircraft, and NASA's Mars Rovers can make autonomous movements, but a whole control room of operators manages the larger picture of what is happening.
Humans in the Group; Computers in the Loop. While people are instinctively social, they benefit from well-designed computers. Some designers favor developing computers as collaborators, teammates, and partners, when adding control panels and status displays would make them comprehensible appliances. Machine and deep learning strategies will be more widely used if they are integrated in visual user interfaces, as they are in counterterrorism centers, financial trading rooms, and transportation or utility control centers.
Explainable AI (XAI) and Comprehensible AI (CAI). Many researchers from AI and HCI have turned to the problem of providing explanations of AI decisions, as required by the European General Data Protection Regulation (GDPR) stipulating a "right to explanation."13 Explanations of why mortgage applications or parole requests are rejected can include local or global descriptions, but a useful complementary approach is to prevent confusion and surprise by making comprehensible user interfaces that enable rapid interactive exploration of decision spaces.
Combining AI with UXD will enable rapid progress to the goals of reliable, safe, and trustworthy systems. Software engineers, designers, developers, and their managers are practitioners who need more than ethical discussion. They want clear guidance about what to do today as they work toward deadlines with their limited team resources. They operate in competitive markets that reward speed, clarity, and performance.
This Viewpoint is a brief introduction to the 15 recommendations in a recent article in the ACM Transactions on Interactive Intelligent Systems,8 which bridge the gap between widely discussed ethical principles and practical steps for effective governance that will lead to reliable, safe, and trustworthy AI systems. That article offers detailed descriptions and numerous references. The recommendations, grouped into three levels of governance structures, are meant to provoke discussions that could lead to validated, refined, and widely implemented practices (see the figure here).
Figure. Governance structures to guide teams, organizations, and industry leaders.
TEAM: Reliable Systems Based on Sound Software Engineering Practices
These practices are intended for software engineering teams of designers, developers, and managers.
Audit trails and analysis tools: Software engineers could emulate the safety of civil aviation by making a "flight data recorder for every robot" to record actions, so that retrospective analyses of failures and near misses could track what happened and how dangers were avoided. Then analysts could make recommendations for improving design and training. Audit trails require upfront effort, but by reducing failures and improving performance they reduce injuries, damage, and costs.
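As a minimal sketch of what such a recorder might look like in software, the following keeps an append-only log in which each record carries the hash of its predecessor, so that later tampering with the record is detectable during a retrospective analysis. The class name `FlightRecorder`, its fields, and the outcome labels ("ok", "near_miss", "failure") are hypothetical choices made for illustration; they do not come from any particular standard or product.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


@dataclass
class FlightRecorder:
    """Append-only, hash-chained log of a system's actions (hypothetical design)."""
    records: list = field(default_factory=list)

    def log(self, actor, action, outcome="ok", **details):
        # Each record stores the previous record's hash, so later edits are detectable.
        prev_hash = self.records[-1]["hash"] if self.records else ""
        body = {
            "time": time.time(),
            "actor": actor,
            "action": action,
            "outcome": outcome,  # assumed labels: "ok", "near_miss", "failure"
            "details": details,
            "prev": prev_hash,
        }
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        body["hash"] = hashlib.sha256(payload).hexdigest()
        self.records.append(body)
        return body

    def verify(self):
        """Return True if every record matches its hash and links to its predecessor."""
        prev_hash = ""
        for rec in self.records:
            body = {k: v for k, v in rec.items() if k != "hash"}
            payload = json.dumps(body, sort_keys=True).encode("utf-8")
            if rec["prev"] != prev_hash:
                return False
            if hashlib.sha256(payload).hexdigest() != rec["hash"]:
                return False
            prev_hash = rec["hash"]
        return True

    def incidents(self):
        """Records that a retrospective analysis of failures and near misses starts from."""
        return [r for r in self.records if r["outcome"] in ("near_miss", "failure")]


# Illustrative usage with made-up events
recorder = FlightRecorder()
recorder.log("planner", "move_arm", target=[0.4, 0.1])
recorder.log("safety_monitor", "emergency_stop",
             outcome="near_miss", reason="person in workspace")
```

Analysts would call `recorder.incidents()` to pull out the events worth reviewing and `recorder.verify()` to confirm the log is intact. A production recorder would also need durable storage and access controls, which this sketch omits.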
Software engineering workflows: Proposals for distinctive workflows for machine learning projects require expanded efforts with user requirements gathering, data collection, cleaning, and labeling, with use of visualization and data analytics to understand abnormal distributions, errors and missing data, clusters, gaps, and anomalies.
Verification and validation testing: The unpredictable performance of machine learning algorithms means that they need careful testing with numerous benchmark tasks. Since machine learning is highly dependent on the training data, different datasets need to be collected for each context to increase accuracy and reduce biases.
Bias testing to enhance fairness: Beyond algorithm correctness and data quality, careful testing will enhance fairness by lessening gender, ethnic, racial, and other biases.1 Toolkits for fairness testing from researchers and commercial providers can seed the process, but involvement from stakeholders will do much to increase fairness and build connections that could be helpful when problems emerge.
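One simple check that a team might start from is to compare how often each group receives a favorable decision. The sketch below computes per-group selection rates and the ratio of the lowest rate to the highest. The function names and the sample data are invented for illustration, and the 0.8 threshold is the commonly cited "four-fifths" rule of thumb, used here as an assumption and not as a recommendation from the toolkits mentioned above. A single metric like this can only seed the process; it does not replace stakeholder involvement.

```python
from collections import defaultdict


def selection_rates(decisions, groups):
    """Fraction of favorable decisions (1) per group."""
    favorable = defaultdict(int)
    total = defaultdict(int)
    for decision, group in zip(decisions, groups):
        total[group] += 1
        favorable[group] += decision
    return {g: favorable[g] / total[g] for g in total}


def disparate_impact_ratio(rates):
    """Lowest group selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Made-up decisions: 1 = approved, 0 = rejected
decisions = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

rates = selection_rates(decisions, groups)
ratio = disparate_impact_ratio(rates)

THRESHOLD = 0.8  # assumed rule-of-thumb cutoff, not a legal or universal standard
needs_review = ratio < THRESHOLD
```

With this sample, group "a" is approved four times out of five and group "b" once out of five, so the check flags the result for review.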
Explainable user interfaces: Software engineers recognize that explainable user interfaces enable more reliable development processes since algorithmic errors and anomalous data can be found more easily when explainability is supported. Exploratory user interfaces, often based on visualization, are proving to be increasingly valuable in preventing confusion and understanding errors. Weld and Bansal14 recommend that designers should "make the explanation system interactive so users can drill down until they are satisfied with their understanding."
Critics of these practices believe innovation is happening so quickly that these are luxuries that most software engineering teams cannot afford. Yet changing from the current practice of releasing partially tested software will yield more reliable and safer products and services.
ORGANIZATION: Safety Culture through Business Management Strategies
Management investment in an organizational safety culture requires budget and personnel, but the payoff is in reduced injuries, damage, and costs.
Leadership commitment to safety: Leadership commitment is made visible to employees by frequent restatements of that commitment, positive efforts in hiring, repeated training, and dealing openly with failures and near misses. Reviews of incidents, such as monthly hospital review board meetings, can greatly increase patient safety.
Hiring and training oriented to safety: When safety is included in job hiring position statements, that commitment becomes visible to current employees and potential new hires. Safety cultures may need experienced safety professionals from health, human resources, organizational design, ethnography, and forensics. Training exercises take time but the payoff comes when failures can be avoided and recovery made rapid.
Extensive reporting of failures and near misses: Safety-oriented organizations regularly report on their failures (sometimes referred to as adverse events) and near misses (sometimes referred to as "close calls"). Near misses can be small mistakes that are handled easily or dangerous practices that can be avoided, thereby limiting serious failures. NASA's Aviation Safety Reporting System (https://go.nasa.gov/2TdJhi1) and the Food and Drug Administration's Adverse Event Reporting System (https://bit.ly/3zl1A5B) provide models for public reporting of problems users encounter, while Bugzilla (https://bit.ly/3cRlkUT) is a useful model for technical bug reporting.
Internal review boards for problems and future plans: Commitment to a safety culture is shown by regularly scheduled monthly meetings to discuss failures and near misses, as well as to celebrate resilient efforts in the face of serious challenges. Review boards, such as hospital-based ones, may include managers, staff, and others, who offer diverse perspectives on how to promote continuous improvement. Google, Microsoft, Facebook, and other leading corporations have established internal review processes for AI systems.
Alignment with industry standard practices: Participation in industry standards groups, such as those run by the IEEE, the International Organization for Standardization (ISO), or the Robotic Industries Association, shows organizational commitment to developing good practices. AI-oriented extensions of the software engineering Capability Maturity Model11 are being developed to enable organizations to develop carefully managed processes and appropriate metrics to improve software quality.
Skeptics worry that organizations are so driven by short-term schedules, budgets, and competitive pressures that the commitment to these safety culture practices will be modest and fleeting.5 Organizations can respond by issuing annual safety reports with standard measures and independent oversight. It may take years for organizations to mature enough so they make serious commitments to safety.
INDUSTRY: Trustworthy Certification by Independent Oversight
The third governance layer brings industry-specific independent oversight to achieve trustworthy systems that receive wide public acceptance. The key to independent oversight is to support the legal, moral, and ethical principles of human or organizational responsibility and liability for their products and services. Responsibility is a complex topic, with nuanced variations such as legal liability, professional accountability, moral responsibility, and ethical bias. Independent oversight is widely used by businesses, government agencies, universities, non-governmental organizations, and civic society to stimulate discussions, review plans, monitor ongoing processes, and analyze failures. The goal of independent oversight is to promote continuous improvements that ensure reliable, safe, and trustworthy products and services.
Government interventions and regulation: Many current AI industry leaders and government policy makers fear that government regulation would limit innovation, but when done carefully, regulation can accelerate innovation as it did with automobile safety and fuel efficiency. A U.S. government memorandum for Executive Branch Departments and Agencies offered ten principles for "stewardship of AI applications,"10,12 but then backed away by suggesting that: "The private sector and other stakeholders may develop voluntary consensus standards that concern AI applications, which provide nonregulatory approaches to manage risks associated with AI applications that are potentially more adaptable to the demands of a rapidly evolving technology."
Accounting firms conduct external audits for AI systems: Independent financial audit firms, which analyze corporate financial statements to certify they are accurate, truthful, and complete, could develop reviewing strategies for corporate AI projects. They would also make recommendations to their client companies about what improvements to make. These firms often develop close relationships with internal auditing committees, so that there is a good chance that recommendations will be implemented.
Insurance companies compensate for AI failures: The insurance industry is a potential guarantor of trustworthiness, as it is in the building, manufacturing, and medical domains. Insurance companies could specify requirements for insurability of AI systems in manufacturing, medical, transportation, industrial, and other domains. They have long played a key role in ensuring building safety by requiring adherence to building codes for structural strength, fire safety, flood protection, and many other features. Building codes could be a model for software engineers, as described in Landwehr's proposal for "a building code for building code."6 He extends historical analogies to plumbing, fire, or electrical standards by reviewing software engineering for avionics, medical devices, and cybersecurity, but the extension to AI systems seems natural.
Non-governmental and civil society organizations: NGOs have proven to be early leaders in developing new ideas about AI principles and ethics, but now they will need to increase their attention to developing ideas about implementing software engineering practices and business management strategies. An inspiring example is how the Algorithmic Justice League was able to get large technology companies to improve their facial recognition products so as to reduce gender and racial bias within a two-year period. This group's pressure was influential in the Spring 2020 decisions of leading companies to halt their sales to police agencies in the wake of the intense movement to limit police racial bias.
Professional organizations and research institutes: Established and new organizations have been vigorously engaged in international discussions on ethical and practical design principles for responsible AI. However, skeptics caution that industry experts and leaders often dominate professional organizations, so they may push for less restrictive guidelines and standards.9 Ensuring diverse participation in professional organizations and open reporting, such as the Partnership on AI's Incident Database, can promote meaningful design improvements. Academic research centers have been influential, but their resources are often dwarfed by the budgets, systems, and datasets held by leading companies. Industry-supported research centers such as OpenAI (https://openai.org) and the Partnership on AI (https://partnershiponai.org) could play a role in technology innovation and more effective governance.
These recommendations for teams, organizations, and industries are meant to increase reliability, safety, and trustworthiness while increasing the benefits of AI technologies. After all, the stakes are high: the right kinds of technology advance human values and dignity, while promoting self-efficacy, creativity, and responsibility. The wrong kinds of technology will increase the dangers from failures and malicious actors. Constructive adoption of these recommendations could do much to improve privacy, security, environmental protection, economic development, healthcare, social justice, and human rights.
1. Baeza-Yates, R. Bias on the Web. Commun. ACM 61, 6 (June 2018), 54–61.
2. Computing Research Association. Assured Autonomy: Path toward living with autonomous systems we can trust. Washington, D.C. (Oct. 2020); https://bit.ly/3weGL9W
3. Fjeld, J. et al. Principled artificial intelligence: Mapping consensus in ethical and rights-based approaches to principles for AI. Berkman Klein Center Research Publication, (2020-1); https://bit.ly/358NYfN
4. IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems. Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems, First Edition. IEEE (2019); https://bit.ly/3gaBrig
5. Larouzee, J. and Le Coze, J.C. Good and bad reasons: The Swiss cheese model and its critics. Safety Science, 126, 104660 (2020); https://bit.ly/3izZ8Ca
6. Landwehr, C. We need a building code for building code. Commun. ACM 58, 2 (Feb. 2015), 24–26.
7. Shneiderman, B. Human-centered artificial intelligence: Reliable, safe and trustworthy. International Journal of Human Computer Interaction 36, 6 (2020), 495–504; https://bit.ly/3gaUetz
8. Shneiderman, B. Bridging the gap between ethics and practice: Guidelines for reliable, safe, and trustworthy Human-Centered AI Systems. ACM Transactions on Interactive Intelligent Systems 10, 4, Article 26 (2020); https://bit.ly/3xisPff
9. Slayton, R. and Clark-Ginsberg, A. Beyond regulatory capture: Coproducing expertise for critical infrastructure protection. Regulation & Governance 12, 1 (2018), 115–130; https://bit.ly/3xbhjSK
10. U.S. White House. American Artificial Intelligence Initiative: Year One Annual Report. Office of Science and Technology Policy (2020); https://bit.ly/2Tl6sXT
11. von Wangenheim, C.G. et al. Creating software process capability/maturity models. IEEE Software 27, 4 (2010), 92–94.
12. Vought, R.T. Guidance for Regulation of Artificial Intelligence Applications. U.S. White House Announcement, Washington, D.C. (Feb. 11, 2019); https://bit.ly/35bhlxT
13. Wachter, S., Mittelstadt, B., and Russell, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harvard Journal of Law and Technology 31 (2017), 841–887.
14. Weld, D.S. and Bansal, G. The challenge of crafting intelligible intelligence. Commun. ACM 62, 6 (June 2019), 70–79.
Ben Shneiderman is an Emeritus Distinguished University Professor in the Department of Computer Science, Founding Director (1983–2000) of the Human-Computer Interaction Laboratory, and a Member of the U.S. National Academy of Engineering.
Copyright held by author.