Research and product teams in artificial intelligence (AI) and data science need to think more deeply about how their work maximizes benefits and mitigates harms—particularly as their efforts are applied to increasingly challenging problems with diverse, global impacts.7 Achieving good results may entail making complex trade-offs, especially when initial goals are difficult to specify and disputed and when a project's initial objectives have complex, long-term side effects.
In this column, I argue we cannot make these trade-offs by an appeal to ethics alone. Instead, we must:
Begin with a deep understanding of the field-specific challenges we face;
Strongly commit to integrity; and
Leverage understanding from ethicists, but also engage with the broader liberal arts communities (economics, political science, history, literature, and so forth) to better envision the impact of our decisions and gain the needed breadth and depth their expertise can provide. As Brodhead writes, our "dynamism of a culture of innovation has come in large part from our distinctive liberal arts tradition, in which students are exposed to many different forms of knowledge and analysis, laying down a mental reservoir that can be drawn on in ever-changing ways to deal with the unforeseeable new challenges."3
Only by combining these three elements can we apply our ever-more-powerful technologies to achieve the best results.
I have had the opportunity to read and reflect on this topic while co-authoring Data Science in Context: Foundations, Challenges, Opportunities,14,15 where my co-authors and I enumerate, categorize, and describe the complex issues that arise when applying data science. These "gotchas" led us to create an "Analysis Rubric" containing top-level elements encompassing technical and societal considerations, as shown in the table in this column.
Table. Field-specific challenges in data science.15
The rubric is intended to help scientists, engineers, and product managers as well as social scientists and policymakers ensure they have considered the breadth of issues needed to apply data science beneficially. The rubric also identifies and motivates many research challenges, such as causal inference, detecting abuse or fake data, and providing interpretable machine learning. While this list focuses on data science, many of the topics are adaptable to AI, computing research more broadly, engineering, and product development.
Contemporary embedded or standalone technology classes in many universities include case studies relating to these rubric topics. For example, The University of Toronto's Embedded Ethics Education Initiative has a module that discusses the societal benefits and counterbalancing privacy risks of COVID-19 contact tracing, and another that presents the complexity of setting the right objectives in recommendation systems.5 One of MIT's modules discusses the balance of privacy versus data quality relating to the use of differential privacy in the census; another explores the complex goals and impacts of algorithmic approaches to defining voting district boundaries.11 A Stanford case study explores the potential safety benefits of self-driving vehicles versus their differential impacts on sub-populations.9
Unfortunately, the identification of challenges and trade-offs does not specify how to balance competing goals. This complexity becomes manifest as one considers the long-term, widely dispersed consequences of decisions.
In considering this problem, my co-authors and I realized that any discussion of trade-offs must start with integrity, which is the foundation (in computer science terminology, the "secure kernel") for the proper conduct of all science and engineering endeavors. In data science and AI, this has become ever more apparent as we are confronted with visible risks arising from the misuse and misrepresentation of data and extending to computing's broader impacts. As professionals, we must disclose the limits of our art, practice lawful behavior, always tell the truth, and not misrepresent our conclusions or capabilities.
With a foundation in integrity, we turned to the Belmont Principles for applied ethics in biomedical and behavioral research as a useful ethical framework.16 The three Belmont principles can be applied more broadly and particularly to applications of data science.
Respect for persons supports an individual's freedom to act autonomously and be free from undue manipulation. In medical research, it often surfaces as "informed consent," while in applications of data science, it serves as the rationale for clear disclosure, "opt-in," and similar policies.
Beneficence refers to the need to maximize benefits and balance them against risks. Negative consequences may be unanticipated or emergent phenomena, and they require continual monitoring and vigilance. In healthcare, beneficence is often associated with "first, do no harm," but this is overly simplistic as almost every treatment has some risk.
Justice requires attention to the distribution of risk and benefits across a population. In human subject research, Belmont suggests that there should be particular benefits to the populations who bear risk from the research. In applications of data science, considerations of justice extend to broader considerations of fairness across larger populations.
Certainly, there are other important and relevant philosophical principles beyond Belmont. Here are a few with particular relevance today:
Aristotle argues that virtue serves the good of our community and, at the same time, is necessary for our own happiness or completion.1 His writing presages the need for considering the societal impact of technologies that individuals choose to adopt. Shannon Vallor's recent book Technology and the Virtues focuses on the need to apply virtue ethics to AI-related issues.17
John Stuart Mill espouses the Harm Principle, which states "the only purpose for which power can be rightfully exercised over any member of a civilized community, against his will, is to prevent harm to others," but he also advocates "exertion to promote the good of others."10 His two statements help us understand differing views on the regulation of technology, the definition of fairness objectives, and many other topics we read about daily.
The principles leading up to war (jus ad bellum) and conducting war (jus in bello) may inform our views on the application of data science and AI in military domains,12 for instance in the context of Russia's invasion of Ukraine. Ethical principles are different in such extraordinary and harmful circumstances.
Experts would add more philosophers and philosophical frameworks to help guide our thinking.
With this column's first emphasis on technical elements (for example, dependability) and its second advocacy for honesty and ethics (for example, Belmont), readers might surmise we have a complete framework for educating and guiding ourselves toward the effective and constructive uses of our technologies. Upon reflection, I have concluded these two components are necessary but not sufficient. This is because there are other frameworks and models of thought (not usually labeled as ethics) that are equally critical to the right and proper uses of technology. While ethics tends to concentrate on what world we want, there must also be a focus on the pragmatics of how societies organize themselves to achieve practical progress toward their goals. There are many situations where certain ethical objectives may not be directly achievable due to the strains they put on a political system, the economic inefficiencies they cause, or other long-term consequences.
We thus need not just an understanding of ethics, but also of other disciplines that enable us to reach the best possible conclusions:
Economics focuses on the study of systems that foster logical and effective use of resources to meet private and social goals, including such objectives as economic freedom, fairness in reward and the balance of wealth, economic security, growth and efficiency, price stability, and employment. Economics treats topics such as the balance between equity and efficiency, economic freedom and social welfare, and incentives to promote (or lessen) desired (or harmful) economic results or behaviors. In a market economy, profit and its associated surplus catalyze innovation and may conflict with ethical principles that might espouse redistribution to foster fairness.
Political science focuses on the study of societal governance, often with the objective of developing more effective political systems. Based on consideration of political theory, comparisons of different political systems, and international relations, the field illuminates the impact of technology on societies and their governments and also informs us as to the ability to effectively apply regulation or legal mandates. Topics such as political processes, fairness, liberty, justice, stability, international order, and more arise.
History focuses on explaining, evaluating, and otherwise analyzing historical events that have transformed societies, often to inform us of the potential impact of decisions. Churchill, channeling Santayana, is widely quoted as saying, "Those who fail to learn from history are condemned to repeat it."13
Literature is the study of texts with the goal of increasing understanding of the human condition—the world around us: as it was, is, and may be.
Examples abound where contemporary decisions on the application of technology require broad consideration of not just ethics, but also economics and political science. Targeted advertising is a good example:
Ethical concerns arise from all three Belmont Principles: respect-for-persons addresses issues relating to opt-in, beneficence requires mitigating harmful manipulation, and justice considers the potential for societally imbalanced benefits and harms.
Economic concerns arise from the role of advertising in catalyzing new and more efficient markets, creative endeavors, innovative products, and much of the Web ecosystem while balancing potential negatives, including concentration of power and wealth.
Political considerations include balancing the likelihood that governmental entities can add value without undue harm to societal innovation, prosperity, or their own legitimacy.
From a historical perspective, there is much to learn about unintended consequences of seemingly good decisions—for example, where regulation has worked reasonably well (for example, medical advertising), and where it has led to poor outcomes (for example, incessant cookie warnings).
Fiction may portray dystopian worlds that warn us of harmful consequences or delight us with the prospect of changes that we might otherwise narrowly reject.
This list of relevant disciplines is limited and may still not supply sufficient perspective. We must turn to more fields of study, whether other social sciences or the arts, if we are to truly leverage society's accumulated wisdom. The recent book, Framers, which broadly discusses the models underlying human decisions, argues that great breadth is needed for truly insightful decisions.8 More directly, Connolly's provocatively titled piece, "Why Computing Belongs within the Social Sciences," argues compellingly that we should "Embrace other disciplines' insight."6
This discussion motivates the main thrust of this column: A three-part framework that we, as practitioners and educators, can apply to guide computational and data-centric technologies to achieving the most beneficial results.
We, as computer scientists, data scientists, experts in machine learning, and other subjects, must teach and attend meticulously to our own field-specific challenges. Just as structural engineers must teach and apply the consequences of metal fatigue or oxidation, our field requires in-depth research, understanding, and careful application of the topics contributing to dependability, understandability, fairness, sustainability, and others appearing in the accompanying table.
We must also practice, teach, and demand integrity. We must be scrupulously careful to disclose the limits of our capabilities, to practice lawful behavior, and to always tell the truth. We must act to prevent the growing amounts of societal dishonesty from entering our field and our practice. I explicitly call out integrity as this second item because we can be individually responsible for it and make it part of all our courses, because it is often clearer than other topics in ethics, and because it is the foundation of trust, reproducibility, and reasoning.
We and our students must have increased grounding in the liberal arts. As our research and work products are applied to ever more complex problems, they will require trade-offs that rely not only on ethical considerations, but equally on economics, political science, history, other social sciences, and the humanities. When we educate computer science or data science students, we should ensure they take a sufficient number of courses in other disciplines, and we should explicitly advise them on their non-technical curricular choices. This suggestion is broadly consistent with Connolly's view6 but advocates leveraging students' core curricula or distribution requirements more than replacing technical courses. We should have the humility to recognize we cannot become experts in all relevant disciplines, so we must work collaboratively with others. Just possibly, as programming becomes increasingly automated,2 the distinguishing characteristic of our field's most successful contributors may tilt even more toward those having this breadth of education.
We must be broadly educated if we are to guide the growth of computing, data science, and AI and their application to ever more difficult, even wicked,4 problems. As technical experts, we certainly need to both lead and educate our students on our field-specific challenges and lead in the education and practice of integrity. However, we and our students also must have broad grounding in the non-technical topics that influence whether our research or products can achieve the positive results we desire. These topics include ethics, but also the other enablers of our societies, including economics, political science, history, and more. As educators, we must guide our students to an important collection of non-technical courses. We also must increasingly collaborate with professionals in other fields, as we cannot be experts in everything.
13. Santayana, G. Introduction, and Reason in Common Sense. C. Scribner's Sons (1905).
14. Spector, A.Z. et al. More than just algorithms: A discussion with Alfred Spector, Peter Norvig, Chris Wiggins, Jeannette Wing, Ben Fried, and Michael Tingley. Commun. ACM 66, 8 (Aug. 2023).
15. Spector, A.Z. et al. Data Science in Context: Foundations, Challenges, Opportunities. Cambridge University Press (2022).
16. United States National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research. (1978).
17. Vallor, S. Technology and the Virtues: A Philosophical Guide to a Future Worth Wanting. Oxford University Press (2016).