Machine learning now powers a huge range of applications, from speech recognition systems to search engines, self-driving cars, and prison-sentencing systems. Many applications that were once designed and programmed by humans now combine human-written components with behaviors learned from data. This shift presents new challenges to computer science (CS) practitioners and educators. In this column, we consider how machine learning might change what we consider to be core CS knowledge and skills, and how this should impact the design of both machine learning courses and the broader CS university curriculum.
Computing educators1,6 have historically considered the core of CS to be a collection of human-comprehensible abstractions in the form of data structures and algorithms. Deterministic and logically verifiable algorithms have been central to the epistemology and practices of computer science.
With machine learning (ML) this changes: First, the typical model is likely to be an opaque composite of millions of parameters, not a human-readable algorithm. Second, the verification process is not a logical proof of correctness, but rather a statistical demonstration of effectiveness. As Langley5 observed, ML is an empirical science that shares epistemological approaches with fields such as physics and chemistry.
While traditional software is built by human programmers who describe the steps needed to accomplish a goal (how to do it), a typical ML system is built by describing the objective that the system is trying to maximize (what to achieve). The learning procedure then uses a dataset of examples to determine the model that achieves this maximization. The trained model takes on the role of both data structure and algorithm. The role that each parameter plays is not clear to a human, and these computational solutions no longer reflect humans' conceptual descriptions of problem domains, but instead function as summaries of the data that are understandable only in terms of their empirically measurable performance.
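The contrast between describing how and describing what can be made concrete with a minimal sketch. The temperature-conversion task, the function names, and the choice of gradient descent on a squared-error objective are all assumptions made for illustration; they are not taken from any particular course or library.

```python
def fahrenheit_by_hand(celsius):
    """Classical programming: a human writes down the steps (the 'how')."""
    return celsius * 9.0 / 5.0 + 32.0


def mean_squared_error(w, b, examples):
    """The objective (the 'what'): average squared gap between prediction and example."""
    return sum((w * x + b - y) ** 2 for x, y in examples) / len(examples)


def fit_linear(examples, learning_rate=0.1, steps=5000):
    """The learning procedure: gradient descent that reduces the objective."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Illustrative dataset: inputs are expressed in units of 100 degrees Celsius
# so that plain gradient descent converges quickly.
examples = [(c / 100.0, fahrenheit_by_hand(c)) for c in range(-40, 101, 10)]
w, b = fit_linear(examples)


def fahrenheit_learned(celsius):
    """The trained model: two numbers (w, b) act as both data structure and algorithm."""
    return w * (celsius / 100.0) + b
```

Here the learned parameters happen to be interpretable because the model is tiny; with millions of parameters, the same procedure yields a model that can only be judged by its measured error.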
To succeed with ML, many students will not concentrate on algorithm development, but rather on data collection, data cleaning, model choice, and statistical testing.
ML has historically been a niche area of CS, but now it is increasingly relevant to core CS disciplines, from computer architecture to operating systems.3 It may even be fair to say that ML is now a core area of CS, providing a parallel theoretical basis to the lambda calculus for defining and reasoning about computational systems. The growing importance of ML thus raises challenging questions for CS education: How should practical and theoretical ML topics now be integrated into undergraduate curricula? And how can we make room for expanded ML content in a way that augments, rather than displaces, classical CS skills, within undergraduate degree programs whose duration must remain fairly static?
Changes to the Introductory Sequence. Most CS undergraduate programs begin with introductory courses that emphasize the development of programming skills, covering topics like control structures, the definition and use of functions, basic data types, and the design and implementation of simple algorithms.4
In many cases, assignments in these courses make use of existing library functions, for instance to read and write data to the filesystem. Students are not expected to fully understand how these libraries and the underlying hardware infrastructure work, so much as to use the interfaces that these libraries present. The aims of introductory courses are students' development of notional machines2 for reasoning about how a computer executes a program, and the development of the pragmatic skills for writing and debugging programs that computers can execute.
These same two aims can also describe introductory courses for an ML-as-core world. We do not envision that ML methods would replace symbolic programming in such courses, but they would provide alternative means for defining and debugging the behaviors of functions within students' programs. Students will learn early on about two kinds of notional machine: that of the classical logical computer and that of the statistical model. They will learn methods for authoring, testing, and debugging programs for each kind of notional machine, and learn to combine both models within software systems.
We imagine that future introductory courses will include ML through the use of beginner-friendly program editors, libraries, and assignments that encourage students to define some functions using ML, and then to integrate those functions within programs that are authored using more traditional methods. For instance, students might take a game they created in a prior assignment using classical programming, and then use ML techniques to create a gestural interface (for example, using accelerometers from a smartphone, pose information from a webcam, or audio from a microphone) for moving the player's character up, down, left, and right within that game. Such assignments would engage students in creating or curating training examples, measuring how well their trained models perform, and debugging models by adjusting training data or choices about learning algorithms and features.
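One possible shape for such an assignment is sketched below. It assumes a hypothetical sensor that reports two tilt values per reading, a handful of made-up training examples, and a nearest-centroid classifier chosen only because it fits in a few lines; a real course would likely use a beginner-friendly library instead.

```python
def train_centroids(examples):
    """Average the feature vectors recorded for each gesture label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total]
            for label, total in sums.items()}


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(centroids, features):
    """Return the label whose centroid is closest to the new reading."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


def accuracy(centroids, examples):
    """Fraction of examples whose predicted label matches the recorded one."""
    correct = sum(predict(centroids, f) == label for f, label in examples)
    return correct / len(examples)


# Hypothetical (tilt_x, tilt_y) readings that a student recorded and labeled.
training = [
    ((0.1, 0.9), "up"), ((-0.1, 0.8), "up"),
    ((0.0, -0.9), "down"), ((0.2, -0.8), "down"),
    ((-0.9, 0.1), "left"), ((-0.8, -0.1), "left"),
    ((0.9, 0.0), "right"), ((0.8, 0.2), "right"),
]
# Readings kept aside to measure how well the trained model performs.
held_out = [
    ((0.0, 0.7), "up"), ((0.1, -0.7), "down"),
    ((-0.7, 0.0), "left"), ((0.7, -0.1), "right"),
]

centroids = train_centroids(training)
held_out_accuracy = accuracy(centroids, held_out)
```

The game written in the earlier assignment would then call `predict` on each new reading to move the character. When held-out accuracy is poor, the debugging step is to record more or better examples rather than to edit the control flow.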
Such activities do not require deep understanding of ML algorithms, just as reading from a filesystem using high-level APIs does not require deep understanding of computer hardware or operating systems. Yet these activities can introduce new CS students to epistemological practices core to ML, laying the foundation for encountering ML again in other contexts (whether an elective in ML theory, advanced electives in computer vision or architecture, or in professional software development). Such activities additionally enable the creation of new and engaging types of software (for example, systems that are driven by real-time sensors or social-media data) that are very difficult for novice programmers (and even experts) to create without ML.
Changes to the Advanced Core. In most CS degree programs, the introductory sequence is followed by a set of more advanced courses. How should that more advanced core change in light of ML?
Current courses in software verification and validation stress two points: proof of correctness and tests that verify Boolean properties of programs. But with ML applications, the emphasis is on experiment design and on statistical inference about the results of experiments. Future coursework should include data-driven software testing methodologies, such as the development of test suites that evaluate whether software tools perform acceptably when trained using specific datasets, and that can monitor measurable regressions over time.
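A data-driven test of this kind might look like the following sketch. The function names, the 95% confidence level, and the regression tolerance are illustrative assumptions; the lower bound uses the standard Wilson score interval for a proportion.

```python
import math


def wilson_lower_bound(successes, trials, z=1.96):
    """Lower end of the Wilson score interval for a success rate (z=1.96 is about 95%)."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denominator = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denominator


def passes_accuracy_test(successes, trials, required):
    """Pass only if we are statistically confident that accuracy meets the requirement."""
    return wilson_lower_bound(successes, trials) >= required


def has_regressed(current_accuracy, baseline_accuracy, tolerance=0.01):
    """Flag a measurable drop relative to a previously recorded baseline."""
    return current_accuracy < baseline_accuracy - tolerance


# The same observed accuracy (95%) leads to different verdicts
# depending on how much evidence the test set provides.
large_test_set_ok = passes_accuracy_test(950, 1000, required=0.90)
small_test_set_ok = passes_accuracy_test(19, 20, required=0.90)
```

Unlike a Boolean unit test, the verdict depends on sample size: 19 correct out of 20 is not enough evidence to claim 90% accuracy, while 950 out of 1,000 is.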
Human-computer interaction (HCI) courses may be expanded to reflect how ML changes both the nature of human-facing technologies that can be created and the processes by which they are created and evaluated. For instance, ML enables the creation of applications that dynamically adapt in response to data about their use. HCI education currently emphasizes the use of empirical methods from psychology and anthropology to understand users' needs and evaluate new technologies; now, the ability to apply ML to log data capturing users' interactions with a product can drive new ways of understanding users' experiences and translating these into design recommendations. Future HCI coursework will need to include these ML-based systems design and evaluation methodologies.
Operating systems courses describe best practices for tasks such as allocating memory and scheduling processes. Typically, the values of key parameters for those tasks are chosen through experience. But with ML the parameter values, and sometimes the whole approach, can be allowed to vary depending on the tasks that are actually running, enabling systems that are more efficient and more adaptable to changing work loads, even ones not foreseen by their designer. Future OS coursework may need to include the study of ML techniques for dynamically optimizing system performance.3
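As a toy illustration of letting a parameter vary with the running workload, the sketch below uses an epsilon-greedy strategy to pick a scheduler time slice. The candidate values, the simulated cost function, and the class name are all hypothetical; a real system would measure latency or throughput and would likely use a more sophisticated learner.

```python
import random


class EpsilonGreedyTuner:
    """Mostly picks the lowest-cost setting seen so far; occasionally explores others."""

    def __init__(self, candidates, epsilon=0.1, seed=0):
        self.candidates = list(candidates)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {c: 0 for c in self.candidates}
        self.mean_cost = {c: 0.0 for c in self.candidates}

    def choose(self):
        # Try every candidate once before trusting the running averages.
        untried = [c for c in self.candidates if self.counts[c] == 0]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.candidates)
        return min(self.candidates, key=lambda c: self.mean_cost[c])

    def record(self, candidate, cost):
        # Incremental update of the running mean cost for this candidate.
        self.counts[candidate] += 1
        n = self.counts[candidate]
        self.mean_cost[candidate] += (cost - self.mean_cost[candidate]) / n


def simulated_cost(time_slice_ms, rng):
    """Hypothetical workload whose cost is lowest near a 20 ms time slice, plus noise."""
    return abs(time_slice_ms - 20) + rng.uniform(0.0, 1.0)


tuner = EpsilonGreedyTuner([5, 10, 20, 40, 80])
workload_rng = random.Random(1)
for _ in range(500):
    slice_ms = tuner.choose()
    tuner.record(slice_ms, simulated_cost(slice_ms, workload_rng))

best_slice = min(tuner.candidates, key=lambda c: tuner.mean_cost[c])
```

Because the tuner keeps exploring, its estimates continue to update if the workload shifts, although this simple running mean adapts slowly; nothing in the code fixes the best value in advance.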
Changes to Prerequisite and Concurrent Expectations. It is typical for CS curricula to require coursework outside of CS departments, such as courses in mathematics and physics. In many cases, and especially when CS programs are housed within schools of engineering, these requirements emphasize calculus coursework. Many programs include coursework in probability and statistics, though notably the authors of ACM and IEEE's joint Computing Curricula 2013 "believe it is not necessary for all CS programs to require a full course in probability theory for all majors."4
Are these recommendations still appropriate? Many programs require coursework in probability and statistics, which we enthusiastically encourage, as they are crucial for engaging with the theory behind ML algorithm design and analysis, and for working effectively with certain powerful types of ML approaches. Linear algebra is essential for both ML practitioners and researchers, as is knowledge about optimization. The set of foundational knowledge for ML is thus both broad and distinct from that conventionally required to obtain a CS degree. What, therefore, should be considered essential to the training of tomorrow's computer scientists?
The ACM-IEEE Computer Science Curricula 20134 identifies 18 different Knowledge Areas (KAs), including Algorithms and Complexity, Architecture and Organization, Discrete Structures, and Intelligent Systems. The definitions and recommended durations of attention to the KAs reflect a classic view of CS; ML is referred to exclusively within a few suggested elective offerings. We believe the rapid rise in the use of ML within CS in just the past few years indicates the need to rethink guiding documents like this, along with commensurate changes in the educational offerings of computing departments.
In addition, research on how people learn ML is desperately needed. Nearly the entirety of the published computing education literature pertains to classical approaches to computing. As we noted earlier in this column, ML systems are fundamentally different from traditional data structures and algorithms, and must therefore be reasoned about and learned differently. Many insights from mathematics and statistics education research are likely to be relevant to machine learning education research, but researchers in these fields only rarely intersect with computing education researchers. We therefore call upon funding agencies and professional societies such as ACM to use their convening power to bring together computing education researchers and math education researchers in support of developing a rich knowledge base about the teaching and learning of machine learning.
2. Boulay, B.D., O'Shea, T., and Monk, J. The black box inside the glass box: Presenting computing concepts to novices. International Journal of Man-Machine Studies 14, 3 (Apr. 1981), 237-249; https://doi.org/10.1016/S0020-7373(81)80056-9.
4. Joint Task Force on Computing Curricula, Association for Computing Machinery, IEEE Computer Society (2013). Computer science curricula 2013; https://bit.ly/2E6dDGR
The Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.
I applaud the authors for this article. And while I fully agree with the general sentiment of the authors, I want to point out some inaccurate characterizations of the discussion of probability and machine learning in ACM-IEEE Computer Science Curricula 2013 (CS2013).
For example, the authors write "notably the authors of ACM and IEEE's joint Computing Curricula 2013 'believe it is not necessary for all CS programs to require a full course in probability theory for all majors.'" It is instructive to look at the complete sentence that this quote is taken from: "Similarly, while we do note a growing trend in the use of probability and statistics in computing (reflected by the increased number of core hours on these topics in the Body of Knowledge) and believe that this trend is likely to continue in the future, we still believe it is not necessary for all CS programs to require a full course in probability theory for all majors."
The point of this quote is not to emphasize that probability is not important for CS majors, as the authors tend to suggest in their article, but on the contrary, that probability is growing in importance and will continue to do so. Still, in 2013 (and perhaps still today) a *full* course in probability is likely not needed for *all* CS majors, especially at schools that may have hard limits on the number of classes that can be required for an undergraduate major. For example, some CS majors (perhaps those who are not emphasizing AI in their program) could suffice with inclusion of probability as part of a discrete math class, while other students (who were focusing on AI) would be required to take a full class (or more) on probability theory. Notably, in CS2013 there are 8 total core hours of probability included in the Discrete Structures Knowledge Area. CS2013 also contains a number of "exemplar classes" showing examples of both courses on Discrete Structures/Mathematics that include a section on probability as well as a full course on "Probability Theory for Computer Scientists" to show how both these models may be possibilities in undergraduate curricula.
Also, the authors mention that in CS2013, "ML is referred to exclusively within a few suggested elective offerings." This is not correct. In the "Intelligent Systems" Knowledge Area there are two core hours on "Basic Machine Learning." While we fully grant that two core hours is not a lot of time, the inclusion of some core hours shows that in 2013 it was already clear that CS majors should get some ML. Moreover, the inclusion of an (admittedly elective) "Advanced Machine Learning" knowledge unit was meant to emphasize that students pursuing AI-related fields really should be getting more than the minimum. As mentioned throughout CS2013, the core hours are a *minimum* that students should satisfy, and it is expected that most programs will include many hours beyond the core to define a full curriculum. Students doing any sort of AI-related work certainly should have more opportunities than just the specified core hours to learn about probability theory and ML. Indeed, to this end, there are six exemplar courses in CS2013 showing different models of covering the Intelligent Systems area, all including many hours beyond the core requirements.
With that said, I reiterate that the authors' point in this article is well taken. ML is an area that will continue to increase in importance and it would benefit CS programs to include more of it in undergraduate curricula. CS2013 was already trying to highlight this trend half a decade ago by including core hours on ML that did not exist in prior curricular guidelines and by creating a much more thorough elective area for Advanced Machine Learning, which did not previously exist, along with exemplar courses showing how such material could actually be incorporated into CS curricula.
We thank you for your thoughtful and detailed reply. We acknowledge the validity of your point that ML is in the Curriculum core. Furthermore, we agree that some schools do offer good coverage of machine learning, but note that the core ACM-IEEE Curriculum does not require them to do so.
The essence of our argument is that Machine Learning is no longer a peripheral topic within CS, but rather has moved to the core of what new computer scientists need to know. From this vantage point, the treatment of ML as an elective topic in joint ACM-IEEE curriculum recommendations is now inappropriate. While exemplar elective descriptions are useful in illustrating how departments could incorporate ML, should they so choose, those exemplars are still electives, not requirements for the core of the curriculum. We hope that future revisions of the Computing Curriculum will move Machine Learning to the core, along with making concomitant changes to recommendations for probability and statistics education.
The 2013 ACM-IEEE Computing Curriculum divides its content requirements and recommendations into three bins: Core Tier-1, Core Tier-2, and Elective. Page 29 of the Curriculum describes these terms as follows: "computer-science curricula should cover all the Core Tier-1 topics, all or almost all of the Core Tier-2 topics, and significant depth in many of the Elective topics (i.e., the core is not sufficient for an undergraduate degree in computer science)." Later, the document says, "Core Tier-2 topics are generally essential in an undergraduate computer-science degree. Requiring the vast majority of them is a minimum expectation, and if a program prefers to cover all of the Core Tier-2 topics, we encourage them to do so ... A computer-science curriculum should aim to cover 90-100% of the Core Tier-2 topics, with 80% considered a minimum."
The nature of the Tier-1 vs. Tier-2 distinction is to make Tier-2-listed topics strongly recommended but not required. In other words, the 2013 Computing Curriculum has two classes of electives: the strongly recommended electives ("Core-Tier2") and the recommended electives ("Electives"). All of the Intelligent Systems core content is in the Tier-2 core, constituting 2 hours of the 308 hours of core curriculum that ACM recommends, and that departments could elect to adopt.
A department could elect to not require coursework in Intelligent Systems at all, or choose to exclude just the ML parts, and still satisfy the ACM curriculum requirements. Thus, we believe our claim that ML is elective content within the 2013 Curriculum to be accurate.