In computing, faculty play many critical roles, including training the next generation of researchers, advancing scientific research across a diverse array of computing topics, and translating that research into practice. The composition of the academic workforce thus shapes what advances are made and who benefits from them,20,21 in part because demographic diversity in science is known to accelerate innovation and improve problem solving.17,31
Key Insights
- Women and people of color remain dramatically underrepresented among computing faculty, and improvements in demographic diversity are slow and uneven.
- But computing’s subfields exhibit wide differences in faculty gender composition, from a low of 13.1% women in Theory of Computer Science to a high of 20.0% in Human-Computer Interaction. Faculty working in computing subfields with more women also tend to hold positions at less-prestigious institutions.
- There has been steady progress toward gender equality in all subfields, but subfields with the greatest faculty representation at prestigious institutions tend to be approximately 25 years behind the less-prestigious subfields in gender representation.
- These results illustrate how the choice of subfield in a faculty search can shape a department’s gender diversity.
Despite a continued emphasis on broadening participation, women faculty in the U.S. remain underrepresented relative to women’s share of the U.S. population by more than a factor of two, and Black, Hispanic, and Native faculty by more than a factor of five.37,40 Women’s underrepresentation among computing researchers also persists internationally. For example, women are estimated to comprise less than 10% of contributors to international computer science journals.25
Explanations for this persistent pattern generally fall into two categories. On one hand, there are generational problems, in which faculty diversity changes slowly because it takes many years for diversity increases at the earliest stages of training to propagate up to more senior levels.16 On the other hand, there are structural and social climate problems in the U.S.,1 in which members of underrepresented groups who aspire to or have a faculty career are pushed or pulled out of the community, which may counteract efforts to address generational problems. In concert, these two effects may lead to a persistent overrepresentation of majority groups5 despite efforts to the contrary.
We consider a third class of problem, which exists because most faculty are hired via searches that focus on a particular subfield of computing, such as artificial intelligence (AI). As a result, field-level demographic dynamics such as gender, racial, and socioeconomic representation are in fact driven by diversity differences across computing’s subfields and the representation of those subfields among the suppliers of future faculty.8 For example, faculty searches in subfields with fewer women than other subfields are less likely to increase a department’s gender diversity. Similarly, if more racially or gender-diverse subfields are underrepresented at elite departments, which produce the majority of future faculty,8 then the diversity of those subfields is unlikely to be reflected in new faculty hires. While there is evidence that job searches that do not focus on a particular subfield can attract more diverse candidates,6 most searches in computing remain subfield-specific.
In practice, faculty hiring closely follows a prestige hierarchy, in which more prestigious departments produce a disproportionate share of all computing faculty,8 and a department’s position within this hierarchy can be inferred directly from where its graduates were hired as faculty.8,10 In this way, high-prestige departments exert a correspondingly large influence over the field’s demographics,38 and efforts to understand patterns, trends, and causes of demographic diversity in computing must account for the effects of prestige.
What are the implications of subfield structure and prestige in faculty hiring for diversity and demographic trends in computing? Here, we address this question by studying the intersections of gender, race, socioeconomic status, prestige, and subfield structure in computing. Our analysis uses a comprehensive database of training and employment records for 6,882 tenure-track faculty from 269 Ph.D.-granting computing departments in the U.S., linked with 327,969 publications. We first quantify variation in gender, race, socioeconomic status, and prestige across computing’s subfields. We then develop simple forecasts of future gender diversity for the entire field, which account for diversification trends over time at the subfield level. We close with a discussion of the specific patterns and trends in faculty diversity we observe and how they relate to more general patterns in academia, and we highlight a few specific implications of our findings for long- and short-term efforts to increase demographic diversity among computing faculty.
Data
Our analysis spans 6,882 tenured or tenure-track faculty at U.S. Ph.D.-granting computing departments between 2010 and 2018. It includes faculty names, academic rank, institution, and the year and institution from which they received their Ph.D. training. The underlying data is derived from a larger census-style dataset obtained under a data use agreement with the Academic Analytics Research Center (AARC). For this study, we define the field of computing to include computer science departments and joint departments between computer science and information sciences, computer engineering, and other closely related departments.
To these basic education and employment variables, we add information on gender, race, childhood socioeconomic status, faculty subfield, and institutional prestige using a combination of institutional covariates, automated tools, detailed publication information, and a large survey of faculty, which we describe below.
Gender. We use a set of name-based tools to match faculty with the genders that are culturally associated with their names (see the online appendix at https://bit.ly/3TcdKr0 for details). This methodology assigns only binary (woman/man) labels to faculty, even as we recognize that gender is nonbinary. This approach is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce the gender binary. We assess the reliability of our gender-labeling methodology using self-labeled genders from a representative survey of computing faculty we conducted in 2017. Comparing these gender labels, our name-based methodology agrees with self-identified genders 97% of the time (N = 985).
Race and childhood socioeconomic status. Faculty race is known only for the 608 faculty (8.8%) who self-reported their race in our survey. Our survey question’s design follows the U.S. Office of Management and Budget’s standards for collecting race data, which facilitates comparisons between the computing professoriate and aggregated U.S. census data. We recognize that these categories are imperfect, socially constructed representations of racial, ethnic, and place-of-origin identities. For example, the census category “Asian” is broad and includes South Asians, Southeast Asians, and East Asians, among others, each of which themselves contain diverse groups.
In our survey, 633 faculty (9.2%) report the highest level of education achieved by their parents or legal guardians, which we use as a simple indicator of faculty’s childhood socioeconomic status, following Morgan et al.27
Subfields. We assign each professor to a distribution over computing research subfields, based on their publications in the DBLP computer science bibliography. Using unique matches to faculty names, we algorithmically linked 5,472 faculty (80%) to their listed publications, leading to a set of 327,969 author-linked publications.
Publications were then assigned to computing research areas using a topic model of paper titles. We first manually identified 35 computing research areas grouped into eight computing subfields using domain knowledge and advice from subfield specialists. For each research area, we algorithmically extracted a set of “anchor words” that are highly informative of publication topic as measured by mutual information. These anchor words were then used to parameterize a topic model, guiding the clustering of publication titles to be aligned with our intended delineation of research areas. We then checked the topic assignments by manually verifying that the final, larger set of words the model learned to associate with each topic aligned with commonly agreed-upon computing research areas, and that the assigned research areas for a set of well-known computing scientists agree with their known expertise. We elaborate on this process in the online supplementary material.
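As a rough illustration of how a word's informativeness about a research area could be scored, the sketch below computes the mutual information between "the title contains this word" and "the title belongs to this area". It assumes a small set of titles that already carry area labels; the titles, labels, and function name are hypothetical, and the source does not describe how labels were obtained or how the anchor-word extraction was implemented.

```python
import math
from collections import Counter

def mutual_information(titles, labels, word, area):
    """Mutual information (in bits) between two binary events:
    'title contains word' and 'title is labeled with area'."""
    n = len(titles)
    # Joint counts over the four (has_word, in_area) combinations.
    joint = Counter(
        (word in title.lower().split(), label == area)
        for title, label in zip(titles, labels)
    )
    mi = 0.0
    for (has_word, in_area), count in joint.items():
        p_xy = count / n
        p_x = sum(c for (w, _), c in joint.items() if w == has_word) / n
        p_y = sum(c for (_, a), c in joint.items() if a == in_area) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi

# Hypothetical labeled titles, for illustration only.
titles = [
    "learning deep networks",
    "deep reinforcement learning",
    "verified compiler design",
    "type systems for compilers",
]
labels = ["ml", "ml", "pl", "pl"]

mi_deep = mutual_information(titles, labels, "deep", "ml")
mi_design = mutual_information(titles, labels, "design", "ml")
```

In this toy input, "deep" appears in exactly the titles labeled "ml" and so scores higher than "design", which appears in only one title. Ranking a vocabulary by such a score is one plausible way to shortlist candidate anchor words for each area.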
While computing research can be divided into a multiplicity of fine-grained topics, faculty hiring typically takes place at a higher level. For example, departments aiming to hire in the subfield of human-computer interaction (HCI) may consider applicants who specialize in any of a variety of its nested research areas. Under our taxonomy for computing research, each of the 35 identified research areas belongs to exactly one of the eight subfields: computational learning, systems, theory of computer science, numerical and scientific computing, HCI, interdisciplinary computing, programming languages, and software engineering.
Because faculty often publish in a wide variety of areas, we assign a distribution over subfields to each professor, in proportion to the share of their publications classified into each subfield. Under this assignment, faculty belong to multiple subfields, meaning that our subsequent estimates of subfield sizes can take on non-integer values. This “soft” assignment scheme better captures the range of research topics that faculty work on across the boundaries of multiple subfields, compared to a “hard” assignment into a single subfield. We consider the hard assignment scheme in the online appendix.
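The soft assignment described above can be sketched as follows. This is a minimal illustration, assuming each linked publication has already been classified into a single subfield; the professor identifiers, the toy publication lists, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

def subfield_distribution(paper_subfields):
    """Share of one professor's classified papers that falls in each subfield."""
    counts = Counter(paper_subfields)
    total = sum(counts.values())
    return {subfield: n / total for subfield, n in counts.items()}

def soft_subfield_sizes(faculty_papers):
    """Sum every professor's fractional memberships to estimate subfield sizes."""
    sizes = defaultdict(float)
    for papers in faculty_papers.values():
        if not papers:  # faculty with no linked publications contribute nothing
            continue
        for subfield, share in subfield_distribution(papers).items():
            sizes[subfield] += share
    return dict(sizes)

# Hypothetical input: one subfield label per classified publication.
faculty_papers = {
    "prof_a": ["systems", "systems", "theory", "systems"],
    "prof_b": ["HCI", "HCI"],
    "prof_c": ["theory", "HCI"],
}
sizes = soft_subfield_sizes(faculty_papers)
```

Each professor contributes a total weight of one, split across subfields, so the estimated sizes sum to the number of faculty with linked publications and need not be integers.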
Institutional prestige. There are many ways to quantify institutional prestige in computing, including authoritative rankings, such as the U.S. News & World Report rankings of computer science graduate departments or the older National Research Council (NRC) rankings. Such rankings have been widely criticized for their subjective selection of institutional characteristics and for largely measuring only the inputs to the educational and research process.2 In contrast, publication-based approaches, like that of CS-Rankings.org, at least measure outputs of the education and research process but nevertheless depend on subjective choices and values, and they are sensitive to pathologies in the academic publishing system.14
We use an alternative, output-based ranking, based on institutional placement power, which quantifies prestige according to how well an institution can place its graduates as faculty at other institutions.10 This avoids many of the weaknesses of other measures of institutional prestige. The prestige rankings produced by this approach strongly correlate with other computer science rankings, including U.S. News & World Report, NRC, and related methods based on faculty hiring,8,10 and they are representative of hiring patterns across all eight computing subfields (which we show in the online appendix). This indicates that all these measures are capturing aspects of the underlying social processes that drive measures of prestige.
Demographic reference data. Finally, we compare the demographic composition of current computing faculty to two reference populations: the U.S. population and the population of U.S. computer science Ph.D. recipients. We use the U.S. Census and the National Science Foundation’s Survey of Earned Doctorates (SED)30 to reconstruct the demographics of these reference populations.
Most current computing faculty received their Ph.D. within the past 40 years, but the demographics of these two reference populations have changed substantially during that time. A simple comparison of the diversity of current faculty to the diversity observed in a reference population at some specific point in time can be misleading. Instead, we construct a time-adjusted reference population based on the demographics of the year each professor received their degree.
For the U.S. population, we match each professor to the U.S. census year nearest to the year of their Ph.D. and construct from the set of such years a weighted-average demographic distribution of the U.S. Similarly, we calculate a weighted-average demographic distribution of U.S. computing Ph.D. recipients by matching faculty to the closest year recorded by the SED’s records of computer and information sciences doctoral recipients, which date back to 1980. While most faculty match to the survey for their exact Ph.D. year, 11% match to 1980, the earliest SED year, meaning they received their Ph.D. in or prior to 1980. This procedure will tend to slightly overestimate the true diversity in the reference population. Using this methodology, we also construct reference populations for each computing subfield, which account for different age demographics across subfields.
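A minimal sketch of this time adjustment, reduced to a single demographic share for brevity: each professor's Ph.D. year is matched to the nearest available reference year, and the reference shares are averaged with weights equal to the number of matched faculty. The reference values below are hypothetical placeholders, not census or SED figures, and the function names are illustrative.

```python
from collections import Counter

def nearest_year(target, available_years):
    """Closest available reference year; ties go to the earlier year."""
    return min(available_years, key=lambda y: (abs(y - target), y))

def time_adjusted_reference(phd_years, reference_by_year):
    """Weighted average of a reference share, weighted by how many
    faculty match to each reference year."""
    matched = Counter(nearest_year(y, reference_by_year) for y in phd_years)
    total = sum(matched.values())
    return sum(reference_by_year[y] * n for y, n in matched.items()) / total

# Hypothetical reference shares (for example, fraction of women among recipients).
reference_by_year = {1980: 0.10, 1990: 0.15, 2000: 0.20}
phd_years = [1975, 1980, 1994, 2001]

adjusted_share = time_adjusted_reference(phd_years, reference_by_year)
```

In this toy input the 1975 graduate matches to 1980, the earliest reference year, which mirrors how faculty with Ph.D.s from before the earliest recorded year are handled in the text. The same calculation restricted to the faculty of one subfield would give that subfield's reference population.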
Results
Using this augmented data, we first quantify the gender, racial, and socioeconomic representation of faculty across computing subfields and provide a quantitative view of the demographic composition at stages prior to becoming faculty. We then ask if computing departments’ choices of which subfields to hire in are predictive of overall departmental gender diversity. Then, we measure differences in subfield representation across the hierarchy of institutional prestige and quantify how subfield prestige covaries with subfield gender diversity. Finally, we use trends in subfield diversification and growth over time to forecast the future gender diversity of the field as a whole.
Gender, race, and socioeconomic status. We find wide differences in gender composition across the eight computing subfields (see Figure 1a and the table here; χ² = 20.65, N = 4421, p < 0.01), with the share of women ranging from theory of computer science (13.1%) and programming languages (14.2%) to interdisciplinary computing (19.7%) and HCI (20.0%). No computing subfield is close to being representative of gender composition in the U.S. reference population (51.1% women). However, the proportions of women faculty in both interdisciplinary computing and HCI modestly exceed the proportion we would expect based on the time-adjusted share of women Ph.D. recipients (19.0%). This subfield-level heterogeneity suggests that gender diversity problems are not monolithic, and some subfields may address them more successfully than others.
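For readers unfamiliar with the test statistic reported above, the following sketch computes Pearson's chi-squared statistic for a contingency table of subfield by gender. The counts are hypothetical and cover only two made-up subfields; the exact procedure used for the reported statistic, including how fractional subfield memberships were handled, is not described here.

```python
def chi_squared(table):
    """Pearson's chi-squared statistic and degrees of freedom
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical (women, men) counts for two made-up subfields.
toy_table = [[20, 80], [10, 90]]
statistic, dof = chi_squared(toy_table)
```

A larger statistic relative to the degrees of freedom indicates that gender composition differs across rows by more than chance alone would suggest. Converting the statistic to a p-value requires the chi-squared distribution, which is omitted from this sketch.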
Figure 1. Gender and racial representation among faculty by computing subfield.
Table. Number of tenured or tenure-track faculty and corresponding gender compositions for eight computing subfields, along with gender compositions of two reference populations, the population of computer science Ph.D’s,29 and the U.S. population, each adjusted for changes over time over the years that faculty were trained.
In contrast, we do not find significant differences in racial composition across subfields (see Figure 1a; χ² = 30.88, N = 547, p = 0.67). Rather, across all subfields, we find that some racial groups are systematically underrepresented among faculty, while others are overrepresented. To better elucidate these differences across groups, we decompose the professional pathway to becoming faculty into two stages.
The first stage spans all steps up to and including obtaining a Ph.D. Hence, by comparing the proportions of different racial groups in the reference U.S. population to those in the reference population of recipients of computing Ph.D’s, we may quantify the relative rates of racial enrichment or depletion over this stage. Over this first stage, we find that White and Asian representation is enriched by factors of 1.1 and 5.0, respectively, while Black, Hispanic, and Native representation is depleted by factors of 4.5, 4.8, and 4.0 (Figure 1b). For comparison, women’s representation at this stage is depleted by a factor of 2.7.
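The enrichment and depletion factors can be read as ratios of a group's share after a stage to its share before it. The sketch below applies that reading to the two figures for women reported earlier (51.1% of the U.S. reference population and 19.0% of time-adjusted Ph.D. recipients); the function name and the convention of always reporting a factor of at least one are assumptions made for illustration.

```python
def representation_factor(share_before, share_after):
    """Return the direction of change and a factor >= 1 describing
    how much a group's share grew or shrank across a stage."""
    if share_after >= share_before:
        return "enriched", share_after / share_before
    return "depleted", share_before / share_after

# Women: 51.1% of the U.S. reference population,
# 19.0% of time-adjusted computing Ph.D. recipients.
direction, factor = representation_factor(0.511, 0.190)
```

With these inputs the factor is about 2.7, which agrees with the depletion factor for women quoted above.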
The second stage spans all steps between obtaining a Ph.D. and becoming faculty in a computing department. By comparing the racial proportions of a Ph.D.-recipient reference population with those of our faculty population, we can quantify the racialized rates of progression into the faculty workforce. Over this second stage, we find that White representation is depleted relative to the Ph.D. recipients, perhaps because White Ph.D’s are less likely to remain in academia (for example, choosing positions in industry) or because they are less likely to receive and accept a faculty position. The enrichment of White representation in the first stage of the pathway to becoming faculty is largely compensated by their depletion in the second stage, so that White representation among computing faculty is very close to expected levels, given the overall U.S. population. Conversely, Asian representation is enriched in the first and second stages, leading to a substantial overrepresentation of Asian faculty in computing relative to the U.S. reference population. Black, Hispanic, and Native representation sees no significant enrichment or depletion in the second stage.
This evidence suggests that the largest systematic source of racial underrepresentation occurs in the first stage of the pathway to becoming faculty, prior to the transition from Ph.D. to faculty. This first stage includes graduate admissions and retention, which are stages known to magnify racial disparities.26,32,40 We note that the data we use in this study is not equipped to determine the causes of the observed population-level patterns, but observing these patterns nevertheless helps to quantify how demographics change along the professional pathway.
Past analysis found that computer science faculty tend to come from highly educated families and are between 14.5 and 28.8 times more likely to have at least one parent with a Ph.D. than the general U.S. population. Faculty in the category of high socioeconomic status are also more likely to hold a position at a prestigious institution: faculty at institutions ranked in the top 20% by U.S. News & World Report are 57.4% more likely to have a parent who holds a Ph.D. than faculty at the least-prestigious institutions.27 In the online appendix, we examine childhood socioeconomic status, as measured by parental educational attainment, and find no significant differences across subfields (χ² = 9.91, N = 570, p = 0.99).
Faculty at the intersection of underrepresented identities are noticeably absent within our faculty sample. Black, Hispanic, and Native men comprise 3.3% of male faculty, while Black, Hispanic, and Native women comprise only 0.2% of female faculty. These small proportions preclude a detailed intersectional analysis. We return to this point in the discussion.
Departments. Most faculty are hired via searches that focus on a particular subfield of computing, such as AI. Subfield choice for a search may be driven by various factors, including practical needs related to the department’s curriculum or strategic goals related to its research ambitions, for instance to build on existing areas of strength or to build up a less-established area. Grouping the faculty in our data by department, we find that subfields have varying representation within computing departments. The largest subfield overall is systems (N = 1486), which also tends to account for the largest share of faculty within a typical department (mean 27%). In contrast, the smallest subfield is programming languages (N = 181; see the accompanying table), which likewise tends to account for the smallest share of faculty within a typical department (mean 3%). While most departments have some representation in each of the eight subfields, there are nevertheless departments that exhibit an unusually high degree of subfield specialization (see Figure 2), particularly when universities form departments dedicated to specific subfields. For instance, Carnegie Mellon University’s Machine Learning department has the highest concentration of faculty in computational learning among all departments with 10 or more faculty in our dataset. Similarly, the University of Washington’s Human Centered Design and Engineering (UWHCDE) department has the highest concentration of faculty studying HCI.
Figure 2. Subfield representation across computing departments.
Some of the most gender-diverse departments heavily specialize in subfields that have more women researchers. For example, the UWHCDE and Rochester Institute of Technology’s Interactive Games and Media School are among those with the highest representation of women in our dataset and are also the most specialized in HCI and interdisciplinary computing—the two subfields with the highest proportion of women faculty (see the accompanying table). These examples highlight a connection between a department’s particular subfield hiring strategy and the observed gender compositions of their faculty. In the online supplementary material, we show that department subfield compositions can meaningfully improve predictions of their gender compositions.
Prestige. Computing subfields correlate with prestige. On the high end, faculty in programming languages are 5.9 times more likely to be found in the most-prestigious departments than in the least-prestigious departments, while on the low end, software engineering faculty are only 2.3 times more concentrated at high-prestige departments (see Figure 3a). In fact, subfield prestige correlates with subfield gender representation (see Figure 3b), such that more male-dominated subfields tend to have a greater share of their researchers located at higher-prestige departments. In other words, men tend to be overrepresented within more prestigious subfields, and women are more likely to be working in less-prestigious subfields.
Figure 3. Analysis of subfield faculty prestige.
Reflecting this pattern, we find strong correlations between a subfield’s fraction of women faculty and both the average departmental prestige of faculty working in that subfield (Pearson’s R = –0.95, p = 0.0003) and the fraction of its faculty in the top 50 ranked institutions (Pearson’s R = –0.88, p = 0.004, see Figure 3b). Even after adjusting for a professor’s Ph.D.-granting institution’s prestige, publication productivity, and gender, a multiple linear regression in the online appendix shows that faculty who study more male-heavy topics are still more likely to hold positions at higher-prestige departments (Pearson’s R = –0.82, p = 0.01), such that faculty fully specialized in the most prestigious subfield (programming languages) are expected to be located 12 ranks higher than faculty fully specialized in the least prestigious subfield (HCI). Because the most prestigious institutions train the majority of future faculty,8,38 the underrepresentation of gender-diverse subfields among these institutions may act as a structural barrier to the gender diversity of computing as a whole.
We note that in this model, the coefficient associated with faculty gender is not statistically distinguishable from 0 (p = 0.13). This fact suggests that both women and men in subfields with more women are expected to hold faculty positions lower in the prestige hierarchy. See the online supplementary material for more details on these regression findings, including regression tables.
Trends. Subfield size and demographics have changed substantially over the past 40 years. We can estimate the temporal dynamics of these variables by assigning current computing faculty to cohorts, according to the year they received their Ph.D., and then tracking how demographic and subfield distributions change over cohorts. Many faculty do not start their first faculty position until several years after completing their Ph.D., a pattern which would induce systematic undersampling of the most recent cohorts. To control for this effect, we report size and demographic estimates only up to the 2012 cohort. We then forecast subfield sizes and demographics 15 years into the future by extrapolating the historic trends in subfield faculty hiring over time and the yearly gender compositions of new hires.
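One simple way to extrapolate a historical trend is an ordinary least-squares line fitted to yearly values and evaluated at a future year. The sketch below is a simplified stand-in, not the forecasting procedure used for the reported confidence intervals, and the yearly shares are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def forecast(xs, ys, target_year):
    """Extrapolate the fitted line to target_year."""
    slope, intercept = fit_line(xs, ys)
    return slope * target_year + intercept

# Hypothetical percentages of women faculty in one subfield.
years = [1992, 2002, 2012]
shares = [12.0, 14.0, 16.0]

slope, _ = fit_line(years, shares)
share_2027 = forecast(years, shares, 2027)  # 15 years past the 2012 cohort
```

A point forecast like this says nothing about uncertainty; interval forecasts would additionally need a model of the year-to-year variation around the trend.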
Analyzing this data, we find that subfields’ relative sizes have remained relatively stable over time (see Figure 4a), even as the field has grown substantially in absolute terms. Between 1990 and 2012, the largest increase in relative size is in HCI (+1.5%), and the largest decline is in theory of computer science (−2.0%). Despite enormous parallel changes in the field of computing since the year 2000, trends in relative subfield size appear largely stable over the past 20 years (Figure 4a).
Figure 4. (a) Yearly subfield size relative to computing for 1990 to 2012 and (b) cumulative fraction of faculty who are women from 1990 to 2012 for each of the eight subfields, along with 95% confidence-interval forecasts, projected out to 2027, showing that we may expect the bimodal distribution of gender across subfields to continue into the foreseeable future.
All computing subfields have increased their representation of women faculty over time, though at varying rates, and some subfields are substantially closer to parity than others (see Figure 4b). Women’s representation in each annual faculty cohort has increased an average of 0.43% per year. However, because of the generational nature of the field’s composition, these annual increases in each additional cohort’s gender diversity accumulate to a more modest field-level average increase of 0.2% per year.16 These rates of change are in close agreement with past estimates for computing.18,38 Between 1990 and 2012, programming languages and theory of computer science have increased women’s representation by the lowest relative amounts—3.4% and 3.7%, respectively—while interdisciplinary computing and HCI have increased by 5.9% and 6.5%, respectively. While increases in women’s subfield representation have been relatively steady over time, the slow pace predicts only two fields—interdisciplinary computing and HCI—as being likely to reach 25% women faculty by 2027, assuming historical trends continue.
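The gap between the cohort-level and field-level rates can be illustrated with a toy accumulation model. The sketch assumes equal-sized annual cohorts, no departures, and a stock that starts with the first cohort; the starting share and cohort size are hypothetical, and only the 0.43 percentage-point annual cohort increase comes from the text.

```python
def cumulative_share(cohort_shares, cohort_sizes):
    """Share of women in the accumulated faculty stock after each cohort joins."""
    women = 0.0
    total = 0.0
    stock = []
    for share, size in zip(cohort_shares, cohort_sizes):
        women += share * size
        total += size
        stock.append(women / total)
    return stock

n_years = 23  # cohorts for 1990 through 2012
# Hypothetical starting share of 10%, rising 0.43 points per cohort.
cohort_shares = [10.0 + 0.43 * t for t in range(n_years)]
cohort_sizes = [100] * n_years  # assumed equal cohort sizes

stock = cumulative_share(cohort_shares, cohort_sizes)
stock_rate = (stock[-1] - stock[0]) / (n_years - 1)
```

Under these assumptions the accumulated share rises at about 0.215 points per year, half the cohort rate, because each new cohort is averaged with all earlier, less diverse cohorts. That is close to the field-level figure of roughly 0.2% per year, although the real faculty population also includes pre-1990 cohorts and attrition, which this sketch ignores.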
Our data indicates that women’s representation in the four most-diverse subfields (HCI, interdisciplinary computing, software engineering, and computational learning) is about 25 years ahead of women’s representation in the four least-diverse subfields, and this gap is projected to persist over the next 15 years.
Discussion
Using comprehensive data on the education, employment, research subfield, and demographic variables of tenure-track faculty at U.S.-based, Ph.D.-granting computing departments, we quantified the intersection of multiple forms of demographic diversity by computing subfield, producing a detailed picture of past and likely future trends and inequalities.
Although we find little variation in racial representation across subfields, our analysis reveals several interesting patterns about the pathway to becoming faculty by comparing racial representation among current faculty to that of computing Ph.D’s and the U.S. population as a whole (Figure 1b). These comparisons divide the faculty pathway into two stages. The first spans all steps prior to obtaining a Ph.D. in computing, including doctoral training, and the second spans all steps between obtaining a Ph.D. and becoming faculty in a computing department. Over the first stage, White and Asian representation is enriched, while Black, Hispanic, and Native representation is depleted. Over the second stage, White representation is depleted, while Asian representation is further enriched, and Black, Hispanic, and Native representation remains low without substantial change.
These patterns are consistent with racialized factors influencing retention at multiple points in the pathway to becoming faculty in computing, and the direction and magnitude of that influence are not necessarily uniform across stages. For instance, the up-and-then-down pattern of White representation indicates a substantial decrease in White retention after the Ph.D. One plausible explanation for this decrease is the availability of attractive non-academic careers for computing Ph.D’s, for example in the computing industry. White Ph.D’s may also be more likely to pursue tenure-track jobs at non-Ph.D.-granting institutions, which are not included in our data. The down-and-then-stable pattern of Black, Hispanic, and Native representation indicates that a large portion of the systemic effects occur prior to receiving a Ph.D. in computing, for example in graduate school and college, a pattern that is well-documented by studies of race in academia.24 In contrast, Asian representation in the second stage (post-Ph.D.) increases by about the same amount that White representation decreases, suggesting additional racialized differences in achieving a faculty career after a Ph.D.
Patterns underlying the underrepresentation of women computing faculty are similar to patterns of Black, Hispanic and Native underrepresentation, where the largest share of depletion occurs prior to receiving a Ph.D. in computing (Figure 1). In contrast to racial diversity, we find that gender diversity varies substantially across subfields, even as overall gender diversity also remains low (16.7%, see Table), and is increasing at only about 0.2% per year. The subfields of HCI, interdisciplinary computing, computational learning (which includes AI), and software engineering exhibit substantially greater gender diversity among faculty (19.0%). At the current rate of gender diversification, this level of gender diversity places them roughly 25 years ahead of the remaining four subfields (14.2%).
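The "roughly 25 years" figure can be checked with a back-of-the-envelope calculation: divide the gap between the two groups' shares of women by the annual rate of change. The sketch assumes both groups diversify at the field-level rate of about 0.2 percentage points per year; the authors' own calculation may differ in detail.

```python
def years_ahead(share_leading, share_trailing, rate_per_year):
    """Years the trailing group needs to reach the leading group's
    current share at a constant rate of change."""
    return (share_leading - share_trailing) / rate_per_year

# 19.0% vs. 14.2% women faculty, at about 0.2 percentage points per year.
gap_years = years_ahead(19.0, 14.2, 0.2)
```

This gives about 24 years, in line with the approximate 25-year lag described in the text.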
Past work has developed a number of interacting explanations for gender, racial, and intersectional underrepresentation among U.S. computing faculty, including culturally pervasive gendered and racialized stereotypes, which may shape career decisions;1,22 inhospitable educational and professional climates;15,33,34 structural disparities in education and socioeconomic status;23,27 and the unequal impact of parenthood.7 Our results do not identify any specific underlying mechanisms for differential representation and instead quantify patterns in ways that support further research in this direction. Our results suggest that more work is needed to understand how interactions between industry and academia shape the demographic diversity of computing faculty. These interactions are likely important both early in the faculty pathway and later, for example where gendered or racialized hiring rates of senior faculty into industry can effectively increase the demographic diversity in academia.19
The four most gender-diverse subfields represent fully half (50.4%, see the accompanying table) of all computing faculty. They are also substantially underrepresented among high-prestige departments (see Figure 3), which exert substantial influence over field-level norms, culture, and research agendas due to their status and their role in training the majority of computing faculty.8,28 This difference holds even after controlling for factors such as doctoral institution prestige, productivity, and gender itself, such that faculty working in more gender-diverse subfields work at institutions, on average, 12 ranks lower than faculty working in less gender-diverse subfields. This gender-prestige pattern illustrates a kind of systemic devaluing of women’s contributions to computing overall, and the substantial size of the more diverse but less prestigious group of subfields raises the question of whether they are adequately represented among departmental curricula and degree requirements. Realigning institutional practices to reflect the true diversity of computing’s subfields may help institutionalize efforts to broaden participation.
Our retrospective analysis of subfield growth and gender shows that gender diversity is increasing at similar rates across all eight computing subfields. However, current gender diversity is essentially bimodal, with four of the eight subfields (HCI, interdisciplinary computing, software engineering, and computational learning) being substantially more gender diverse than the other four (systems, numerical and scientific computing, programming languages, and theory of computer science). Our forecasting exercise indicates that these differences are likely to continue into the foreseeable future (Figure 4), even as some of the less gender-diverse subfields appear to be shrinking (theory of computer science) while some of the more gender-diverse subfields are growing (computational learning and HCI). As a result, the overall trend of slow gender diversification is highly robust to minor changes in hiring patterns among subfields. Our methodology for analyzing demographic patterns and trends among computing subfields is general and could be applied to any other academic field, given an appropriate subfield taxonomy. Applied to many fields, this approach could elucidate the systemic role that subfields play in driving field-level demographic patterns and help identify new insights into field-specific systematic barriers to broadening participation.
There are limitations to our methodology. Although DBLP provides good general coverage of computing publications, our analysis inherits DBLP’s publication inclusion bias over areas of computing, which is largest in older and in more interdisciplinary areas of research.39 We also use the year in which a faculty member received their Ph.D. to estimate the relative sizes and gender compositions of subfields over time. This assignment assumes that faculty in our sample started their faculty jobs immediately after their Ph.D.s and that they represent faculty who left jobs prior to 2010, the first year observed in our data. As a result, we are likely underestimating the historic participation of women in computing (Figure 4b), because women faculty have historically left their positions at higher rates than men.7 This historical underestimation would imply that our estimates of gender-diversification rates are likely upper bounds. Our data are limited to tenure-track faculty employed by Ph.D.-granting institutions and do not support an analysis of contingent faculty, who make up a growing share of faculty, or faculty at non-Ph.D.-granting institutions, who may exhibit different demographic compositions. We do not separately analyze faculty who hold multiple minority identities. Past research shows that people at the intersection of multiple identities often experience discrimination and exclusion beyond what would be expected from simply adding the individual elements of their identities.9 The small sample of faculty for whom we have race data limits our ability to conduct a detailed quantitative analysis of the least-represented groups, in particular, Black, Hispanic, and Native faculty, or to conduct intersectional analyses.
We now return to the idea that explanations for slow rates of diversification in computing can be divided into two categories. On one hand, generational problems introduce a lag in faculty diversity, where, if the pathway to faculty positions were to suddenly become equitable, it would still take many years for this change to manifest as equitable representation among faculty.16 On the other hand, there are structural and social climate problems that tend to push or pull members of underrepresented groups away from faculty positions, sometimes in different magnitudes and directions depending on the career stage.1
Our findings identify and quantify a third type of explanation, where the diversity of computing is driven by diversity differences across its subfields. The computing community must explore several questions before these findings can be translated into concrete policy recommendations. For example, the differences in diversity and prestige that we find across the subfield structure of computing suggest a simple departmental strategy for enhancing the probability of hiring women faculty: increase hiring in the subfields with greater gender diversity, such as HCI and interdisciplinary computing (20% women). While this strategy may be an effective way to increase women’s representation for computing as a whole, it is unlikely to reduce the heterogeneity in gender diversity across subfields.
Future research could help shape how we design policy to increase diversity in computing, by identifying the causal mechanisms driving gender differences across subfields. On one hand, some subfields may be particularly inhospitable to women, effectively pushing them away. In this case, policy should aim to make these subfields more accessible and inclusive. On the other hand, women may, on average, be more interested in topics belonging to some subfields over others—that is, some subfields exert stronger pulls.21 In this case, policy should respect the validity of women’s interests by expanding the subfields that have greater pulls instead of pushing to increase representation where there is less interest.
An additional causal understanding of the relationship between subfield gender diversity and subfield prestige would provide further context for policy recommendations. The tendency for male-dominated areas of work to be assigned greater prestige, and hence for areas of work with greater gender diversity to be less valued, is not a phenomenon special to computing. Gendered patterns are also observed in medical subspecialties, in different areas of law,4,11 and even in less-specialized positions.3 One explanation of this pattern posits a direct causal relationship between an occupation’s diversity and its prestige.13 If this explanation applies to computing, then it may not be feasible to simultaneously increase both a subfield’s prestige and its gender diversity without making foundational changes to collective values and beliefs. This relationship remains untested in computing but is an important question for diversity because the departments at the top of the prestige hierarchy tend to train the majority of future computing faculty.8
A subfield-focused hiring strategy alone is unlikely to increase racial or socioeconomic diversity, as we find that these faculty characteristics do not appear to correlate with subfield in our sample. Different approaches will be needed to improve representation along these dimensions, and our findings suggest these should include interventions that increase representation among Ph.D. recipients. Some programs are available as models for future work in this direction, including the Distributed Research Experiences for Undergraduates (DREU) and the Collaborative Research Experiences for Undergraduates (CREU), two funded research programs intended to broaden participation in computing, with participants twice as likely to attend graduate school as standard REU participants.35 Academic institutions are also turning to the University of Maryland, Baltimore County’s Meyerhoff Scholars Program as a model for their own scholarship programs, which have been shown to markedly improve undergraduate retention and STEM graduate-school matriculation for underrepresented minorities.12 Doctoral programs can additionally establish partnerships with minority-serving institutions (MSIs) as modeled by the highly successful Fisk-Vanderbilt Masters-to-Ph.D. Bridge Program, which substantially contributes to the number of Ph.D.s earned by underrepresented minorities in a number of STEM fields, but has yet to expand to computing.32 These are a few examples of programs that can be implemented or expanded to additional academic institutions to increase accessibility for underrepresented groups, in conjunction with other efforts to mitigate the social climate problems in computing.26
For computing departments to benefit from the innovative scientific research that diverse scientists produce,17 diversity and inclusion efforts must contend with generational, social climate, and subfield problems. For example, structural improvements to recruitment, such as those suggested here, are by themselves no guarantee that diverse faculty will be adequately included and supported once they begin their faculty jobs.34,36 Cultural change can also be slow and does not guarantee diverse representation among faculty. The empirical patterns and trends shown in this article provide new insights that can inform and support multifaceted efforts to make computing more diverse, equitable, and inclusive.
Acknowledgments
The authors thank Bor-Yuh Evan Chang, Leysia Palen, Ben Shapiro, Huck Bennett, Joshua Grochow, and Jed Brown for helpful comments, and all survey participants for providing their valuable time. Funding: This work was supported in part by National Science Foundation Award SMA 1633791 and an Air Force Office of Scientific Research Award FA9550-19-1-0329. Competing interests: None.
Figure. Watch the authors discuss this work in the exclusive Communications video. https://cacm.acm.org/videos/subfield-prestige