Teaching Data Science Research Methods to Human Resources Practitioners

Due to the increasing importance of data science in human resources management (as expressed by the emergence of new research topics such as people analytics), the Department of Labor Studies at Tel Aviv University decided to integrate the topic of data science research methods into its graduate-level course “Research Methods for Human Resources”. Until this decision, the course, which is taken by human resources practitioners (managers and recruiters) from various organizations in all three sectors of Israeli industry (public, private and non-profit), focused mainly on survey design, statistical investigations and qualitative research methods. The decision created an opportunity to revisit the course structure and content, and to add new content: data science research methods.

This broader content makes it possible to distinguish between “from theory to practice” research methods (e.g., statistical tests and supervised learning algorithms, such as KNN) and “from practice to theory” research methods (e.g., qualitative methods in general and grounded theory in particular, and unsupervised learning algorithms, such as K-means). Such discussions highlight the fact that data science applies both research paradigms, through supervised learning and unsupervised learning, respectively. Furthermore, we discussed interrelationships between research paradigms and kinds of data (quantitative and qualitative), asking questions such as: Which research methods are suitable for the analysis of qualitative data, and which are suitable for the analysis of quantitative data?
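The contrast between the two paradigms can be made concrete with a minimal sketch. The code below is an illustration only, not course material: the employee data, the labels, and the function names knn_predict and kmeans are all made up for this example. KNN starts from given labels (supervised), while K-means receives no labels and proposes groups on its own (unsupervised).

```python
from collections import Counter
from math import dist

# Made-up illustrative data: (years in organization, satisfaction score 1-10).
employees = [(1, 3), (2, 2), (1, 4), (8, 8), (9, 7), (10, 9)]
# Hypothetical outcomes; labels exist only in the supervised setting.
labels = ["left", "left", "left", "stayed", "stayed", "stayed"]


def knn_predict(points, point_labels, query, k=3):
    """Supervised: return the majority label among the k nearest labeled points."""
    nearest = sorted(range(len(points)), key=lambda i: dist(points[i], query))[:k]
    return Counter(point_labels[i] for i in nearest).most_common(1)[0][0]


def kmeans(points, centers, iterations=10):
    """Unsupervised: group points around centers without using any labels."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            closest = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            clusters[closest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return clusters, centers


# Supervised: classify a new employee using the known outcomes of similar employees.
prediction = knn_predict(employees, labels, query=(2, 3))

# Unsupervised: look for two groups; the initial centers are an arbitrary choice.
clusters, centers = kmeans(employees, centers=[employees[0], employees[-1]])

print(prediction)
print(clusters)
```

In the KNN call, the meaning of the categories ("left", "stayed") is given in advance; in the K-means call, the resulting groups still have to be interpreted by someone who knows the application domain.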

The challenge was to teach data science concepts to students who, on the one hand, have gaps in their computer science background and, on the other hand, are experts in human resources management and practice.

Figure 1 reflects this situation, indicating that computer science is a challenge (red) and the application domain is an advantage (green). In this picture, mathematics and statistics are neutral (yellow), since the required mathematics and statistics knowledge was partially addressed in the part of the course that focuses on statistics.

Figure 1. Students’ knowledge in the three components of data science:
Computer science – a challenge,
mathematics and statistics – neutral,
application domain – an advantage.

In this blog we present the course and explain the importance of teaching data science to human resources practitioners. In the next two blogs, we illustrate how we taught it to this cohort of students with this mix of knowledge levels, and how we coped with this challenge while taking advantage of the interdisciplinarity of data science, as an integration of mathematics and statistics, computer science, and the application domain, in our case – human resources management.

The course objective: The course aimed to expose human resources practitioners to a variety of research methods as well as to their possible applications in their professional life. Thus, for example, with respect to data science, we aspired to teach the human resources practitioners a language that would help them communicate with their scientist and engineer peers and enable them to take an active role in conversations on the design of human resources tools that are based on data science. Such a discourse can improve both the organization’s performance and the position of human resources practitioners in the organization.

The course structure: The course was a year-long course, consisting of 2 weekly hours in the first semester and 3 weekly hours in the second semester (or more accurately, 6 weekly hours for half of the second semester). This format led us to integrate most of the active learning part of the course into the second semester. The first semester was divided into three parts: statistical research methods, qualitative research methods, and data science research methods.

The course requirements: The final project of the course was a relatively large-scale research project on a topic that the students chose according to the needs of their organization. The project was conducted in teams and was developed gradually through several smaller tasks, in each of which the students designed one research tool used to investigate their research topic. While the design of the research tools took place in the first semester, data collection and data analysis were planned for the second semester.

The course participants: Sixty students, holding a variety of human resources-related positions, were enrolled in the course. Only 10 percent of the students were men, reflecting the female-dominated demographics of human resources practitioners in Israel. Most of the students had no previous programming experience (specifically, in Python), and two thirds of them had either never heard of data science or had heard of it but had no practical experience with it. Thirty percent of the students had been employed in their organization for less than a year, 30% for between one and five years, and the rest for over five years. One third of the students came from small organizations (fewer than 50 employees), about one third came from large organizations with more than 1,000 employees, and the rest came from organizations with 50-1,000 employees (30% from organizations with 51-200 employees and 10% from organizations with 200-1,000 employees).

To meet the course objectives, we present the following list of topics that illustrate the relevance of data science to human resources management. Due to time restrictions, some of them were only mentioned briefly in the course, for interested students who wish to explore them further.

1. What is data science? The interdisciplinarity of data science, data science as a research method, the importance of the application domain and the context in which data science is used, and the profession of human resources from the perspective of data science.
2. The data science professions – data scientists, data engineers and data analysts: What does each professional do? Differences between these professions, and the kinds of positions that require data science knowledge.
3. Skills needed by data scientists, e.g., thinking on different levels of abstraction, storytelling (see Rakedzon and Hazzan, 2022), critical thinking, and teamwork.
4. Cognitive biases in data science, the importance of the application domain for overcoming these biases, and the importance of being aware of their existence for the sake of interpretability. For a comprehensive review of possible cognitive biases in data science, see Kliegr, Bahník and Fürnkranz (2021). For specific implications of the base rate neglect cognitive bias, see Mike and Hazzan (2022), and for those of the domain neglect cognitive bias, see Mike and Hazzan (in press).
5. Computational thinking: What is an algorithm? Algorithm performance metrics and basic data science algorithms. Specifically, the focus is placed on the KNN algorithm as a simple machine learning algorithm for teaching core principles of machine learning (Hazzan and Mike, 2022).
6. Ethics in data science, in general, and in data science applications in human resources management, in particular.
7. Data culture and data-driven organizations (see Díaz, Rowshankish, and Saleh, 2018).
8. Human resources topics for exploration with data science tools: People analytics and organizational network analysis.
9. The importance of data science for the professional development of human resources practitioners, e.g., by improving the talent acquisition process of their organization and, consequently, their potential contribution to their organization.
10. Data science in a changing world: Industrial revolutions, Industry 4.0 and IoT.
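As an illustration of the computational thinking topic above (KNN and algorithm performance metrics), here is a minimal sketch of how accuracy, one common performance metric, can be computed for a KNN classifier. Everything in it is an assumption made for illustration: the hiring data is invented, one training example is deliberately "noisy", and the function names knn_predict and accuracy are hypothetical rather than taken from the course.

```python
from collections import Counter
from math import dist

# Made-up labeled examples: (years of experience, interview score) -> hiring outcome.
train = [
    ((1, 4), "no"), ((2, 5), "no"), ((3, 3), "no"), ((2, 2), "no"),
    ((6, 8), "yes"), ((7, 7), "yes"), ((8, 9), "yes"), ((5, 9), "yes"),
    ((7, 9), "no"),  # a deliberately noisy example inside the "yes" group
]
# Held-out examples that the algorithm does not see when making predictions.
test = [((2, 4), "no"), ((6, 7), "yes"), ((4, 5), "no"), ((7, 10), "yes")]


def knn_predict(examples, query, k):
    """Return the majority label among the k training examples nearest to query."""
    nearest = sorted(examples, key=lambda ex: dist(ex[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(examples, held_out, k):
    """Fraction of held-out examples whose predicted label matches the true label."""
    correct = sum(knn_predict(examples, x, k) == y for x, y in held_out)
    return correct / len(held_out)


# Compare a few choices of k on the same held-out examples.
results = {k: accuracy(train, test, k) for k in (1, 3, 5)}
for k, score in results.items():
    print(f"k={k}: accuracy={score:.2f}")
```

On this toy data, k=1 is misled by the single noisy example, while k=3 and k=5 are not. Such a small example cannot support general conclusions about the choice of k; it only shows that the choice matters and that a metric is needed to compare the options.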

To illustrate the relevance of data science to the professional development of the students (Topic #9 above), we examine data science as a research method in human resources management on two levels – personal and organizational:

Personal level: For example, in our next blog, we will focus on students’ programming anxiety, which is related to the computer science component of data science. For this purpose, we use students’ own data, which we collected via questionnaires and shared with the students anonymously.
Organizational level: On this level, we build on students’ expertise in human resources (the application domain component of data science). For example, in the blog after next, we will describe how, based on students’ suggestions for relevant research topics for their own organizations, we facilitated a class discussion about the KNN algorithm.

With this background in mind, in the next two blogs, we describe course activities we conducted that helped us cope with the students’ mixed levels of knowledge of the different components of data science.

References

Hazzan, O. and Mike, K. (2022). Teaching core principles of machine learning with a simple machine learning algorithm: The case of the KNN algorithm in a high school introduction to data science course, ACM Inroads 13(1), pp. 18-25. https://dl.acm.org/doi/10.1145/3514217

Kliegr, T., Bahník, Š. and Fürnkranz, J. (2021). A review of possible effects of cognitive biases on interpretation of rule-based machine learning models, Artificial Intelligence, Volume 295. https://doi.org/10.1016/j.artint.2021.103458

Mike, K. and Hazzan, O. (in press). What is common to transportation and health in machine learning education? The Domain Neglect Bias, IEEE Transactions on Education.

Orit Hazzan is a professor at the Technion’s Department of Education in Science and Technology. Her research focuses on computer science, software engineering, and data science education. For additional details, see https://orithazzan.net.technion.ac.il/. Dafna Gelbgiser is a lecturer (tenure track) at the Department of Labor Studies at Tel Aviv University’s Faculty of Social Sciences. Her research examines the sources and patterns of inequality in education and labor market outcomes by race, immigrant status, gender, and social class background. For additional details, see https://english.tau.ac.il/profile/dgelbgiser.