Teaching Data Science Research Methods to Human Resources Practitioners: Part Three


In our previous blog, we described how we used the interdisciplinarity of data science in a graduate course on research methods that was offered by the Department of Labor Studies at Tel Aviv University. Specifically, we shared how we handled the students' expression of programming anxiety. In this blog, we illustrate how we took advantage of the students' expertise in the application domain component of data science, which in our case was human resources and management.

In the questionnaire distributed at the onset of the first semester, we asked the students, among other things, to describe three challenges they were facing in their work at the time, for which they would be interested in learning new research tools. The students suggested challenges related to both the organization and the individual.

Examples of challenges suggested by the students that focus on the organization include:

  • The role of human resources: Perception of the role of human resources in the organization; The role of the recruiter as a key to success and the changes this role underwent in recent years; Recruiting managers to cooperate with human resources processes in the organization; The positioning of human resources in the organization.
  • Organizational culture: Home-work balance; Reinforcing the organization's image in the eyes of its employees; Managing "crises" between managers and employees; Employer branding for small companies; Employee connectivity; Streamlining of processes; Change management.
  • Work mode: Remote, Hybrid, Work from home (yes/no).

Examples of challenges suggested by the students that focus on the individual include:

  • Employee evaluation: Employee evaluation (3); Feedback meetings, employee development; Dealing with the wave of silent resignations; Tracking behavior; Employee maintenance and retention; Team performance evaluation methods; Improving employees' achievement of KPI.
  • Employee and manager development: Employee development program; Manager development program; Bettering of human capital in an old and corrupt public organization; Creating mechanisms for performance-based promotion of employees (as opposed to seniority-based promotion); Team development; Employee development (2); Capability enhancement, retention and improvement of employee experience; Salary and benefits; Effective employee remuneration methods; Integration of new occupations into the traditional work world; Management consultation.
  • Workforce diversity: Gender inequality in recruitment; Workforce diversity; Diversity in organizations and its effect on the organization's success; Percent of women in senior positions in the economy.
  • Recruitment: Recruitment for primary positions; Employee recruitment; Tailoring the screening process to the candidate; Recruitment challenges; Employee recruitment and retention; Employee retention (4); Employee screening and recruitment (2).

Not surprisingly, many of the issues the students mentioned addressed employee perseverance in the company from different perspectives and, specifically, the need to develop a model for the prediction of a candidate's suitability for the organization.

With respect to this research topic, the students were first asked, in a class activity, to define a research problem, research objectives, research questions, and data collection tools. We present two examples of research frameworks that students developed. As can be seen, the students expressed their expertise in human resources by explicitly describing the need to develop a tool for such a prediction, as well as the challenges and costs associated with recruiting candidates who are not suitable for the organization.

Example 1:

  • Research problem: We know of no analytic tool currently available that provides information on the suitability for the organization of newly hired employees.
  • Research objective: To design an ML tool that predicts the suitability of an employee for the organization, with respect to his or her intake process.
  • Research question: Is it possible to predict an employee's suitability for the organization?
  • Data collection tools:
    • Inter-organizational information on every candidate who is hired and found to be suitable for the organization, as well as on those found unsuitable for the organization. This includes the candidates' dry data; dry data on the positions the candidates were hired to fill, such as years of experience, education;
    • Data from the recruiting process, such as grades on interviews and technical exams, suitability for the team, suitability for the manager;
    • Data on the intake process, from both the organization's and the employee's perspectives:
      • From the organization's perspective: learning required to fulfill the role, output, assimilation into the team, relations, etc.;
      • From the employees' perspective: job satisfaction, relations with the manager and team, etc.;
    • Data from feedback conversations held during the intake process: gaps that were revealed, specific areas of responsibility allocated to employees;
    • Data from colleagues who worked with the employee on joint tasks;
    • Transverse data—how long have suitable employees been working in the organization;
    • Data on promotions.

Example 2:

  • Research problem: During the screening process, we hire candidates for certain positions and reject others. We can evaluate the success or failure of those we have hired, but we have no indication regarding those we rejected and whether they would have succeeded in the position to the same or greater extent.
  • Research objective: Create a tool to minimize the risk of rejecting worthy candidates.
  • Research question: How can we avoid rejecting worthy candidates during the screening process, and how can we do so in an effective and rapid manner?
  • Data collection tools: Approach candidates who have been rejected and ask them to complete a questionnaire that inquires whether they have been hired elsewhere and how successful they are in their position there. Data that exists in the company will be used in parallel to estimate the success of the candidates who were hired by the company.

Developing such a prediction tool calls for predictive modeling techniques from data science. We decided to introduce the k-nearest neighbors (KNN) algorithm, since it is a simple algorithm that enables the introduction of many data science concepts (Hazzan and Mike, 2022).

Following the introduction of KNN, a small database was constructed in class, whose purpose was to predict the suitability for the organization of individual employees, as well as the probability of their perseverance in the company. To enable the students to work with KNN in a 2D plane, only two of the proposed features were selected.

In the discussion that took place during the construction of this database, the students expressed their expertise in human resources management by suggesting relevant employee characteristics as well as by addressing details that only they, as experts in human resources, can discern.

For example, the students decided they should specify the kind of organization they are dealing with since different characteristics are relevant for different kinds of organizations. They chose a startup with up to 100 employees that was funded three years ago. Since the two chosen features were the number of monthly work hours and the number of absence days (times 9, so that both features are expressed in hours), another working assumption was that the database was constructed to predict employee suitability for a full-time position (182 monthly hours). A lively class discussion ensued on the question of the time period for which the perseverance prediction should be made. It was decided that a 2-year prediction is suitable for this kind of industry and for typical employee preferences.
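The reason for expressing both features in hours can be illustrated with a short sketch. KNN compares employees by their distance in the feature plane, so features measured in different units would be weighted unevenly. The code below is only an illustration under our own assumptions: the function names and the two employees are hypothetical and were not part of the class activity, and Euclidean distance is assumed as the distance measure.

```python
import math

# Conversion factor mentioned in the text: one absence day counts as 9 hours.
HOURS_PER_ABSENCE_DAY = 9


def to_feature_vector(monthly_work_hours, absence_days):
    """Return a 2D point in which both features are expressed in hours."""
    return (float(monthly_work_hours), float(absence_days * HOURS_PER_ABSENCE_DAY))


def distance(a, b):
    """Euclidean distance between two employees in the 2D feature plane (an assumption)."""
    return math.dist(a, b)


# Two hypothetical employees, invented purely for illustration.
employee_a = to_feature_vector(182, 2)
employee_b = to_feature_vector(170, 4)

print(employee_a, employee_b)
print(distance(employee_a, employee_b))
```

Without the conversion, a difference of two absence days would count for far less than a difference of two work hours would suggest is reasonable, since days and hours are not on the same scale.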

Table 1 presents the structure of the database constructed in class. Once the database is constructed, the suitability of a specific employee (or candidate) for the organization can be predicted based on his or her personal data.

Table 1: The database constructed in class for the prediction of an employee's perseverance in the company

| Monthly work hours | Number of absence days | Perseverance in the company | Probability of persevering in the company over the next two years | Working assumptions |
|---|---|---|---|---|
| | | | | Up to 100 employees |
| | | | | Funded three years ago |
| | | | | Full-time position – 182 hours |
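To make the prediction step concrete, here is a minimal KNN sketch over a database with the structure of Table 1. It rests on several assumptions of ours: the rows are hypothetical and invented for illustration (they are not the data collected in class), the value k=3 and the Euclidean distance are arbitrary choices, and the "probability" is simply taken to be the share of the k nearest neighbors who persevered, which is one common reading rather than necessarily the one used in the course.

```python
import math

HOURS_PER_ABSENCE_DAY = 9  # absence days are converted to hours, as in the text

# Hypothetical rows: (monthly work hours, absence days, persevered in the company?)
# Invented for illustration only; not the database built in class.
DATABASE = [
    (182, 1, True),
    (190, 0, True),
    (175, 2, True),
    (150, 6, False),
    (140, 8, False),
    (160, 5, False),
]


def features(hours, absence_days):
    """Map an employee to a point in the 2D plane, both axes in hours."""
    return (hours, absence_days * HOURS_PER_ABSENCE_DAY)


def knn_predict(hours, absence_days, k=3):
    """Return (predicted perseverance, share of the k nearest neighbors who persevered)."""
    query = features(hours, absence_days)
    # Sort the known employees by their distance to the queried employee.
    ranked = sorted(
        DATABASE,
        key=lambda row: math.dist(query, features(row[0], row[1])),
    )
    neighbors = ranked[:k]
    share = sum(1 for row in neighbors if row[2]) / k
    # Majority vote among the k nearest neighbors.
    return share > 0.5, share


label, share = knn_predict(168, 3)
print(label, share)
```

An odd k avoids ties in the majority vote. With so few rows, the estimated share can only take a handful of values, so it should be read as a rough indication rather than a calibrated probability.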

The construction of such a database clearly requires an understanding of the application domain, which in our case was human resources. Indeed, this fact was highlighted in class to further emphasize two messages:

  1. The students' expertise in human resources is important in order to cope with any problem related to human resources encountered by their organization;
  2. It is crucial that the students have a common language with their scientist and engineer peers in order to discuss, in a meaningful manner, human resources challenges encountered by their organization.


In this blog and the previous one, we reported on activities that we facilitated in a graduate course for human resources practitioners that focused on research methods. The challenge we set out to overcome was the fact that, on the one hand, almost none of the students had any programming experience, while on the other hand, they were experts in human resources management. These activities not only highlight the interdisciplinarity of data science but also illustrate the importance of the application domain in data science research. In such cases, we argue, situated learning is relevant, that is, learning through goal-directed activity situated in circumstances that are authentic in terms of the intended application of the learnt knowledge (Billett, 1996).

Following the anecdotal observations reported in our last two blogs, we embarked on a comprehensive research project whose objective is to characterize the conceptions and feelings of students in this course with respect to the integration of data science in their graduate degree. Our research will use insights from the sociological and social psychological literature on women in STEM in order to understand the emotional responses that a data science course may ignite among human resources managers. We believe that our observations will contribute to the integration of data science into other application domains in the social sciences as well.



Billett, S. (1996). Situated Learning: Bridging Sociocultural and Cognitive Theorising. Learning and Instruction 6(3), pp. 263-280.

Hazzan, O. and Mike, K. (March 2022). Teaching core principles of machine learning with a simple machine learning algorithm: the case of the KNN algorithm in a high school introduction to data science course. ACM Inroads 13, pp. 18–25.


Orit Hazzan is a professor at the Technion's Department of Education in Science and Technology. Her research focuses on computer science, software engineering, and data science education. For additional details, see

Dafna Gelbgiser is a lecturer (tenure track) at the Department of Labor Studies at Tel Aviv University's Faculty of Social Sciences. Her research examines the sources and patterns of inequality in education and labor market outcomes by race, immigrant status, gender, and social class background. For additional details, see

