Constructivist learning theory says learning is grounded in and constructed from prior understanding and belief. In order to explore the prior understanding and belief of beginning computer science students, the Commonsense Computing Project (http://commonsensecomputing.cs.siue.edu) asks them to answer CS questions on the first day of their first course. Here, we report on the related commonsense knowledge as demonstrated in the area of concurrency. Long considered an advanced topic in the CS curriculum, concurrency is rarely5,16 considered appropriate for introductory students; on the other hand, it is increasingly at the forefront of CS; obvious instances involve multicore processors on the desktop, distributed computing resources for computationally and data-intensive problems, and network-based games. Whether through implicit structures in programming languages or explicit structures in design, university students must demonstrate an understanding of the various approaches to and implications of concurrency.
Given the importance of the topic, what relevant knowledge and abilities do students without prior computing instruction (“beginners”) bring to their first class? Do they differ from more advanced students with no explicit teaching about concurrency?
Along with their commonsense understanding of concurrency, this article explores the solutions they provided on the first day of an introductory CS course to a problem devised by Yi-fat Ben-David Kolikant of Hebrew University of Jerusalem2 to more advanced students on the first day of a concurrency course in an Israeli high school. We’ve used the responses to address two questions:
- Are beginners able to recognize the key concurrency issue regarding use of a shared resource? and
- How do answers given by beginners compare with answers given by more advanced students, as reported by Ben-David Kolikant?
Of the 66 students in our 2006 study, 97%, or 64, could identify a race condition and 71% provided a solution we considered reasonable. The most common technique they reported for avoiding race conditions was subdividing the resources. Our study provides a basis for a constructivist approach to teaching concurrency, allowing instructors to build on these ideas. Moreover, the study was independent of technological assumptions, making it relevant regardless of technology or pedagogical practice used.
Background
We have conducted a number of commonsense computing projects over the past five years, all with the same basic goal of identifying the commonsense knowledge beginners bring to the study of CS. Our foundation was the constructivist theory about how people learn, starting with what they already know and building knowledge on that foundation, rather than receiving it passively from an instructor. Each learner’s background, culture, and previous knowledge define his/her starting point. Bransford et al.4 argued that learning must engage students’ preconceptions to be effective.
The importance of constructivism is recognized in the computing-education community. Ben-Ari1 compared it with other fields, highlighting several differences; for example, in CS education, students need a model of the computer, and the computer then provides an “effective ontological reality,” or verification of whether a program works or not.
Students’ prior understanding and beliefs about a particular topic are also considered preconceptions, as explored by several researchers:
- Miller10 analyzed “natural language” programs by students who previously had not taken a formal programming course, covering the idea of writing computer programs in natural language, and found that a number of standard programming concepts showed up in these natural-language descriptions;
- Onorato and Schvaneveldt11 also looked at natural-language descriptions of a programming task, comparing survey subjects drawn from a variety of student categories: naïve, with no programming experience; beginner, currently taking a first programming course; and expert, with significant programming experience. Along with the differences between experts and novices, they also found differences between the naïves and the beginners, though both groups lacked programming experience;
- Studying the misconceptions of novice programmers, Bonar and Soloway3 focused on preprogramming knowledge, calling it “step-by-step natural language programming knowledge”; they distinguished it from knowledge of the programming language Pascal the students were learning in their introductory course. They found that many of the observed bugs in novice-programmer-written code could be explained by a mismatch in students’ knowledge in these domains; and
- Gibson and O’Kelly8 looked at a variety of search problems (with pre-college students) and Towers-of-Hanoi problems (with beginners), finding both groups showed “algorithmic understanding” of how to solve them.
In our own prior work beginning in 2005, we sought to identify student preconceptions that could be leveraged in teaching beginning computing concepts. To investigate student preconceptions about sorting, we asked 118 beginners to describe in words how they would sort a list of numbers into ascending order.14 A majority (69%) described a coherent algorithm, with many giving versions of selection or insertion sort. However, most treated numbers as strings, manipulating them digit by digit. Many used iteration; to our surprise, most iteration involved post-test loops. In 2008,13 we described our investigation of students’ commonsense knowledge of debugging, giving beginners one of four questions designed to elicit their knowledge of debugging strategies. The questions asked them to describe the advice they would give people whose lights did not turn on when they flipped the switch; how they would locate the moment things go wrong in the children’s game “telephone”; how they would find a Starbucks if they were in a strange city where they did not speak the language; or an experience of their own involving troubleshooting.
In general, we found beginners had less commonsense knowledge of debugging than of sorting, and some of their preexisting knowledge did not serve them well. For example, real-world fixes are often easy to undo, unlike programming changes. Likewise, the real world is nondeterministic in ways CS1 programs generally are not; for example, if your car doesn’t start, should you immediately turn the key a second time?
A substantial body of work in computing and in other scientific disciplines, including physics and mathematics, involves misconceptions, or incorrect concept understandings, that must be replaced with correct ones. Clancy6 provided a survey of this work in CS, and the National Academy’s Committee on Undergraduate Science Education7 gave a more general overview. Smith et al.15 challenged this view in the context of math and science education, arguing that misconceptions are limited mental models that can be built onto, ultimately providing a correct understanding of the concept being studied.
Most relevant to the commonsense project discussed here is Ben-David Kolikant’s 2001 work examining student preconceptions about concurrency.2 She collected data from about 135 Israeli 12th-grade students in six classes at three different schools. The students previously studied computing and at the time of the study were taking an advanced class in concurrent and distributed computing.
At the beginning of the course she gave them a critical-section problem in which multiple agents share a common resource—two ticket offices selling tickets for the same movie. As shown in the figure, the question was posed in a detailed pseudocode format. It expects a relatively sophisticated answer, including a hardware system specification for supporting multiple machines servicing sales requests, pseudocode for the solution, and an explanation of the answer and how it avoids duplicate ticket sales. Student answers were graded as part of the course.
Ben-David Kolikant divided the responses into two major categories: centralized, involving a solution in which communication and control for the solution are centralized; and decentralized, implicitly or explicitly involving the sellers communicating with one another to achieve a concurrent solution.
The centralized solutions involved three subcategories: (C1) with a central entity, essentially a master computer making all decisions; (C2) in which the solutions involved an assumption either of a constant rate of operations or of operations happening in a particular order; and (C3) in which solutions assume the sellers have private resources, each selling tickets for a separate area of the theater. Ben-David Kolikant described categories C2 and C3 as solutions that “attempted to solve a similar, but different problem.”2 She divided decentralized solutions into (D1) in which communication is implicit and (D2) in which communication is explicit.
Ben-David Kolikant’s discussion focused on three aspects that together contribute to the solutions she evaluated, describing them as “(a) the algorithmic goal of the system, (b) synchronization goals, and (c) reason-ableness.”2
The algorithmic goal was the problem the system was intended to solve, in this case, selling the ticket for the best available seat. The cinema-ticket problem was well constrained, with student responses all sharing this goal.
The synchronization goals of the cinema-ticket problem were “to coordinate…access to a common resource (the database) in order to avoid selling the same ticket twice”2 and “the prevention of the interleaving of the access to the database.”2 Ben-David Kolikant found that students did not identify the problem of interleaving in their written answers. Follow-up interviews indicated they assumed the key actions were inseparable, writing, “They assume that the two critical actions of checking and updating the database are always executed successively.”2 As a result, they assumed the key issue was communication, making sure all sellers were aware of the seats already sold. One-third of the 135 students presented centralized solutions, and the rest presented decentralized solutions.
A final aspect of the students’ solutions Ben-David Kolikant considered was “reasonableness,” or solving the problem “in a reasonable or realistic way…according to the context of the problem.”2 She found that students were influenced by their real-world experience with concurrency and networks. For example, C3 solutions were workable real-world solutions but did not ensure buyers would receive the best seat available.
Our 2006 study modified Ben-David Kolikant’s problem for presentation to beginners, allowing us to explore commonsense concurrency knowledge and the difference in this knowledge between advanced students and beginner students.
Methodology
Here, we describe our process of data collection and analysis:
Data collection. In the first week of the first semester of the 20062007 school year, in a CS1 class, students were randomly assigned one of two tasks: the concurrency task explored here completed by 66 students from five different institutions; the other was unrelated to the study. The participating students were all beginners between the ages of 18 and 20. All but eight completed the questions online (outside of class) by typing English answers into a text box. Eight students (all at the same institution) completed the questions on paper in a laboratory setting. All were given credit for completing the assignment, though solution quality was not evaluated for credit. (The participant/subject identifiers we cite here are renumbered and do not reflect institutional affiliation; the institutions’ characteristics, which vary significantly, are summarized in Table 1.)
The task. Students were asked to address the following: Suppose we sell concert tickets by phone in the following way: When a customer calls and asks for a number (n) of seats, the seller (1) finds the n best seats available; (2) marks those n seats as unavailable; and (3) deals with customer-payment options (such as credit- and debit-card number) or sends the tickets to the will call window for pickup. Now suppose more than one seller is working at the same time. What problems would we see, and how could we avoid them?
Our goal was to determine whether the students would note the race condition between two sellers and suggest solutions to resolve it. There are several differences between this task and Ben-David Kolikant’s counterpart. We modified the question to refer to concert tickets, since our students likely had more experience ordering concert tickets. We also removed the restriction that each customer be able to buy only one ticket. And finally, due to our students’ lack of background, we phrased the question less technically, asking for responses in English paragraphs, without pseudocode or detailed hardware and software specifications.
Analysis. After collecting the data, we read through all 66 responses for a general sense of their content. We then organized categorizations for the responses based on the categorizations used by Ben-David Kolikant.2
We considered all decentralized solutions in a single category (D), as the less-explicit question led students to provide less-explicit descriptions of this strategy, making it unclear if the communication was explicit or implicit. We also considered C2 and C3 answers (along with C1 and D) as “reasonable” solutions, because our ticket-ordering problem was more open-ended and did not ask students to consider hardware issues that would be necessary to make a solution work.
Several types of responses were considered “non-reasonable.” Some were ambiguous (AS), neither centralized nor decentralized. In others (BS), the response could not reasonably be said to solve the problem. Some responses (NS) did not offer a solution. And some responses (NP) provided solutions to problems that were not our central focus, though they may have been interesting, with some even involving concurrency. We recorded the problems (and the suggested solutions) that were not our central focus so we could examine them for commonalities across students.
Unlike Ben-David Kolikant’s participants, the students in our study often gave more than one possible solution to a problem, and we counted and coded each of them. After determining the proper categories, five of us coded the data, resolving coding conflicts through discussion.
Results
Here, we offer a sense of the student solutions by viewing them from three perspectives: per student, per categorizations used by Ben-David Kolikant,2 and through a qualitative look, with characteristic examples of responses highlighting important aspects of the responses.
Per student. We addressed three questions: How many solutions did the student provide? Was the problem identified? Did the student’s solution seem reasonable?
The 66 students in the study collectively produced a total of 97 identifiable solutions (see Table 2). Due to the open-ended nature of the solution we requested, many discussed multiple issues they saw in the problem statement (not all necessarily concurrency-related) or outlined several solutions to the concurrency problem of trying to sell the same seat to more than one person at a time. The majority (70% of the 66) of the students discussed only a single solution, but 20% identified two solutions, and 10% identified three or more solutions, with six solutions being the most identified by any one student.
Of the 66 students, 97%, or 64, identified our main problem of interest—that it may be possible to sell a given seat to more than one person—good evidence that even novice computing students are able to identify this critical concurrency issue. Additionally, six students explicitly noted the problems of interleaving access to a database that could result in one customer reserving but another customer buying the tickets.
In all, 71%, or 47, of the 66 students (73% of those identifying the main problem) identified at least one “reasonable” solution. Moreover, many of the beginners gave more than one type of answer. Of the 47 students who gave a reasonable solution, 12 (or 26%) gave both centralized and decentralized solutions.
Per categorization of solution. As many students provided multiple solutions, it is useful to look at the diversity of responses (see Table 3) out of the total number of solutions provided (97). We found that 69%, or 67, of them were reasonable solutions to the multiple-seat-selling problem; 31%, or 30, were not reasonable solutions; the majority of them (16) involved students describing the problem (often correctly) but without a solution.
Of the 67 reasonable solutions, 55%, or 37, described a centralized solution where the selling entities passed the responsibility of making a seat assignment to some central resource, and 45%, or 30, described a method by which individual sellers made decisions about seat assignments as individual entities.
The centralized solutions were further broken down into three categories, as defined by Ben-David Kolikant: (C1) with a central entity, essentially a master computer making all decisions; (C2) in which the solutions involved an assumption either of a constant rate of operations or of operations happening in a particular order; and (C3) in which solutions assumed that sellers have private resources, each selling tickets for a separate area of the theater. Of the reasonable solutions, 10%, or 7, were of type C1, relying on implicit communication between “dummy” sellers passing the request to a master computing resource to make assignments. For example, one study participant said, “These problems might be avoided by having a computer system automatically (to the second) input the seat reservation for that customer.” [ID438] (Each participant was given an anonymous label.)
Of the 97 reasonable solutions, 13%, or 13 were of type C2, using the same master resource but requiring explicit ordering of communication or steps in the process, including lock-stepping and pipelining the process. An example of a C2 solution: “In order to avoid this [possibility of selling the same seat twice], we could set up the database so only one person could access the database at a time. This would slow sales significantly but is the safest setup.” [ID440]
By far the most popular centralized solution (31%, or 30 responses) was also the most restrictive (C3) and involved distributing or dividing resources, either by portioning out seats in the concert venue to different sellers or serializing or otherwise pipelining the selling process.
Qualitative results. A qualitative analysis of the responses is a useful way to examine other interesting aspects of the 97 student solutions, revealing the range of approaches taken, as well as the depth of student understanding of the computational issues.
Algorithmic goals. Most students (58) did not further refine the goal of their algorithms, either explicitly or implicitly using a goal of “best seats available” in their responses. Some elaborated further, explaining why they chose a particular solution or the focus of their algorithms.
A number of students were concerned less about choosing seats than about handling seats that are given up. For example, one said, “If the seats are marked unavailable as soon as they are requested by the customer, other sellers cannot access the seats for their own customers at that time. This is a bad thing [less-than-optimal solution] because the better seats reserved by the first customers could potentially still be open should the customers change their minds about the purchase or if payment information cannot be validated. If the seats are marked unavailable, and the payment does not come through for whatever reason, the seats would remain unavailable and be empty during the concert.” [ID415]
Another said, “One obvious problem is two sellers giving up the same seats at the same time.” [ID405]
Some participants changed their algorithmic goals when they realized they could not simply reserve and sell as a single atomic action. For example, one said, “In a very unlikely situation, the sellers could mark the seats ‘unavailable’ at the same time. However, in a more likely situation, one of them would mark seats as unavailable and the other would mark and see that the seats are unavailable, but that seller was not the one reserving them. Then there will be multiple tickets sent to will call for the same seats.” [ID412]
Citing concurrency, other students expressed concern about dealing with group sales, with one saying, “First of all, there could be an issue of finding group seating. Finding n best available seats will not necessarily do, if they have to be n best available seats together. In such a situation, each seat should be labeled with how many seats are available in front, behind, and to the left and right of it.” [ID425]
Another said, “Scalpers and other ticket-selling agencies will buy tickets and sell them at an increased price; with no limit on n, the number of tickets the caller is purchasing, a single caller could buy every ticket to the concert. This problem is easily fixed by putting an upper limit on n of eight to 12 tickets; large groups can call a special hotline and speak to an operator to purchase more.” [ID430]
The notion of “best seat” attracted further attention, with one participant handling the double-selling problem by dividing the seats among sellers. Most algorithmic attention was then focused on the following problem, as the participant described it: “First of all, there would no longer be a first-come-first-serve basis, and problems would arise over who actually occupies the good seats first. If it’s only one seller, she would be able to take one customer at a time. Two or more sellers would make it hard to decide which seller’s customer actually received the seats first.” [ID431]
Identifying the main synchronization problem. The degree to which study participants were able to identify the problem varied, with most giving a fairly standard “Sellers could mark the seats unavailable at the same time.” [ID412] or “There could be double-booking.” [ID106] Saying this scenario is unlikely was not uncommon, with some participants identifying computers or technology as the source of the problem. For example, one said, “One computer may be operating slower than another, causing the seats one seller saw to be taken by another seller.” [ID406]
Others might not have addressed technology specifically but did identify the key concept of time, with one saying, “One major issue is when, and how long, it takes to mark a seat as unavailable.” [ID410]
A few others gave more detailed problem descriptions that hinted at the kind of analysis that will eventually be needed to construct a genuinely effective solution, including recognition of the interleaving problem. For example, one said, “The first, most obvious problem is overlap. If all sellers are working at the same time, then the system might display to Seller A that certain seats are open, when, in fact, they have already been reserved by Seller B. Thus A will have to find different seats, which might have, in the intervening time, been reserved by Seller C.” [ID417]
Identifying other problems. As noted when we discussed algorithmic goals, study participants identified group sales as a particularly knotty problem, and payment and how to cancel orders were big issues. For example, one said, “Another problem would be if the seats are marked unavailable before they are sold, customers might change their minds before payment and possibly hinder the sale of the seats to other customers who might have wanted them at the same time.” [ID420]
Another said, “If a customer cancels an order, how is the information transmitted to the other seller in a reasonable amount of time? If the customer doesn’t pay for the tickets at will call, what happens to them?” [ID 422]
And another said, “At the moment of receiving the payments for the tickets, problems might come up, such as miscommunication between the sellers and charging the customer double.” [ID419]
Participants also mentioned reliability, with one saying, “The computers may malfunction, and the seller may not be able to key in the requested seats.” [ID406]
Centralized solutions. The three variants of the centralized solutions reflected significant distinguishing characteristics. For example, C1 solutions relied on implicit communication between sellers and a central system making the reservation or selection on their behalf. A common characteristic of these implicit communications, as exemplified by ID438 cited earlier, concerning automatic customer ordering, was that they be “fast.” Some were less specific but still involved implicit communication. For example, one participant said, “The program would have to temporarily mark seats being looked at during a transaction as unavailable so vendors couldn’t sell seats simultaneously.” [ID313]
Others were more specific, imposing additional restrictions while prompting doubt as to the participant’s true understanding of concurrency. In one case, we saw evidence of an attempt to move the potential point of concurrent access in an expressed solution. “The only real way to avoid this and still have multiple sellers is to run the booking on a computer network with a master list of available seats. The process would then go something like this: A caller requests n seats. The master list can be ordered in such a way that it fills the seats front to back, left to right; when a seller requests n seats, it gives the next n seats on the list. Cancelled seat orders are inserted at the top of the available list, in order of precedence. The seller can reserve the seats, ask if they are acceptable to the customer, and, if they are, proceed with the transaction. This would avoid double-booking, because during the time the seller offers the seats to the customer, they are withheld from the list, and the other sellers drawing from the list would not have access to those seats.” [ID130]
C2 solutions differed from C1 in their use of explicit communication with a centralized resource making seat assignments, sometimes identified as a database. The quote from ID440 earlier on explicit ordering is an example of this explicit communication.
Another variant of explicit communication involved a particular ordering required to ensure a safe process, including lockstepping and pipelining. For example, one participant said, “These problems [multiple bookings] might be avoided if instead of multiple people selling tickets and being involved in every step of the process, the selling process was divided between two employees. This way, while the second employee was taking care of the payment of the first caller, the first employee could start to deal with the next sale, then transfer the call to the payment employee.” [ID120]
Another variant does not require a computer solution for concurrency at all, with one participant saying, “A possible solution to this problem [multiple sales of the same ticket] would involve a stagger-start approach when more than one worker is on the phone. For example, when the first caller calls, worker A picks up the phone right away. The second caller calls right after the first has called. Worker B then waits until the phone rings three times, then picks up and starts the process.” [ID121]
C3 solutions (the most common) involved distributing resources in a way that avoids simultaneous access. The most common resource was the seats to be sold. Some participants commented on potential problems with this approach, with one saying, “Perhaps if each vendor were responsible for a section of the concert hall, finding the best seats within their section would solve this problem. But this solution also means some vendors will fill the ‘good’ seats in their section faster, and certain customers won’t get the absolute best seats they could. Chances are good, however, that customers wouldn’t be aware that there are better seats available, rationalizing that the concert filled up quickly.” [ID303]
Other participants were more specific about the technique they would use—assigning a seller to a particular type of seat—to distribute resources. One said this could simplify or uncomplicate things, pointing out a possible benefit: “One way we could fix this problem [of multiple sellers of the same ticket] would be to assign a section of seats to each seller. This way no seats would be sold twice, and it would be more organized. One seller would be in charge of one price and one section, making the selling of seats faster and more efficient.” [ID404]
Decentralized solutions. Decentralized solutions are distinguished from centralized solutions by whether sellers themselves make decisions and seat assignments are based on communications with other sellers. If they are, the solution is decentralized. A common decentralized answer could reference a shared resource (such as a database or document), but sellers make decisions individually based on the resource, rather than on a centralized entity.
Our motivation was the constructivist theory about how people learn, starting with what they already know and building knowledge on that foundation, rather than receiving it passively from an instructor.
One participant said, “To resolve this issue [of multiple sellers of the same seats], there should be communication between the sellers. Ideally, they would mark the seats as unavailable on the same documents, so there could never be any doubling.” [ID101]
Other examples of nonspecific communication among distributed sellers included participants saying, “drawing off of the same information that was updated with each transaction” [ID304], “inform the other sellers of this by some form of communication” [ID437], and “using a program that is constantly updated” [ID434].
Speed was a common theme in the proposed solutions, with words like “instantaneously” [ID425], “instantly” [ID426] [ID402], “constantly” [ID434], “continuously” [ID410], and “real time” [ID417].
Some responses were more specific about how communication must occur, with some realizing the problem might not be completely solved. For example, one participant said, “A much easier way would be to use a computer program that networks each seller. This way, each seller has access to each available seat. As soon as a booking is made, it will automatically register on every seller’s screen, and the chance of there being a double booking will be close to impossible.” [ID323]
One participant provided explicit communication directives, saying, “I would change the order of operations so the two or more people booking seats would be required to check with each other while booking so as not to book the same seats, in that way adding another step and alleviating the two problems [booking the same seats and selling more tickets than are available].” [ID409]
Noncomputing-oriented solutions. Several solutions were distinguished by their noncomputing and nontechnological approaches, though they could be classified as either centralized or decentralized. For example, one participant said, “We could mark the same seat map with different colored markers for each sell.” [ID106] This participant’s solution is decentralized, since one can imagine individual sellers, each with their own colored marker, racing to mark off seats on a large map.
Another said, “This problem [of selling the same seat twice] could be avoided by allowing only one vendor at a time into the concert hall. But this would be unreasonable if the hall were too large or too many vendors were working to reserve seats. Perhaps if each vendor were responsible for a section of the hall, finding the best seats within their section, this problem would be solved.” [ID303] This response provides two solutions that eliminate potential conflicts, the first by enforcing exclusive access between the selection and the marking of seats, the other by dividing the resource (seats). Here, we could imagine sellers with cellphones dashing around the hall, placing markers on actual, physical seats. Note that scaling issues are mentioned in the proposed solution.
Common errors. The two most common errors in the student-proposed solutions were in thinking that a problem could be solved with a faster system and in devising solutions that simply moved concurrency to another point in the algorithm. As noted when we described decentralized solutions, the surveyed students suggested speed was necessary to avoid many problems. For example, one said, “To avoid this problem we could have very high ‘refresh’ rates or have a way of reserving n seats, as the process is still going through.” [ID 423] Another said, “As soon as n seats are marked unavailable, even before payment processing, the seats need to be marked unavailable. This way, another seller cannot try to reserve a seat that is already ‘reserved.’ …the system (and screen) would need to be refreshed every time a reservation is made.” [ID 425]
Many proposed solutions moved the point of concurrency. In one, it was moved to a preview step, with the participant saying, “One more solution would be to have the computer show the n seats as unavailable as soon as any seller has them up on their screens. With this system, only one seller could see these seats as available at a time. If one seller (A) pulls up n seats for a customer, then another seller (B) searches for the best seats, and the seats seller A was looking at would not be shown to seller B.” [ID122]
Of the 66 students, 64 identified our main problem of interest—that it may be possible to sell a given seat to more than one person—good evidence that even novice computing students are able to identify this critical concurrency issue.
Another retargeting of the point of concurrency was to a graphical interface, with another saying, “Creating a visual representation of the concert hall through a computer would alleviate this problem. Sellers would mark a certain number of seats for their clients, letting other sellers know which seats are purchased (potentially) and which seats are free for booking.” [ID413]
Another realized his/her distributed, graphical interface only hid the concurrency problem, so proposed a novel solution that apparently used the inherent randomness of human interaction to deal with the problem of multiple sellers claiming the same seats at the same time, saying, “Sellers would each have their own computers, and all of them would be connected, so once a seat is claimed, all other sellers would see it. If two sellers happen to click at the same time, a separate window opens, and both will have to try again.” [ID402]
One interpretation of this solution is that a separate window would open when the computer detects a conflict and forces the sellers to back off and retry, assuming it unlikely that the sellers would try again at precisely the same moment. However, this idea leaves many concurrency issues unresolved, including how the conflict is detected and whether or not other sellers could still get in and reserve the seat(s) targeted by the two original sellers.
Discussion
The proposed solutions of beginner students provide evidence of CS problem-solving skills through their ability to identify the problem and suggest reasonable, though relatively unsophisticated, solutions. By replicating the Ben-David Kolikant study, our study provided additional perspective in both analysis and results. In addition to finding that her categorizations of student-proposed solutions and goals can be used to analyze beginner responses, our study helped compare beginners and students with significant CS skills but no significant experience with concurrency. This comparison gave a sense of how much sophistication was gained by the more experienced students and how much problem-solving sensibility was already available to students on the day they enter their first CS course.
We focused on two themes—reasonableness and synchronization goals—in the Ben-David Kolikant study:
Reasonableness. Using student interviews, Ben-David Kolikant found that students sometimes solved a simpler problem than the one she assigned, referring to these solutions as not fulfilling the goal of being “reasonable” solutions because they assumed something unrealistic, given the problem description. We saw the same phenomenon in our student solutions. Their wording sometimes made it clear the proposed C2 (constant rate or ordering of operations) and C3 (division of resources) solutions were “easy” or “simple.”
Some students in our study showed their ability to reason about the quality of their solutions. For example ID303, quoted earlier in the context of centralized solutions, recognized the limitations of his/her solution, without suggesting a better one. ID122 noted that moving the point of concurrency in his/her solution as superior to flawed, possibly incorrect alternatives, saying, “The obvious solution would be to fire one seller and have just one working at a time (only kidding). But the thing to do would be to ‘assign’ each seller a section of the hall where the concert is taking place. One seller would have control of half the seats, and the other the other half. There would be no conflicting seats. Or they could just switch systems to general admission. One more solution would be to have the computer show n seats as unavailable as soon as any sellers pull them up on their screens—a first-come-first-served system.” [ID 122]
Synchronization goals. A third of the solutions in the Ben-David Kolikant study were centralized. Our study involved an even larger percentage of centralized solutions (55%, or 36). This was consistent with Resnick,12 whose studies in the mid-1990s found that managing a centralized solution to a problem is easier than managing entities in a decentralized way. Moreover, despite the proliferation of decentralized entities, particularly the Internet, in the 15 years since Resnick’s studies, the students in our study were still more inclined to pursue a centralized solution. Ben-David Kolikant noted that Resnick believed increased exposure to decentralized entities would increase the likelihood of using decentralization, but we found no evidence this happened. (All participants in our study were experienced Internet users.)
The larger percentage of centralized solutions in our study stemmed from two factors: First, 82% of our centralized solutions were C2 or C3 solutions; only 33% of Ben-David Kolikant’s centralized solutions were C2 or C3 solutions. As Ben-David Kolikant brought up in her discussion of reasonableness, students often made simplifying assumptions, not because they are reasonable, but because they allow them to more easily solve the problem posed to them. Given our more open-ended problem description, it seemed more “reasonable” to students to suggest C2 or C3 solutions to our problem about concurrency in concert-ticket sales. And second, given our less well-defined problem statement and reduced direct engagement with the ticket-sales problem in the classroom setting, our students were not necessarily prodded to consider or outline a more complex decentralized solution the same way Ben-David Kolikant’s students might have been.
Consistent with Ben-David Kolikant’s study, we found students concentrated on sharing information across sellers rather than preventing interleaving access to the database, with only six discussing the interleaving of operations. However, it may well be that the nature of our task was simply not as suggestive of database issues as the pseudocode in Ben-David Kolikant’s problem statement, particularly given the lack of computing experience by our CS1 students.
While Ben-David Kolikant was able to show students “exaggerate the grain of an atomic action” by assuming that checking and updating the database is atomic, we found that a description of even this level of granularity was present in only the most explicit and detailed of our student-generated solutions. Many of the responses did not clarify this level of description of granularity of interaction, leaving it ambiguous as to how well they really understood the concurrency issue at hand.
Algorithmic goal of solution. While Ben-David Kolikant’s study posed a well-constrained problem with a clear algorithmic goal, our study included an unconstrained problem description, and our student participants provided a number of goals for their algorithms, with 97% identifying the main goal of not letting the same seat be sold to two different customers, indicating a commonsense ability to identify concurrency conflicts.
Many of our students (41%, or 27) identified goals beyond not selling the same seat, including group sales of a large block of tickets, letting seats be reserved without being paid for, identity theft, choosing seats by price rather than by best available, payment-transaction delays, and data tracking and storage. Two of these problems—group ticket sales and reserving without selling to allow seats to be made available again—are notable for how they influenced approaches to achieving the main algorithmic goal. From an instructional point of view, these additional goals noted by students suggest they may need help prioritizing from among the goals they find in open-ended questions, unless the goals are given explicitly.
Concurrency techniques. Most students introduced techniques directly related to concurrency, with solutions including centralized techniques and distributed techniques. Within them, they introduced scaling, locks, pipelining, and methods for distributing resources. Even in proposed solutions with errors, we found concepts ripe for leverage. For example, many of our students chose to “pass the buck” by pushing concurrency from buying a seat to reserving it. Others gave solutions that assumed a system fast enough to eliminate race conditions. Though incorrect, they provide a starting point for understanding atomic operations and the interleaving of instructions.
Conclusion
In both their correct and incorrect preconceptions, the 66 students in our 2006 study apparently began their first computing course with essentially the same level of intuition as they began their first course involving concurrency. This similarity suggests students do not gain a deeper understanding of concurrency as they advance through the curriculum. As we have no data indicating that taking nonconcurrency courses provides skills that help one learn concurrency-related material more quickly, it may be there is no advantage to delaying the introduction of concurrency. Given the prevalence of concurrency and its increasing relevance at all levels and applications of CS, we suggest it may be wise to include concurrency earlier in the curriculum.
No matter which course introduces concurrency, the problem and student-proposed solutions in our study suggest ways to leverage student preconceptions. For example, instructors could conduct an exercise like this and choose student-generated solutions for further discussion to explicitly address common errors. In particular, they could:
- Emphasize the real-world nature of the problem, pointing out related concurrency problems;
- Demonstrate that race conditions come up even in “fast” systems;
- Use responses that pass the buck (appearing to solve the concurrency problem by moving it to another operation without actually solving it) to help discuss the notion of atomic operations;
- Use a centralized solution to discuss interleaving instructions and pipelining technique; and
- Use responses that do not scale well to discuss scaling.
Here, we’ve addressed a rich set of student responses that represent a starting point for asking students to critique proposed solutions, emphasizing concepts we know are at the edge of their preconceptions. All such concepts could be discussed early or at least introduced in CS1.
Instructors can use the question we asked—What’s the best way to organize the sale of tickets to a popular concert?—to help identify student preconceptions about CS. By documenting beginner-student preconceptions, instructors gain leverage for using a constructivist model, building on this commonsense knowledge through student preconceptions. Our study did not depend on a particular technology, pedagogy, or philosophy and can be replicated to study how or if students’ commonsense knowledge is changing.
Acknowledgments
This material is based in part on work supported by the National Science Foundation under grants DUE-0736343, DUE-0736700, DUE-0736572, DUE-0736738, DUE-0736859, and DUE-0736958. Any opinions, findings, conclusions, or recommendations expressed here are those of the authors and do not necessarily reflect the views of the NSF. We thank the many students who responded to the questions used in this and our other studies; Sally Fincher, Josh Tenenberg, and the NSF (through grant DUE-0243242) who provided us with workspace at the SIGCSE conferences in 2006 and 2007; Renee McCauley, Sara Miner More, and Suzanne Westbrook for help with fall 2006 data collection and others who collected data for us in other commonsense projects; and the anonymous reviewers for their comments and suggestions. An earlier description of the study, with expanded analysis and discussion, is in the Proceedings of the Third International Computer Science Education Research Workshop.9
Tables
Table 1. Institutional breakdown of respondents; n = number of students answering the concurrency question.
Table 2. Number of concurrency solutions and problems identified by students; n = 66.
Table 3. Solution breakdown by type: column 2: considered 97 solutions; column 3: considered 67 reasonable (C or D) solutions.
Table. Yifat Ben-David Kolikant’s cinema-tickets assignment.
Join the Discussion (0)
Become a Member or Sign In to Post a Comment