CS and Biology’s Growing Pains

Biologists can benefit from learning and using the tools of computer science, but several real-world obstacles remain.
As part of a National Science Foundation-funded project, New Jersey high school students conduct bioinformatics research on lab computers at Rutgers University.

The compatibility of computer science and biology—two disparate yet increasingly symbiotic branches of knowledge—is becoming a hot topic among academic scientists. Recent publications in popular and academic journals have called for mandating stronger computer and mathematics courses for undergraduate biology majors. Those treatises have been met by equally ardent responses among some biologists claiming that mandating additional background in computer science and math will not necessarily advance a budding biologist’s academic and career success.

“To grossly oversimplify it, computer science is all about the binary, and in biology, things don’t lend themselves to binary distinction,” says John Timmer, the science editor of Arstechnica.com, who has a Ph.D. in molecular and cell biology. Timmer recently wrote an opinion piece, “Should Biologists Study Computer Science?”, that took to task advocates of increased emphasis on undergraduate computer science and math. Timmer argued that knowing how to use a given tool, and having enough domain knowledge to be able to flag outlying results, should be sufficient for most biologists.

“Obviously, computer scientists can do things that are far more subtle than binary logic,” Timmer says, “but the fact that the most basic concepts in biology, like genes and species, exist along a full spectrum and can often be defined using different definitions doesn’t lend itself to definitive computerized analysis very cleanly.”

Computer scientist Nir Piterman, a research fellow at Imperial College, says Timmer may be right, but that “the central role of the computer in our lives” will mandate that biologists learn some foundational basics of computation such as algorithmic thinking and some sort of formal expression.

“The advantages are not only in being able to do the things that are required in order to do modeling or more computational biology,” says Piterman, “but this way of thinking can help many fields of biology to communicate better, and to harness computing better, by being able to share information more formally. Maybe it’s less natural to do it in biology, but the power of computing makes it less than optimal to avoid this.”

A High-School Solution?

The goal of strengthening biologists’ computer science and math backgrounds faces a major obstacle in the structure of college curricula. For instance, designing a quantitative thinking and computer science offering that would satisfy all fields of biology is extremely difficult. In addition, students’ schedules are already filled with existing requirements. Adam Siepel, assistant professor of biological statistics and computational biology at Cornell University, says the university is grappling with this issue.

“There’s such a broad spectrum of activities going on under the rubric of biology, from what is essentially physiology to organismal biology, to ecology,” Siepel says. “These disciplines have almost nothing to do with one another. I was part of a task force last year that was reviewing the undergraduate curriculum for biology and it was really a struggle.”

Siepel says the math requirements were examined closely, but the faculty concluded that sending biology students out of the department for math, computer science, and statistics survey courses was unpopular and counter-productive.

“There was general agreement the students should have something that really connects better with biology, maybe less calculus, more statistics and computer science, maybe something about computational sequence analysis or something along those lines,” Siepel says. “But it’s a struggle. The students already have a full set of requirements and any time you add a new one, you have to bump something else. We didn’t get very far on that issue. You get in a situation where you almost have to require a five-year instead of a four-year degree if you’re really going to educate them in the physical sciences and math and statistics and computer science as well as all the biology requirements.”

Siepel says he has had numerous students who want to take an upper-level course and express an interest in some aspect of computational biology, only to discover they lack a sufficient background in math or computer science to really pursue that interest. And, Siepel reiterates, their schedules are already too full.

“To be frank,” says Siepel, “part of it is the failure of high schools to be providing basic education in mathematics and sciences before the students get to universities.”

That shortcoming may be addressed soon. In 2009, the College Board released the draft of its revised Advanced Placement (AP) biology curriculum for high school seniors in response to the National Research Council’s 2002 report Learning and Understanding: Improving Advanced Study of Mathematics and Science in U.S. High Schools. The new curriculum includes significant changes in four areas, including quantitative and computational thinking. According to the College Board draft, “Students will be encouraged to develop their ability to apply mathematics to wide sectors of biology so that they can better test hypotheses, model biological phenomena, interrogate complex data sets, and represent and interpret visualizations of relationships.”

Raina Robeva, chair of the mathematical sciences department at Sweet Briar College, says the new AP curriculum should have a profound effect on incoming students’ capabilities. “Whether we like it or not, the College Board drives a lot of this, so if they are saying they are changing all of this, the AP exams will change to reflect this and all those students will have more of a quantitative background when they get to college, and will take those skills to higher-level biology courses.”

Real-World Dilemmas

Even if fundamental concepts are added to advanced secondary school curricula and undergraduate courses, the workaday problem of reconciling the principles of computer science and math with the realities of biological research remains. Sarah Killcoyne and John Boyle, senior software engineer and senior research scientist, respectively, at the Institute for Systems Biology, co-authored “Managing Chaos: Lessons Learned Developing Software in the Life Sciences,” published in the November-December 2009 issue of Computing in Science and Engineering.

In their paper, Killcoyne and Boyle pointed out that biology, due to its descriptive nature, lacks the grand underlying mathematical theory, and hence the formalized body of expression, that is present in physics. This makes software development far more difficult in the life sciences, and the two communities continue to struggle to communicate their needs to each other. Boyle says teaching biologists and computer scientists an appreciation for each other’s discipline might be more useful than trying to convince biologists they need a certain amount of computing and math proficiency to do their jobs.

“You hate to say this, but a lot of people don’t care, and rightly so,” Boyle says. “They’re busy people. Should they know the ins and outs of how to use a bioinformatics tool? In a perfect world, yes, they should. But is it something that’s holding back scientific progress? Can they go to someone else and get that person to help them? Yes. Can they get by without it? Sometimes.

“We tend to be a little bit pragmatic here. ‘Is it something that’s holding us back doing research?’ is always going to be the fundamental question,” says Boyle.

Perhaps the debate over exactly how computationally savvy the majority of biologists should be will settle itself, simply because certain areas of biology naturally lend themselves to more computationally intensive approaches than others. Boyle says there may be merit to the contention that computationally skilled biologists specializing in these areas will advance the cross-pollination of the disciplines in a kind of natural selection process. Jasmin Fisher, a researcher at Microsoft Research Cambridge, is a pioneer of one such approach, “executable biology,” which she says will not only winnow out false steps in the process of evaluating an idea, but also illuminate hypotheses that would be prohibitively difficult to work out without computation, or that would be missed altogether.


“Serious biological research with living material takes a long time,” Fisher says. “The thing we’re trying to say here is this kind of modeling will help to focus and direct the next experiment and save time and resources. This is the key point.”

One example of such an approach is work that Fisher and colleagues performed while studying vulval development in the nematode worm C. elegans. Those colleagues included Piterman (who is married to Fisher), computer scientist Tom Henzinger (who is president of the Institute of Science and Technology Austria), and University of Zurich biology professor Alex Hajnal.

“While modeling the crosstalk between two signaling pathways operating in the cells that eventually become the worm’s egg-laying system, we predicted a very specific order of events related to this particular developmental process,” Fisher says. “This then led to the design of an experiment that was performed in the lab, and validated experimentally the prediction provided by the modeling work. The point here is that, one, without the modeling work this prediction would not have been thought of, and, two, without the prediction, the experiment would not have been designed and performed in the lab. I think this is a beautiful example of how this kind of knowledge from computer science can be channeled to direct lab experiments and shed new light on the biological system that we study.”

Whatever approach the two disciplines’ practitioners ultimately decide upon to create a more seamless interaction between them, Robeva says the heightened level of discussion, disagreements and all, is beneficial for both disciplines in crafting a more compatible future.

“It used to be the case that biology needed math, and mathematicians would answer a biologist’s problem out of a sense of community service,” she says. “But now biology problems are generating way more math questions than mathematicians can answer. It seems at this juncture that momentum is going for both the biologists and the mathematicians, so it seems the stars are aligning.”

Further Reading

Fisher, J. and Henzinger, T.A.
Executable cell biology. Nature Biotechnology 25, 11, November 2007.

Pevzner, P. and Shamir, R.
Computing has changed biology—biology education must catch up. Science 325, 5940, July 2009.

Robeva, R. and Laubenbacher, R.
Mathematical biology education: Beyond calculus. Science 325, 5940, July 2009.

Siepel, A.
Computational education for molecular biology and genetics. Transform Science: Computational Education for Scientists, Yan Xu (ed.), Microsoft Corp., Redmond, WA, 2009.

Boyle, J., Cavnor, C., Killcoyne, S., Shmulevich, I.
Systems biology driven software design for the research enterprise. BMC Bioinformatics 9, 295, June 2008.

Figures

Figure. A New Jersey high school student works in a Rutgers University lab as part of a research project on decoding a DNA sequence.

