From Syntax to Semantics for AI

Letters to the Editor

Concerning Moshe Y. Vardi’s Editor’s Letter "Artificial Intelligence: Past and Future" (Jan. 2012), I’d like to add that AI won’t replace human reasoning in the near future, for reasons apparent from examining the context, or "meta," of AI. Computer programs (and hardware logic accelerators) do no more than follow rules; they are nothing more than sequences of rules. AI is nothing but logic, which is why John McCarthy said, "As soon as it works, no one calls it AI." Once it works, it stops being AI and becomes an algorithm.

One must focus on the context of AI to begin to address deeper questions: Who defines the problems to be solved? Who defines the rules by which problems are to be solved? Who defines the tests that prove or disprove the validity of the solution? Answer them, and you might begin to address whether the future still needs computer scientists. Such questions suggest that the difference between rules and intelligence is the difference between syntax and semantics.

Logic is syntax. The semantics are the "why": the making of rules to solve the why, and the making of rules (as tests) that prove or disprove the solution. Semantics are conditional, and once the why is transformed into a "what," something important, as yet undefined, might disappear.

Perhaps the relationship between intelligence and AI is better understood through an analogy: Intelligence is sand for casting an object, and AI is what remains after the sand is removed. AI is evidence that intelligence was once present. Intelligence is the crucial but ephemeral scaffolding.

Some might prefer to be pessimistic about the future, as we are unable to, for example, eliminate all software bugs or provide total software security. We know the reasons, but like the difference between AI and intelligence, we still have difficulty explaining exactly what they are.

Robert Schaefer, Westford, MA

Steve Jobs Very Much an Engineer

In his Editor’s Letter "Computing for Humans" (Dec. 2011), Moshe Y. Vardi said, "Jobs was very much not an engineer." Sorry, but I knew Jobs fairly well during his NeXT years (for me, 1989–1993). I’ve also known some of the greatest engineers of the past century, and Jobs was one of them. I used to say he designed every atom in the NeXT machine, and since the NeXT was useful, Jobs was an engineer. Consider this personal anecdote: Jobs and I were in a meeting at Carnegie Mellon University, circa 1990, and I mentioned I had just received one of the first IBM workstations (from IBM’s Austin facility). He wanted to see it in my hardware lab at the CMU Robotics Institute, and the first thing he asked for was a screwdriver; he then sat down on the floor and proceeded to disassemble it, writing down part numbers, including, significantly, that of the first CD-ROM drive in a PC. He then put it back together, powered it up, and thanked me. Jobs was, in fact, a better engineer than marketing guy or sales guy, and it was what he built that defined him as an engineer.

Robert Thibadeau, Pittsburgh, PA

For Consistent Code, Retell the Great Myths

In the same way cultural myths like the Bhagavad Gita, Exodus, and the Ramayana have been told and retold down through the ages, the article "Coding Guidelines: Finding the Art in Science" by Robert Green and Henry Ledgard (Dec. 2011) should likewise be retold to help IT managers, as well as corporate management, understand the protocols of the cult of programming. However, it generally covered only the context of educational (textbook) and research coding rather than enterprise coding. Research-style coding is done not only in universities but also in the enterprise, where it follows a particular protocol or pattern, in recognition that people are not the same from generation to generation even in large organizations.

Writing enterprise applications, programmers sometimes write code as prescribed in textbooks (see Figure 11 in the article) but make no explicit mention of pre- or postconditions unless strictly necessary. Stating them would help make the program understandable to newcomers in the organization.
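
As a purely illustrative sketch of what stating such conditions might look like in enterprise Java (the class, the method, and the list of states are hypothetical, not taken from the article, and reuse the naming example discussed in the next paragraph):

    // Hypothetical example of stating a method's contract up front so that
    // newcomers can see what it expects and what it guarantees.
    public final class PolicyRules {

        private static final java.util.Set<String> AVAILABLE_STATES =
                java.util.Set.of("TX", "PA", "MA");

        // Pre:  stateCode is a non-null, two-letter U.S. state abbreviation.
        // Post: returns true only if policies may be issued in that state;
        //       the input is never modified.
        public static boolean isStateAvailableForPolicyIssue(String stateCode) {
            if (stateCode == null || stateCode.length() != 2) {
                throw new IllegalArgumentException("stateCode must be a two-letter code");
            }
            return AVAILABLE_STATES.contains(stateCode.toUpperCase());
        }
    }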

Keeping names short and simple might also be encouraged, though not always. If a name is clear enough to show the result(s) the function or variable is meant to provide, there is no harm using long names, provided they are readable. For example, instead of isstateavailableforpolicyissue, the organization might encourage isStateAvailableForPolicyIssue (in Java) or is_state_available_for_policy_issue (in Python). Compilers don’t mind long names; neither do humans, when they can read them.

Just as it is important to understand code when reading it, it is also important to understand a program’s behavior during execution. Debug statements are therefore essential for good programming. Poorly handled code without debug statements in production systems has cost many enterprises significant time and money over the years. An article like this would do well to include guidance on how to write good (meaningful) debug statements where they are needed.
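
By way of illustration only, here is what a meaningful debug statement might look like using the JDK’s built-in java.util.logging; the service, the states, and the messages are invented for the example:

    import java.util.Set;
    import java.util.logging.Level;
    import java.util.logging.Logger;

    public final class PolicyIssueService {

        private static final Logger LOG =
                Logger.getLogger(PolicyIssueService.class.getName());
        private static final Set<String> AVAILABLE_STATES = Set.of("TX", "PA", "MA");

        public void issuePolicy(String policyId, String stateCode) {
            // A meaningful debug statement names the operation, the entity involved,
            // and the relevant inputs, rather than an uninformative "got here".
            LOG.log(Level.FINE, "issuePolicy: checking state {0} for policy {1}",
                    new Object[] {stateCode, policyId});

            if (!AVAILABLE_STATES.contains(stateCode)) {
                LOG.log(Level.WARNING, "issuePolicy: state {0} unavailable; rejecting policy {1}",
                        new Object[] {stateCode, policyId});
                return;
            }
            // ... issue the policy ...
            LOG.log(Level.FINE, "issuePolicy: issued policy {0} for state {1}",
                    new Object[] {policyId, stateCode});
        }
    }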

Mahesh Subramaniya, San Antonio, TX

The article by Robert Green and Henry Ledgard (Dec. 2011) was superb. In industry, too little attention generally goes toward instilling commonsense programming habits that promote program readability and maintainability, two cornerstones of quality.

Additionally, evidence suggests that mixing upper- and lowercase contributes to loss of intelligibility, possibly because English uppercase letters tend to share a same-size, blocky appearance. Consistent use of lowercase may therefore be preferable; "counter," for example, over "Counter." There is a tendency to use uppercase with abbreviations or compound names, as in, say, "CustomerAddress," but some programming languages distinguish among "CustomerAddress," "Customeraddress," and "customerAddress," so ensuring consistency across multiple program components or multiple programmers can be a challenge. If a programming language allows certain punctuation marks in names, then "customer_address" might be the optimal name.
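
For instance, a case-sensitive language such as Java accepts all three spellings as distinct identifiers, which is exactly the kind of trap described above (a small, hypothetical fragment):

    public class CaseSensitivityDemo {
        public static void main(String[] args) {
            // All three are distinct variables to the compiler,
            // yet nearly indistinguishable to a human reader.
            String CustomerAddress = "12 Main St.";
            String Customeraddress = "34 Elm St.";
            String customerAddress = "56 Oak St.";
            System.out.println(CustomerAddress + " / " + Customeraddress + " / " + customerAddress);
        }
    }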

It is also easy to construct programs that read other programs and return metrics about programming style. White space, uppercase, vertical alignment, and other program characteristics can all be quantified. Though there may be no absolute bounds separating good from bad code, an organization’s programmers can benefit from all this information if presented properly.
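
A minimal sketch of such a style-metrics reader, assuming a few obvious counts (total lines, blank lines, long lines, and the uppercase share of letters) stand in for whatever an organization actually measures; nothing here comes from the letter itself:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;

    // Reads a source file named on the command line and reports crude style metrics.
    public final class StyleMetrics {
        public static void main(String[] args) throws IOException {
            List<String> lines = Files.readAllLines(Path.of(args[0]));

            int blank = 0;
            int longLines = 0;
            long upper = 0;
            long letters = 0;
            for (String line : lines) {
                if (line.isBlank()) blank++;
                if (line.length() > 80) longLines++;
                for (char c : line.toCharArray()) {
                    if (Character.isLetter(c)) {
                        letters++;
                        if (Character.isUpperCase(c)) upper++;
                    }
                }
            }
            System.out.printf("lines: %d, blank: %d, longer than 80 chars: %d%n",
                    lines.size(), blank, longLines);
            System.out.printf("uppercase share of letters: %.1f%%%n",
                    letters == 0 ? 0.0 : 100.0 * upper / letters);
        }
    }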

Allan Glaser, Wayne, PA

Diversity and Competence

Addressing the lack of diversity in computer science, the Viewpoint "Data Trends on Minorities and People with Disabilities in Computing" (Dec. 2011) by Valerie Taylor and Richard Ladner seemed to have been based solely on the belief that diversity is good per se. However, it offered no specific evidence as to why diversity, however defined, is either essential or desirable.

I accept that there should be equal opportunity in all fields and that such a goal is a special challenge for a field as intellectually demanding as computer science. However, given that the authors did not make this point directly, what exactly does diversity have to do with competence in computing? Moreover, without some detailed discussion of the need or desirability of diversity in computer science, it is unlikely that specific policies and programs needed to attract a variety of interested people can be formulated.

It is at least a moral obligation for a profession to ensure it imposes no unjustifiable barriers to entry, but only showing that a situation exists is far from addressing it.

John C. Bauer, Manotick, Ontario, Canada

Authors’ Response

In the inaugural Broadening Participation Viewpoint "Opening Remarks" (Dec. 2009), author Ladner outlined three reasons for broadening participation (numbers, social justice, and quality), supporting them with the argument that diverse teams are more likely to produce quality products than teams whose members share the same background. Competence is no doubt a prerequisite for doing anything well, but the combination of diversity and competence is what companies seek to enhance their bottom lines.

Valerie Taylor, College Station, TX and Richard Ladner, Seattle

Overestimating Women Authors?

The article "Gender and Computing Conference Papers" by J. McGrath Cohoon et al. (Aug. 2011) explored how women are increasingly publishing in ACM conference proceedings, with the percentage of women authors going from 7% in 1966 to 27% in 2009. In 2011, we carried out a similar study of women’s participation in the years 1960–2010 and found a significant difference from Cohoon et al. In 1966, less than 3% of the authors of published computing papers were women, as opposed to about 16.4% in 2009 and 16.3% in 2010. We have since sought to identify possible reasons for that difference: For one, Cohoon et al. sought to identify the gender of ACM conference-paper authors based on author names, referring to a database of names. They analyzed 86,000 papers from more than 3,000 conferences, identifying the gender of 90% of 356,703 authors. To identify the gender of "unknown" or "ambiguous" names, they assessed the probability of a name being either male or female.

Our study considered computing conferences and journals in the DBLP database of 1.5 million papers (most from ACM conferences), including more than four million authors (more than 900,000 different people). Like Cohoon et al., we also identified gender based on author names from a database of names, using two methods to address ambiguity: The first (similar to Cohoon et al.) used the U.S. census distribution to predict the gender of a name; the second assumed ambiguous names reflect the same gender distribution as the general population. Our most accurate results were obtained through the latter method, based on the distribution of the unambiguous names.
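
Neither study publishes its code here, but the general procedure described in this letter (look a first name up in a name database, treat mid-range frequencies as ambiguous, and fall back to a population-level distribution) can be sketched roughly as follows; the table entries and cut-offs are illustrative assumptions only:

    import java.util.Map;

    // Rough, hypothetical sketch of name-based gender attribution: look up a first
    // name's female frequency in a name table, treat mid-range frequencies as
    // ambiguous, and leave names outside the table as unknown. Ambiguous names can
    // then be assigned the distribution observed among the unambiguous ones.
    public final class GenderAttribution {

        enum Gender { FEMALE, MALE, AMBIGUOUS, UNKNOWN }

        // Illustrative entries only; a real study would use a census-style database.
        private static final Map<String, Double> FEMALE_FREQUENCY = Map.of(
                "maria", 0.99, "john", 0.01, "alex", 0.45);

        static Gender classify(String firstName) {
            Double p = FEMALE_FREQUENCY.get(firstName.toLowerCase());
            if (p == null) return Gender.UNKNOWN;   // not in the name database
            if (p >= 0.9) return Gender.FEMALE;     // assumed cut-off
            if (p <= 0.1) return Gender.MALE;       // assumed cut-off
            return Gender.AMBIGUOUS;                // handled by the fallback rule
        }

        public static void main(String[] args) {
            for (String name : new String[] {"Maria", "John", "Alex", "J."}) {
                System.out.println(name + " -> " + classify(name));
            }
        }
    }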

We identified the gender of more than 2.6 million authors, leaving out almost 300,000 ambiguous and 1.1 million unknown names (with 220,000 limited to just initials). We performed two tests to validate the method, comparing our results to a manual gender identification of two subsets (thousands of authors) of the total population of authors. In each, the results were similar to those obtained through our automated method (17.26% and 16.19%).

Finally, Cohoon et al. compared their results with statistics on the gender of Ph.D. holders, concluding that the productivity of women is greater than that of men. Recognizing this result contradicts established conclusions, they proposed possible explanations, including "Men and women might tend to publish in different venues, with women over-represented at ACM conferences compared to journals, IEEE, and other non-ACM computing conferences."

However, the contradiction was due to the fact that their estimation of "unknown" or "ambiguous" names overestimated the number of women publishing in ACM conference proceedings.

José María Cavero, Belén Vela, and Paloma Cáceres, Madrid, Spain

Authors’ Response

The difference between our results and those of Cavero et al. likely stems from our use of a different dataset: we conducted our analyses exclusively on ACM conference publications, while Cavero et al. included journals, which have fewer women authors. We carefully tested and verified our approach to unknown and ambiguous names on data where gender was known, showing that women were overrepresented among unknown and ambiguous names. We could therefore construct a more accurate dataset that corrected for the miscounting of women.

J. McGrath Cohoon, Charlottesville, VA, Sergey Nigai, Zurich, and Joseph "Jofish" Kaye, Palo Alto, CA
