Opinion
Computing Applications Viewpoint

Spam, Spim, and Spit

Abolishing spam, and its morphing forms, will take more than public protest and current filtering and legislative quick fixes.

I hate spam! No, not the canned meat product; unwanted email! I’ve spoken against this (ab)use of the Internet for many years. Eliminating it, however, is not as simple as one could wish.

Most email users, and even those who don’t use it (are there any?), are familiar with the neologism "spam" to refer to unwanted (usually commercial) email. Typically, it comes in large quantities, is originated by programmed systems that generate enormous volumes of traffic, and frequently uses fraudulent or misappropriated source identification to hide its real origin. With the rising use of instant messaging (IM), we may experience similar, unwanted IM sessions originated by programs that have registered automatically as IM users. Some people are now referring to this as "spim." As voice over the Internet becomes increasingly feasible and popular, we may be treated to yet another experience that we might call "spit" for "spam over Internet telephony." I suppose spamming of the mobile telephony Short Message Service is next!

How has all this happened? What can we do about it? This brief article, though by no means comprehensive, seeks to provide some answers or at least some ideas for consideration.

Email has been around in one form or another for a long time. Its emergence in networked environments was most visibly marked by Ray Tomlinson in 1971 when he formulated the now iconic split between mailbox identifier and host name as in "user@hostname" on the ARPANET. As the Domain Name System came into operation in the mid-1980s, this convention was extended hierarchically to read, for example, "user@hostname.cs.ucl.ac.uk."

The community of networked email users in the 1970s and early 1980s ranged from workers involved in the ARPANET project to others exploring similar ideas with academic systems such as CSNET, BITNET, EARN, NETNORTH, USENET, and FidoNet among others. Indeed, by the early 1980s, commercial email services were in use, such as MCI Mail, CompuServe, Telemail, and OnTyme, to name a few. For the most part, these communities were either largely homogeneous (academics in most cases) or closed systems in which users had to log into payable accounts to send and receive email. Linkages were constructed among the academic systems so that email could be exchanged, but it was understood that these systems were not for commercial activity.

In the commercial systems, users paid to use the networked email service and charges were often on a per-message basis. Now they pay not to have email delivered by registering for filtering of unsolicited commercial and virus-laden email from their incoming email streams.


How Did We Get Here?

Email in the academic community has long been a free service, simply another thing you can do on the Net. Standards were developed in the early 1970s and have evolved since then to permit easy exchange of email among parties through the use of Mail Transfer Agents (MTA), Mail Stores (MS), and Mail User Agents (MUA).

The commercial systems were eventually connected to the public Internet starting in 1989 (with MCI Mail and CompuServe among the first to accomplish this). This interconnection changed the economics of email exchange because senders on the Internet side could issue email to targets on the commercial side without cost beyond the cost of being on the Internet in the first place. In this same year, commercial Internet service providers emerged. By 1994, the World Wide Web had burst onto the public Internet scene and the rapid growth of users on the public system continued. Eventually, Web-based email services were developed that added to the ways in which users could originate and receive email.

Moreover, the homogeneity of the email user community gave way to the demographics of the general public and the shelter of closed, commercial systems gave way to global connectivity with the public Internet users. These twin effects prepared the ground for what we now call spam.

Unlike commercial postal mail, which has a cost to the sender, commercial email is nearly cost-free to the sending party. Of course, the recipient of vast quantities of spam pays an implicit price in wasted time, storage, and possibly even expense in downloading unwanted email. The low cost to the sender in turn lowers the break-even response rate to a very small fraction. If only a tiny number of recipients actually respond, the cost of sending the spam can be recovered and after that, it is all profit.
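
The break-even arithmetic can be made concrete with a small sketch. The function name and every figure below are hypothetical, chosen only to show the shape of the calculation; they are not measurements of real campaigns.

```python
def break_even_response_rate(cost_per_message: float, profit_per_response: float) -> float:
    """Fraction of recipients who must respond before sending costs are recovered."""
    if profit_per_response <= 0:
        raise ValueError("profit_per_response must be positive")
    return cost_per_message / profit_per_response


# Hypothetical figures, for illustration only.
postal_rate = break_even_response_rate(cost_per_message=0.50, profit_per_response=25.0)
email_rate = break_even_response_rate(cost_per_message=0.0001, profit_per_response=25.0)

print(f"postal mail: {postal_rate:.4%} of recipients must respond")
print(f"bulk email:  {email_rate:.4%} of recipients must respond")
```

Under these assumed numbers, the postal mailer needs roughly one response in 50, while the bulk emailer needs roughly one in 250,000.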

Contributing to the problem is the openness of the Internet. Peer-to-peer interactions lie at the heart of the network’s adaptability and opportunity for the creation of new services. New protocols and applications can be developed without the need to obtain permission from ISPs. In the benign early days of the Internet, MTAs would often provide a relay service to anyone speaking the correct protocol, the Simple Mail Transfer Protocol (SMTP). In today’s world, non-authenticating open relays represent an avenue for abuse. This leads to a significant tension between the valuable open environment of the Net and its abuse by spammers.

The vulnerability of networked hosts to infection by viruses, worms, and Trojan horses is another major problem. Millions of compromised hosts have become "zombie armies" capable of generating enormous volumes of anonymous spam.

What Can We Do About Spam?

One of the first ideas for spam elimination is filtering at the receiving end. It should be apparent that attempts to recognize spam by analyzing content have their limitations. A spammer can easily make a message look as close to "legitimate" email as desired. A trivial example of this is copying an actual email message and then either adding an attachment or incorporating essentially uninterpreted images in HTML-encoded email. Another common tactic is to "scrape" a screen image from a legitimate Web site, incorporate this HTML-encoded object in the body of the email message, then add whatever other messages are desired. Moreover, filtering on receipt does not defend against denial-of-service "email bombs" that add to recipient costs.
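
The limits of content analysis can be seen in a deliberately naive sketch. The phrase list, weights, and threshold below are hypothetical and stand in for whatever a real filter would learn or configure; the point is only that a message whose pitch is carried in an image gives such a filter nothing to score.

```python
# Hypothetical phrases and weights; a real filter would use far richer features.
SUSPECT_PHRASES = {"free money": 3, "act now": 2, "no obligation": 2}
THRESHOLD = 3  # hypothetical cut-off score


def spam_score(subject: str, body: str) -> int:
    """Sum the weights of suspect phrases found in the subject or body."""
    text = f"{subject}\n{body}".lower()
    return sum(weight for phrase, weight in SUSPECT_PHRASES.items() if phrase in text)


def looks_like_spam(subject: str, body: str) -> bool:
    return spam_score(subject, body) >= THRESHOLD


# A plain-text pitch is caught by the phrase list.
obvious = looks_like_spam("Act now", "Free money, no obligation!")

# The same pitch rendered as an image leaves no text to match.
disguised = looks_like_spam("Meeting notes", '<img src="offer.png">')

print(obvious, disguised)
```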

Although filtering is not perfect, it is useful. Email services and even email client software commonly incorporate some form of filtering based on the content in the body of the message, the subject field, even the apparent origination of the message. For a time, spammers were using false information in the "from" field of email messages to render the message unreplyable and to hide the actual source of the message. More recently, spammers simply harvest legitimate email addresses from a variety of sources including popular email distribution lists and use these to make the email appear legitimate.

The latter practice renders another method of filtering, the so-called "white list" or "black list," less effective. White lists are lists of source email addresses that you will automatically accept as legitimate. Black lists are lists of addresses from which you will automatically drop email upon receipt. Because the "from" field is not authenticated, the ease of falsifying it renders source address filtering of limited utility.
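
A minimal sketch of source address filtering, assuming hypothetical list contents and a made-up `classify_by_sender` helper, shows why the unauthenticated "from" field undermines it. The only library call used is `email.utils.parseaddr` from the Python standard library.

```python
from email.utils import parseaddr

# Hypothetical list contents, for illustration only.
WHITE_LIST = {"colleague@example.org"}
BLACK_LIST = {"bulk@spammer.example"}


def classify_by_sender(from_header: str) -> str:
    """Return 'accept', 'drop', or 'inspect' based only on the claimed sender."""
    _display_name, address = parseaddr(from_header)
    address = address.lower()
    if address in BLACK_LIST:
        return "drop"
    if address in WHITE_LIST:
        return "accept"
    return "inspect"


# The filter sees only what the header claims. A spammer who has harvested
# a white-listed address and placed it in the "from" field is accepted.
forged = classify_by_sender('"A Colleague" <Colleague@Example.org>')
print(forged)
```

Nothing in the check ties the header to the party that actually sent the message, which is the weakness described above.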

Some spamming is accomplished by exploiting vulnerabilities in email clients that will execute scripts that can reference the personal email address books of the compromised user. All of these mechanisms increase the difficulty of filtering email based on content, but filtering remains a useful if not entirely effective tactic.

Given the public’s widespread hatred of spam, it is not surprising that spammers work hard to obscure the origins of their email. Software is available that is intended to do exactly that, and there are services on the Net that are intended to "anonymize" the origins of email (and Web clients). The latter are sometimes justified on the grounds that whistle-blowing may require anonymity and while this is not an unreasonable argument, it is obviously exploitable by spammers who have a different reason for obscuring their identities.

Authentication. Closed email systems require all users to authenticate themselves to the service provider before sending email. Of course, spammers can always sign up as legitimate users but then they would lose the ability to obscure the origins of the email. The utility of authentication is weakened when virtually anyone can sign up, as is common with Web-based email services. Attempts have been made to create barriers for automated programs to establish temporary accounts by presenting Web pages with visually complex images and fonts that require a human to interpret the images and recognize the required authentication login strings visually. Unfortunately, while these methods have had some success they can be overcome simply by having a person create the account(s), so spammers can still use these accounts to originate spam.

Digital signatures have been suggested as another way to implement the white list or black list concept. Correspondents require that originators digitally sign their email, and only digitally signed email is accepted. Unfortunately, the mechanics of maintaining lists of public keys for authentication and associating them with originating email addresses are not as straightforward as one would like. And since these keys (or the passwords permitting their use) are often forgotten, keys change and keeping up is a problem. Moreover, just because a message is signed does not mean you know where it came from. More elaborate procedures and services are required to achieve this objective. While such practices can be made to work at small scale or within hierarchically managed organizations, making this work for the general public has been a challenge that has not been fully overcome.

Some email service providers refuse to accept the injection of email from parties whose Internet addresses are not recognized by the email service provider as coming from their community of users. This practice is sometimes a cause of frustration for users who are mobile and connect to the Internet through public access mechanisms such as WiFi hot spots or hotels wired with high-speed Internet service, or through dial-up accounts offered by a provider other than the email relay service. The use of authentication between the email client and the MTA, such as through an authenticated and possibly encrypted tunnel to a virtual private network, can overcome this barrier. Combined with refusing to relay email from unauthenticated parties, this can be an effective way to block anonymous spam.
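
The difference between the two relay policies can be sketched as follows. The function names and the customer network are hypothetical (the addresses come from ranges reserved for documentation), and a real MTA would make this decision inside its SMTP session handling rather than in standalone functions.

```python
import ipaddress

# Hypothetical customer address block (a range reserved for documentation).
CUSTOMER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def address_based_policy(client_ip: str) -> bool:
    """Relay only for clients whose address lies inside the provider's own networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in CUSTOMER_NETWORKS)


def authentication_based_policy(authenticated: bool) -> bool:
    """Relay only for clients that have authenticated, wherever they connect from."""
    return authenticated


# A customer travelling and connecting from a hotel network (hypothetical address).
hotel_ip = "198.51.100.7"
print(address_based_policy(hotel_ip))     # refused under the address rule
print(authentication_based_policy(True))  # relayed once the client authenticates
```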


Other attempts to use authentication involve the creation of a set of mail transport agents that know about each other, can authenticate their email sessions and can confirm that they have rejected transfer of email from parties unknown. For someone operating his or her own MTA, this may prove a source of genuine complaint and frustration unless an avenue is available to authorize personal MTA operation through publicly accessible relays. A number of ISPs have responded to the spam problem by shutting down access to TCP port 25 (email SMTP port) for any hosts other than their own mail relays. Obviously, this measure has the side effect of interfering with peer-to-peer transfer of email, frustrating at least a portion of the Internet user community that legitimately operates its own mail relays, but it does interfere with abusive access to public email relays by spammers who may also be operating their own mail relays.

Another user-based authentication scheme that attempts to weed out automatically generated spam requires the sender to confirm that he or she originated the email. While some senders find this onerous and even insulting, it does have some positive effect on the filtering of email by holding it until confirmed. Of course, this introduces indeterminate delay in the delivery of email and consequently may degrade its utility. Furthermore, the method does not work well with distribution lists.
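
A sender-confirmation scheme of this kind might be sketched as below. The `ConfirmationQueue` class and its method names are hypothetical, and the sketch leaves out how the challenge reaches the sender and how long held mail is kept.

```python
import secrets
from typing import Optional


class ConfirmationQueue:
    """Hold mail from unknown senders until they confirm they sent it."""

    def __init__(self) -> None:
        self.confirmed_senders = set()
        self.held = {}   # challenge token -> (sender, message)
        self.inbox = []

    def receive(self, sender: str, message: str) -> Optional[str]:
        """Deliver mail from confirmed senders; otherwise hold it and return a token."""
        if sender in self.confirmed_senders:
            self.inbox.append(message)
            return None
        token = secrets.token_urlsafe(16)
        self.held[token] = (sender, message)
        return token

    def confirm(self, token: str) -> bool:
        """Release the held message for this token and remember the sender."""
        entry = self.held.pop(token, None)
        if entry is None:
            return False
        sender, message = entry
        self.confirmed_senders.add(sender)
        self.inbox.append(message)
        return True


queue = ConfirmationQueue()
challenge = queue.receive("friend@example.org", "Lunch on Friday?")
# Until the sender answers the challenge, the message waits in the held queue.
queue.confirm(challenge)
```

An automated sender that never answers the challenge leaves its message held indefinitely, which is the filtering effect; the wait before a legitimate sender confirms is the delivery delay noted above.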

Legislation and law enforcement. In an effort to provide ISPs with additional tools and grounds for refusing service to spammers, a number of jurisdictions at local, state, and national levels have passed laws intended to outlaw spam. In the U.S., the so-called CAN-SPAM Act makes it unlawful, among other things, to originate spam in a manner intended to disguise or falsify the identity of the sender. CAN-SPAM also requires clear and effective opt-out mechanisms. Although CAN-SPAM will not eliminate spam (the Act does not outlaw spam sent in compliance with it), it may help to reduce the annoyance factor by providing recipients with a clear ability to contact the sender and opt out of future communications. Some U.S. states have more stringent limitations on spamming. In the state of Virginia, where I live, it is illegal to market or distribute software primarily intended to facilitate the sending of spam in a fraudulent manner. This goes a step beyond the federal CAN-SPAM Act, and is a potentially powerful tool, but it applies only within the state’s jurisdiction. Prosecution of parties offering such material outside the state of Virginia may not meet jurisdictional requirements. The problem is magnified dramatically when the offending parties may be anywhere on the global Internet.

Finally, the legislative process is generally a lengthy one, and is not well-suited for rapid changes in response to emerging technologies and abuses. We can’t rely on regulation alone to address the current and future issues of unsolicited communications. And in many instances, some regulation may be worse than no regulation if the existing legal requirements inhibit the ability of private entities to act expeditiously in the face of changing threats.

Acceptable Use Policies, Commercial Speech, and the First Amendment

Acceptable Use Policies (AUPs) are an effective tool for ISPs to combat various types of network abuse and unwanted activity. Based in contract law, AUPs are enforceable across political jurisdictions and can be adapted to meet new and emerging forms of network abuse.

However, in crafting AUPs, ISPs and other network providers must balance between protecting their networks from unwanted conduct on the one hand and serving as censors of objectionable content on the other. Clearly, there is much information and activity on the Internet that is objectionable to some segment of the viewing community. Common examples include pornography and many forms of political speech. Similarly, distributing various types of software, such as spamware, malware, DVD-copying software, encryption software, and personal spyware ("find out what your spouse is doing on the Net!"), raises the ire of various special-interest groups, privacy advocates, and law enforcement.

ISPs must strike the right balance if they are to maintain their ability to both "evict" bad actors and continue to serve as a neutral provider of global connectivity. And ISPs must remain vigilant—as various forms of abuse evolve, ISPs should reconsider their policies to ensure that they maintain the proper balance. Of course, this process can take time, especially if it involves changes to contractual arrangements with myriad customers, but it will certainly always be faster than the legislative process.

Finally, in many parts of the world, and particularly in the U.S., this balancing process must be undertaken in a cultural context that values free speech.

Whose judgment will prevail in determining what is acceptable and what is not? What is to be done about cross-jurisdictional boundaries, especially when they might be international? At what point does the practice cross the line from public protection to state-sponsored censorship? In the U.S., strong First Amendment laws limit the power of the state to bar speech.

The original intent of the First Amendment was to assure that the public could criticize the government (local, state, federal) with impunity. Thus, its initial interpretations tended to deal with political speech. In the ensuing two centuries, the concepts surrounding free speech have expanded to broader terms. One hears terms such as "protected speech" applied to adult discourse that may be unsuitable for children, and to commercial speech in the form of bulk commercial postal mail and the like. Some speech is definitely not protected. Nor is all conduct the equivalent of "speech." Copyright violations are not protected. Nor is computer hacking, or distributing viruses, Trojans, or other malicious code. And, under virtually all ISP AUPs, neither is the sending of unsolicited email.

In my opinion, preservation of the openness of the Internet is an important objective. ISPs must be largely transparent with regard to their provision of access to and transport through the Internet. But there are limits to this doctrine and they can be drawn at illegal activities and at the point where the public interest is harmed by abuse. I have spent a significant part of my career looking for ways to stem abuse of the Internet while staying mindful of the importance of open access to the resources of this global system. While some, and perhaps even most, abuses may be best dealt with through law enforcement, some, such as spam, may be addressed through acceptable use policies.

I have argued, though not always successfully, for revision of and strong enforcement of acceptable use policies as a tool for protecting public interest. It also must be recognized that ISPs have a business to run and must assure fair and equitable treatment to all customers. These tensions must be resolved in sensible business practices.

Conclusion

While this essay can hardly be considered all-inclusive, it is intended to inspire thoughtful discussion among parties concerned about the abuse of the Internet. In particular, I hope organizations responsible for the provision of various Internet-based services will give serious consideration to strengthening the tools they have at hand to reduce the ability of spammers to inflict their traffic on unwilling recipients.
