
Web Browsing and Spyware Intrusion

The Web browsing habits of users influence the dissemination of spyware and—with enough savvy—will play a critical role in fighting it.

Spyware poses serious privacy and security issues to users of e-commerce and m-commerce [3, 5, 8]. Microsoft claims that half of all computer crashes reported by its customers were caused by spyware and its equivalents. Spyware is also responsible for about 12% of all technical support calls and accounts for the biggest category of customer complaints, according to Dell [1].

Four spyware programs (Gator, Cydoor, SaveNow, eZula) were found to affect nearly 5.1% of active hosts on the University of Washington campus [6]. These are only four of the thousands of spyware programs prevalent today. Indeed, the list is constantly growing despite the availability of advanced security and anti-spyware software. Here, we assess spyware dissemination on a PC by different kinds of Web sites.

The Internet provides access to billions of files in formats for text, image, music, or video. Spyware can use any of these file types for installation. Drawing on the terminology used in the Spyware Guide [10], we present six categories of commonly found spyware in Table 1.

Adware displays advertisements while the browser is running. It may track personal information (via data miners) as well as gather information anonymously or in aggregate [10].

Browser changers alter browser settings such as the start page and search functionality. Typically, the new start pages are low-quality search portals with lots of pay-per-click advertising, and they may transmit the viewed URLs to a server [10].

Browser plug-ins typically appear in the form of search toolbars (for example, search assistants) in browsers. Not all browser plug-ins are harmful; however, many have been known to transmit user data to their creators and are installed using covert means [10].

Bundleware, software included with other software, may be installed via drive-by download in which it tries to install from an unrelated Web page, often triggered by third-party advertisements. Bundleware can be installed when users click “Yes” for download, believing it is needed to load the Web page. Aggressive drive-by download uses coercion tactics to try to get the user to click “Yes,” such as by repeatedly opening error windows when the user refuses to allow downloads [2].

Keyloggers are used to monitor keystrokes and may include logs of Web sites visited, instant messaging sessions, and programs executed [6].

Dialers change dial-up settings on a computer to connect to another location, which may result in expensive long-distance charges [10]. A stealth dialer makes its calls without any prompting from the user. A hijacking dialer changes the default Internet dial-up connection, so that all future Internet connections are routed through numbers that incur expensive charges [2].

Spyware creators are constantly using new ways to exploit old vulnerabilities and are quick to take advantage of newly found vulnerabilities. Many instances of spyware can carry out self-update automatically, which allows new functions to be introduced and can evade spyware signatures contained in anti-spyware tools [6].



Spyware Violations

Even in computing environments that encrypt data, spyware remains a threat because its keystroke-logging components can capture input before the encryption takes place. Many Web sites and pop-up windows take advantage of popular cultural, musical, and artistic trends to attract users and spread spyware on their PCs. For example, users may install popular free game software that contains spyware bundled with it.

Spyware succeeds today because information (such as demographics and behavior of Internet users) creates value to advertisers and product vendors. Gathering keystrokes or introducing backdoor vulnerabilities on a host is also of value to attackers, such as virus writers.

We use the classifications of violations defined by Ad-Aware [4] to categorize the effects of spyware. These categories give a broad framework for classifying spyware violations:

Removal violation occurs when either no uninstaller is provided for the software (spyware) or the uninstaller is nonfunctional, for example because it is poorly coded or buggy [4].

Distribution violation is indicated by stealth installation or lack of clear evidence of intention. Examples include undisclosed bundled installs (ad-supported components, toolbars), confusing end-user license agreements (some installers display this information in an unnecessarily small scrolling box, making it difficult to review), and drive-by downloads [4].

Integration violation refers to the effect that a given spyware candidate has on the stability of a computer system in terms of system performance. This can be in the form of browser-related or system crashes, utilization of processing power, memory consumption, and bandwidth consumption [4].

Behavior violation is indicated by an intentional behavioral change in a computer not made known to the user. For example, a spyware program may carry out another task in the background (hidden from the user). The spyware programs may affect the browser—browser hijack, browser redirection, or replacement of text or graphics [4].

Privacy violation is indicated by a connection to a remote system without the user’s awareness to transmit/receive usage statistics or personal information. Data miners typically cause this type of violation [2].

These violations cover the effects of spyware prevalent on computers today. Many of them occur concurrently, and their significance can vary depending on the characteristics of the user, location, or context.
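Because several violations can occur at once, one way to record them is as a set of categories per spyware instance. The following is a minimal sketch of that idea, not the procedure used by Ad-Aware or by this study; the `Observation` field names and the `violations_for` helper are hypothetical and chosen only to mirror the five definitions above.

```python
from dataclasses import dataclass
from enum import Enum


class Violation(Enum):
    """The five violation categories described in the text."""
    REMOVAL = "removal"
    DISTRIBUTION = "distribution"
    INTEGRATION = "integration"
    BEHAVIOR = "behavior"
    PRIVACY = "privacy"


@dataclass
class Observation:
    """Hypothetical yes/no observations about one spyware instance."""
    has_working_uninstaller: bool
    installed_with_clear_consent: bool
    degrades_stability_or_performance: bool
    changes_behavior_without_notice: bool
    contacts_remote_host_without_notice: bool


def violations_for(obs: Observation) -> set:
    """Return every violation category that applies to one observation."""
    found = set()
    if not obs.has_working_uninstaller:
        found.add(Violation.REMOVAL)
    if not obs.installed_with_clear_consent:
        found.add(Violation.DISTRIBUTION)
    if obs.degrades_stability_or_performance:
        found.add(Violation.INTEGRATION)
    if obs.changes_behavior_without_notice:
        found.add(Violation.BEHAVIOR)
    if obs.contacts_remote_host_without_notice:
        found.add(Violation.PRIVACY)
    return found


# Illustrative input only: a toolbar installed by stealth that redirects
# searches and reports usage, but that can be uninstalled cleanly.
toolbar = Observation(
    has_working_uninstaller=True,
    installed_with_clear_consent=False,
    degrades_stability_or_performance=False,
    changes_behavior_without_notice=True,
    contacts_remote_host_without_notice=True,
)
toolbar_violations = violations_for(toolbar)
```

Returning a set rather than a single label reflects the point that one program can commit several violations concurrently.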


Research Procedure

Here, we assess the types of spyware dissemination and their associated violations among the most commonly visited Web sites across four categories: e-commerce; recreation and entertainment (R&E); download and directory/search; and news and education (N&E).

A complete classification of spyware programs is a mammoth task with many practical limitations. Thus, we focus on the most commonly occurring spyware (see Table 1). Note that malware (for example, viruses, trojans, and worms) is excluded because it can be detected by most anti-virus software and has a more malicious intent. Also, cookies are not included in this study because they are not considered spyware.

Our sample comprises 600 Web sites identified from one of the major ranking sites—www.trafficranking.com—which uses an index called trust gauge (TG) to determine the trustworthiness of Web sites [12]. The TG of a Web site, as defined by trafficranking.com, is determined based on the following factors:

Web site content and business availability, which includes the availability of information such as email addresses, feedback forms, postal addresses, physical locations for clients to visit, customer service or sales office phone numbers, toll-free phone numbers, and the existence of a privacy policy on the Web site.

Web site features and popularity, referring to site features and general popularity among Web users. Some of the criteria used include the presence of secure billing pages and their overall traffic ranking.

Third-party validation checks for the presence of third-party validation. Web sites and businesses that invest in third-party validation and verification are awarded additional points for their effort in attempting to build consumer confidence on the Web.

The TG of Web sites is given in integer form, from 0 to 10. Table 2 presents the TG scoring chart [12].

The sample of 600 Web sites comprises the most frequently visited Web sites across the three TG categories (Table 2) and the four Web site categories presented earlier. In other words, each of the four Web site categories comprises 150 Web sites of which 50 were from each TG category (that is, low, medium, high). These 50 Web sites represented the most highly visited sites in a TG category.
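The sample design above is a 4 x 3 grid of cells with 50 sites each. The sketch below illustrates that arithmetic under stated assumptions: the `ranked_sites` input, the site names, and the `build_sample` helper are hypothetical placeholders, not data or tooling from the study.

```python
from itertools import product

# Category and TG labels as named in the text.
SITE_CATEGORIES = ["e-commerce", "R&E", "download and directory/search", "N&E"]
TG_LEVELS = ["low", "medium", "high"]
SITES_PER_CELL = 50


def build_sample(ranked_sites):
    """Take the most visited sites from each (category, TG level) cell.

    ranked_sites maps (category, tg_level) to a list of site names that is
    already ordered from most to least visited (a hypothetical input).
    """
    return {
        cell: ranked_sites[cell][:SITES_PER_CELL]
        for cell in product(SITE_CATEGORIES, TG_LEVELS)
    }


# Placeholder rankings with made-up names, only to exercise the function.
ranked = {
    (category, tg): [f"{category}-{tg}-site-{i}" for i in range(200)]
    for category, tg in product(SITE_CATEGORIES, TG_LEVELS)
}

sample = build_sample(ranked)
total = sum(len(sites) for sites in sample.values())
per_category = {
    category: sum(len(sample[(category, tg)]) for tg in TG_LEVELS)
    for category in SITE_CATEGORIES
}
print(total)  # 600
```

With 12 cells of 50 sites, each Web site category contributes 150 sites and the whole sample contains 600.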

Trafficranking.com regards a visit as a trip made to a site by an individual. If a visitor went to a site (a unique visit) and looked at various pages, one visit was generated. If he or she came back several hours later, another visit was generated. The number of visits was used to determine the relative popularity ranking of Web sites, which in turn determined the Web sites included in this study. The study was conducted from mid-October to mid-November 2004. It involved visiting each of the 600 sites individually, with the tester simulating the behavior of naïve users by clicking “yes” when drive-by downloads were encountered and allowing code (ActiveX controls and JavaScript) to be downloaded when prompted.
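The visit-counting rule described above (several page views in one trip count once, and a return several hours later counts again) can be sketched as follows. This is an illustration only: the text does not say how long "several hours" is, so the three-hour `VISIT_GAP` is an assumption, and the `count_visits` function and the sample page views are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the source only says "several hours"; 3 hours is a placeholder.
VISIT_GAP = timedelta(hours=3)


def count_visits(page_views, gap=VISIT_GAP):
    """Count visits per visitor from (visitor_id, timestamp) page views.

    A page view starts a new visit when it is the visitor's first, or when
    at least `gap` has passed since that visitor's previous page view.
    """
    last_seen = {}
    visits = {}
    for visitor, timestamp in sorted(page_views, key=lambda view: view[1]):
        previous = last_seen.get(visitor)
        if previous is None or timestamp - previous >= gap:
            visits[visitor] = visits.get(visitor, 0) + 1
        last_seen[visitor] = timestamp
    return visits


# Made-up page views: visitor "a" browses three pages in the morning and
# returns in the afternoon; visitor "b" views a single page.
views = [
    ("a", datetime(2004, 10, 20, 9, 0)),
    ("a", datetime(2004, 10, 20, 9, 5)),
    ("a", datetime(2004, 10, 20, 9, 20)),
    ("b", datetime(2004, 10, 20, 10, 0)),
    ("a", datetime(2004, 10, 20, 15, 0)),
]
visit_counts = count_visits(views)
print(visit_counts)  # {'a': 2, 'b': 1}
```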

The study was carried out on dedicated PCs using a security setting of medium (the default on most computers) in Internet Explorer v6, with Symantec anti-virus software installed and Zone Alarm as the firewall. This configuration is considered typical in most organizations. Our choices for these settings and software were also based on popularity, effectiveness, and availability. Ad-Aware was the spyware tool used in this study for detecting and identifying the spyware. We cross-checked our results with another popular spyware removal tool, Spybot [9], and found a very high degree of consistency in the detection of spyware. Zone Alarm was used to monitor spyware programs. Existing spyware databases were also used to verify the spyware categories identified by anti-spyware tools.


Results

Table 3a presents the extent of distribution of different types of spyware from the sample sites. The data shows that adware is the largest category of spyware disseminated by these Web sites. In fact, the adware count seems to follow the advertising scope in these Web site categories—with R&E having the largest scope followed by e-commerce.

Browser plug-ins, browser changers, and bundled software present the next largest categories of spyware found. Across all Web site categories, 12 kinds of browser plug-ins were search related (such as search toolbars). There were fewer occurrences of dialers and keyloggers compared to other types of spyware. The highest count of dialers appears in the R&E sites. The three instances of dialers in the low TG sites of R&E categories were related to online games that offered free sign up.

The R&E sites also have the highest concentration of browser plug-ins and browser changers. Of the 11 instances of browser changers found in the R&E category (six in high TG, three in medium TG, and two in low TG sites), eight changed the home page setting to an external search Web page, and two of these induced the installation of Japanese script fonts needed to view the forced home page.

The N&E sites have scattered occurrences of spyware in the high TG category. The instances of spyware found in the N&E category primarily forced the installation of software to receive alerts about the latest events. Three of the six instances of spyware in the medium TG sites were related to weather updates. In the low TG sites, all three instances of bundled software were related to stock alerts.

Table 3b provides data on the extent of spyware infection or violation caused by visiting these Web sites. The total number of violations occurring across the different TG ranges of the different categories is shown in the figure, which illustrates that Web sites with a lower TG are responsible for the highest rate of spyware dissemination across all categories. The number of violations decreases from low TG sites to high TG sites across all the categories.

Most of the low- and medium-TG sites responsible for spyware infection have features (layout, aesthetic appeal, and so on) as advanced as those of high-TG sites. They also offer features such as online billing and the facility to store user information. Many Web sites make it mandatory to sign up (usually for free) for an account before any relevant information can be viewed. The confirmation password is sent to the email account the user specifies, thus forcing the user to give an authentic email address. After signing up, the user may be coaxed into signing up for numerous free offers already checked or included on the sign-up form, usually in an obscure manner such as after a large paragraph (with a small scrollbar) offering disclaimer information that users generally ignore.

Even Web sites with a high TG are responsible for spreading spyware. People visit these trusted Web sites repeatedly and are thus continuously infected with spyware. It is interesting to note that many users treat spyware as legitimate software needed to access these Web sites.


Limitations

Although we have tried to be as generic as possible in our classification of Web sites and have assessed spyware dissemination from a fairly large sample of popular sites, certain spyware instances may still have been missed. Another potential problem is that our spyware removal tool could have missed some spyware traffic. Finally, our sample did not include some sites “rich” in spyware, such as pornographic sites.


Conclusion

The problem concerning spyware has a large scope—with user browsing behavior (for example, types of Web sites visited and their frequencies) responsible for much of the spyware dissemination on computers. This article illustrates that spyware infections are widespread, even among highly visited and trusted Web sites. User browsing is bound to increase with the growth of IT, and Web services shall continue to seek to exploit user browsing behavior patterns. The following are some guidelines to minimize spyware infections:

  • Enforcing the usage of high security settings in Internet Explorer that can override the default security settings (medium) or user-defined settings. This can be used to prevent and track sites that repeatedly cause spyware infection.
  • Using Web site blockers and Internet filters whenever possible, particularly for low- to medium-TG sites, and on computers used by many people such as those in computer labs.
  • Controlling and monitoring downloads from Web sites—especially those from low- to medium-TG sites—because many of these downloads are automatic.
  • Enforcing spam controls (for example, SpamJam in the Lotus Notes email client) more stringently to prevent incoming email that can initiate spyware dissemination (via hyperlinks in the email).
  • Using pop-up blockers to suppress pop-ups that lead to low-TG sites. Employing alternate browsers such as Mozilla Firefox, which have built-in pop-up blockers and download managers, can be an effective measure.
  • Using a firewall, which is a strong defense mechanism against spyware. If used effectively, it can block spyware from sending information over the Internet and block the self-update features of spyware.
  • Providing real-time protection against spyware with active updates, which is an effective measure although it occasionally interferes with legitimate pop-ups.

Spyware shall remain an important problem to deal with in the future. Major IT companies such as Microsoft and Symantec have realized the threats posed by spyware and are developing anti-spyware tools: Microsoft’s AntiSpyware scanner (Beta version as of April 2005) and Symantec’s integrated anti-virus, firewall, and anti-spyware (Norton Internet Security 2005 AntiSpyware Edition, launched in May). Also, industry and lawmakers must establish policies and pass new legislation to police the acceptable use of the Internet [7]. User awareness and browsing habits shall continue to play an important role in fighting this problem.


Figures

Figure. Spyware violations.


Tables

Table 1. Types of spyware.

Table 2. Description of trust gauge (TG) categories and ranges.

Table 3a. Analysis of spyware categories.

Table 3b. Analysis of spyware violations.

References

    1. Asaravala, A. Sick of spam? Prepare for adware. Wired News. (May 07, 2004); www.wired.com/news/print/0,1294,63345,00.html.

    2. Doxdesk. Definition of parasite related terms; www.doxdesk.com/parasite/definitions.html.

    3. Galanxhi-Janaqi, H. and Nah, F. U-commerce: Emerging trends and research issues. Industrial Management and Data Systems 104, 9 (2004), 744–755.

    4. Lavasoft; www.lavasoft.com

    5. Nah, F., Siau, K., and Sheng, H. The value of mobile applications: A study on a public utility company. Commun. ACM 48, 2 (Feb. 2005), 85–90.

    6. Saroiu, S., Gribble, S.D., and Levy, H.M. Measurement and analysis of spyware in a university environment. In Proceedings of the First Symposium on Networked Systems Design and Implementation (San Francisco, CA, Mar. 29–31, 2004), 141–153.

    7. Siau, K., Nah, F., and Teng, J. Internet abuse and acceptable Internet use policy. Commun. ACM 45, 1 (Jan. 2002), 75–79.

    8. Siau, K. and Shen, Z. Building customer trust in mobile commerce. Commun. ACM 46, 4 (Apr. 2003), 91–94.

    9. Spybot; security.kolla.de.

    10. SpywareGuide; www.spywareguide.com.

    11. Stafford, T.F. and Urbaczewski, A. Spyware: The ghost in the machine. Commun. AIS 14 (2004), 291–306.

    12. TrustGauge; trustgauge.com/?xcmpx=1474.
