Extensions written with benign intent can have subtle security-related bugs, called vulnerabilities, that expose users to devastating attacks from the Web, often just by viewing a Web page. Firefox extensions run with full browser privileges, so attackers can exploit extension weaknesses to take over the browser, steal cookies or protected passwords, compromise confidential information, or even hijack the host system, without revealing their actions to the user. Unfortunately, dozens of extension vulnerabilities have been discovered in the last few years, and capable attacks against buggy extensions have already been demonstrated.11
eval statement that execute dynamically generated code).
We show that VEX identifies five previously known vulnerabilities, and identifies other flows that led to the discovery of seven previously unknown vulnerabilities, including vulnerabilities in the extensions WIKIPEDIA TOOLBAR, MOUSE GESTURES, and KAIZOU.
In this article, we focus on finding security vulnerabilities in buggy browser extensions. We do not try to identify malicious extensions, bugs in the browser itself, or bugs in other browser extensibility mechanisms, such as plug-ins. We assume that the developer is neither malicious nor trying to obfuscate extension functionality, but we assume the developer could write incorrect code that contains vulnerabilities.
According to the Mozilla developer site, Mozilla has a team of volunteers who help vet extensions manually. They run new and updated extensions isolated in a virtual machine to test the user experience. The editors also use a validation tool, which uses grep to look for key indicators of bugs. Many of the patterns they search for involve interactions between extensions and Web pages, and they use their understanding of these patterns to help guide their inspection of the code. Our goal is to help automate this process, so that analysts can quickly home in on particular snippets of code that are likely to contain security vulnerabilities. Figure 1 shows our overall workflow for using VEX: when extensions are analyzed by VEX, it reports precise code paths from untrusted sources to executable sinks in the extensions' code, which an expert must manually examine to check whether they can be used to mount an attack.
Firefox has two privilege levels: page for the Web page displayed in the browser's content pane, and chrome for elements belonging to Firefox and its extensions. Page privileges are more restrictive than chrome privileges. For example, a page loaded from illinois.edu can only access content from illinois.edu. Firefox code and extensions run with full chrome privileges, which enable them to access all browser states and events, OS resources like the file system and network, and all Web pages. Extensions also can include their own user-interface components via a chrome document, which can run with full chrome privileges.
3.1. Untrusted sources
(document.popupNode) when the user right-clicks on document object model (DOM) elements. If this DOM element is part of the page content, then it includes untrusted page content.
One API that extensions use to access persistent state is the Resource Description Framework (RDF). RDF is a model for describing hierarchical relationships between browser resources17 and is used by the browser to store persistent data, like bookmarks. Extension developers can store persistent extension data in an RDF file, or access browser resources stored in RDF format. However, RDF data can come from untrusted sources. For example, when a user stores a bookmark, Firefox records the un-sanitized title of the bookmarked page, which is controlled by the Web page, in an RDF file. Extensions can also access un-sanitized bookmark URLs using the
nsILivemarkService interface and the
Extensions access Firefox preferences through the nsIPrefService interface. Any extension can set values in the preferences, and extensions have unchecked access to all preference settings. Some extensions use this service to store untrusted strings obtained from Web page content; hence using this service is also treated as an untrusted source.
In summary, VEX treats the following as untrusted source objects: window.content.document, document.popupNode, BookmarksUtils, and new instances of the objects nsIRDFService, nsILivemarkService, and
3.2. Executable sinks
Each HTML element in a page has an innerHTML property that defines the text that occurs between that element's opening and closing tags. Extensions can change the
onload attribute) into innerHTML can lead to code injection attacks.
Extensions can add a new DOM element to a content page or chrome page by using the appendChild method. This method causes the browser to parse and process the data within the element, similar to the innerHTML property. Therefore, this feature can also be used to execute injected code.
In summary, the executable sinks that we consider in VEX are calls to the functions appendChild, and assignments to
In this section, we describe the abstract heap in detail, followed by the data structures used for the static analysis and the high-level ideas behind VEX's static analysis.
loc_Global has five properties, ObjectProt, FunctionProt, Array, ArrayProt, and array_instance, pointing to the nodes loc_ObjProt, loc_FunProt, loc_1, loc_ArrayProt, and loc_4, respectively. Every node in the heap is associated with a taint value, HIGH or LOW, with HIGH representing the untrusted objects and LOW representing the trusted objects. High taints and low taints are represented by red and blue nodes, respectively, in the figures (all nodes in Figure 2 are LOW). Figure 3 shows the initial abstract heap representation of the window.content.document object and the window.document object; notice that one of the nodes, loc_document, has a high taint.
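The heap description above can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not VEX's implementation: the helper names makeNode and joinTaint are hypothetical, and only a few of the nodes named in the text are wired up.

```javascript
// Illustrative sketch only: the helper names below are assumptions made
// for this example and are not taken from the VEX implementation.
const HIGH = "HIGH"; // untrusted
const LOW = "LOW";   // trusted

// A heap node has a name, a taint, and a map from property names to nodes.
function makeNode(name, taint = LOW) {
  return { name, taint, props: new Map() };
}

// A few of the nodes named in the text; all are LOW except loc_document.
const locGlobal = makeNode("loc_Global");
const locArrayProt = makeNode("loc_ArrayProt");
const loc4 = makeNode("loc_4");
const locDocument = makeNode("loc_document", HIGH);

// Properties of loc_Global point to other heap nodes.
locGlobal.props.set("ArrayProt", locArrayProt);
locGlobal.props.set("array_instance", loc4);

// Combining two taints yields HIGH as soon as either input is HIGH.
function joinTaint(a, b) {
  return a === HIGH || b === HIGH ? HIGH : LOW;
}
```

With this representation, a value computed from both a LOW node and a HIGH node would carry joinTaint(LOW, HIGH), that is, HIGH.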
@Proto property, which is used to specify inheritance chains. Additionally, every function (that can be used as a constructor in new) has a prototype property. This prototype property is used to instantiate the @Proto property when a new object is created using the function constructor. An object inherits all the properties of its @Proto and of all the objects in the prototype's
Figure 2 illustrates how VEX handles prototype-based inheritance. The
loc_1 in the figure. Since the Array object is a constructor, which can be used to create new instances of the array, it has a prototype field pointing to the object ArrayProt, represented in the graph by the node loc_ArrayProt. A new array_instance object is created in the program using the statement array_instance = new Array(). In Figure 2, loc_4 represents the array_instance object. The @Proto field of this object points to the object loc_ArrayProt. Therefore, the push method is accessible to the array_instance object and can be called using the
Array object is defined in Figure 4 to be a function object with the
@Proto properties initialized to the string "Function", identifier
ArrayProt, and identifier
FunctionProt, respectively. The variables
ArrayProt point to the prototype objects, which contain the various functions like
Browser's DOM API and XPCOM components: VEX treats most of the browser's DOM API and XPCOM components as uninitialized variables, fields, and functions. However, VEX provides explicit function summaries for the API components and objects that it needs to keep track of in order to trace the flows to and from the objects. VEX analysis sets the taint of the objects that represent insecure sources, or that depend on insecure sources, to HIGH.
Dynamically generated code: The
eval statements. VEX analysis performs a constant-string analysis for strings and string operations. If the actual parameters to the eval statement evaluate to a constant string, VEX's static analysis engine parses these constant strings and inserts them into the program flow just after the eval statement. This ensures that these newly parsed statements are included in the computation of the taint. In most correct extensions, an eval-ed statement is dynamically chosen from a set of constant strings or taken from trusted sources, and hence evaluates to a constant string on the path explored (and is tracked accurately by VEX). Parameters to eval whose exact string values are not statically inferred by VEX along the path explored are tested to check if they are tainted. If there is a flow from an untrusted source to an eval, VEX will report this flow, as it corresponds to a vulnerable flow pattern.
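The treatment of eval described above can be sketched as a decision over abstract string values. This is a minimal sketch under stated assumptions: the abstract-value shape and the names constant, unknown, concat, and analyzeEval are hypothetical and chosen for this example, and real constant-string analysis covers many more string operations than concatenation.

```javascript
// Illustrative sketch only: abstract string values and helper names are
// assumptions for this example, not VEX's actual data structures.

// A string whose exact value is known statically.
function constant(value) {
  return { kind: "const", value, taint: "LOW" };
}

// A string whose value is not known statically; only its taint is tracked.
function unknown(taint) {
  return { kind: "unknown", taint };
}

// Abstract concatenation: the result is a constant only when both operands
// are constants; otherwise it is unknown and HIGH if either operand is HIGH.
function concat(a, b) {
  if (a.kind === "const" && b.kind === "const") {
    return constant(a.value + b.value);
  }
  const taint = a.taint === "HIGH" || b.taint === "HIGH" ? "HIGH" : "LOW";
  return unknown(taint);
}

// Decision at an eval site, following the three cases in the text.
function analyzeEval(arg) {
  if (arg.kind === "const") {
    // Constant string: parse it and analyze it as ordinary program code.
    return { action: "inline", code: arg.value };
  }
  if (arg.taint === "HIGH") {
    // Unknown string that depends on an untrusted source: report the flow.
    return { action: "report" };
  }
  // Unknown but untainted string: nothing to report.
  return { action: "skip" };
}
```

For example, analyzeEval(concat(constant("f("), constant(")"))) asks for the string "f()" to be inlined, whereas concatenating a constant with unknown("HIGH") leads to a report.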
push method of an array can be called with any number of arguments and the arguments will be appended to the end of that array. To handle this in VEX, the object representing the push method has a special property indicating that it can take a variable number of arguments and when the method is called, VEX analysis conservatively appends the taints of all the arguments to the push method to the array object on which the method is called.
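A function summary of this kind might look as follows. The names joinTaint and applyPushSummary, and the representation of abstract objects as records with a taint field, are assumptions made for this sketch rather than details taken from VEX.

```javascript
// Illustrative sketch only: names and object shape are assumptions.

// Combining two taints yields HIGH as soon as either input is HIGH.
function joinTaint(a, b) {
  return a === "HIGH" || b === "HIGH" ? "HIGH" : "LOW";
}

// Summary for a variadic method such as push: fold the taints of all
// arguments, however many there are, into the receiver array's taint.
function applyPushSummary(receiver, ...args) {
  for (const arg of args) {
    receiver.taint = joinTaint(receiver.taint, arg.taint);
  }
  return receiver;
}

// Usage: one tainted argument among several taints the whole array.
const arr = { name: "array_instance", taint: "LOW" };
applyPushSummary(arr, { taint: "LOW" }, { taint: "HIGH" }, { taint: "LOW" });
```

The summary is conservative in the sense described in the text: the array is marked as a whole, without tracking which element carries the untrusted value.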
4.3. A note on soundness
Most static analysis tools, such as those used in compilers and those used in abstract interpretation, over-approximate the concrete semantics, and hence are sound. In the context of flow analysis, a sound tool never reports that a program has no flows when it has one. Soundness often entails a large number of false positives, i.e., flows that are reported by the tool but may not actually ever happen during execution.
Instead of aiming for soundness, we concentrated on making VEX fairly accurate on paths in the program, without collapsing (merging) the nodes of the heap in any way. However, since VEX can only analyze a finite number of paths in the program (obtained by unrolling recursion a bounded number of times) in this accurate manner, the analysis VEX performs is inherently not sound.
False positives are also, of course, still possible in VEX, i.e., VEX may report flows that actually do not exist in the program. This stems from the fact that the analysis uses an abstraction. In particular, imprecise information for evaluating conditionals and the inability to determine precisely the values of strings passed to eval statements are common sources of false positives. Compared to classical heap analysis that merges nodes in heaps, VEX performs a much more accurate analysis that reduces the number of false positives considerably. In experiments, we found that VEX produces very few false positives.
During the execution of the program using the abstract operational semantics outlined in Section 4, if the program reaches a vulnerable sink, VEX checks if the inputs or assignments to the sink are tainted. If they are tainted, VEX reports the occurrence of the flow along with the source objects and sink locations in the code. The source objects are the objects described in Section 3 and the sink locations are the points where the sinks described in Section 3 are encountered during the execution. The rest of this section summarizes our results.
The number of loop unrollings can be set as a parameter in the VEX analysis engine (in our experiments, a bound of just one was used). The VEX implementation has a number of optimizations to improve memory usage and speed. To save memory, abstract heaps are freed when backtracking in the depth-first search. But to save time, abstract heaps at join points are cached and compared when other paths hit these points, to avoid exploring paths unnecessarily.
5.1. Evaluation methodology
The extensions we analyzed were chosen as follows. First, in October 2008, we built a suite of extensions using a random sample of 1827 extensions from the Mozilla add-ons Web site, by downloading the first extensions in alphabetical order for all subject categories. This extension suite had two extensions with known vulnerabilities. In November 2009, we downloaded 699 of the most popular extensions and 8 extensions with known vulnerabilities. The random sample and the popular extensions had 74 extensions in common, for a total of 2460 extensions. Our suite includes multiple versions of some extensions, allowing cross-version comparisons. For instance, we found a new version of FIZZLE (see Bandhakavi et al.2) to be vulnerable even though its authors tried to fix the vulnerabilities in the previous version.
To evaluate the effectiveness of VEX, we perform two kinds of experiments. First, we run VEX on the downloaded extensions and check if any of them have one of the malicious flow patterns. Second, we check if VEX can detect known extension vulnerabilities.
5.2. Experimental results
Finding flows from injectable sources to executable sinks: Figure 5 summarizes the experimental results for flows that are from injectable sources to executable sinks (flows for which the sinks are
innerHTML). Of the 2460 extensions analyzed by VEX, a grep showed that a total of 977 extensions contained the string "eval", the string "innerHTML", or both.
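A textual prefilter of this kind is simple to express. The following sketch assumes the extension source is available as a single string; the function name mentionsCandidateSink is hypothetical, and the check is purely textual, like grep.

```javascript
// Illustrative sketch only: the function name is an assumption.
// Returns true when the source text contains "eval" or "innerHTML".
function mentionsCandidateSink(source) {
  return /eval|innerHTML/.test(source);
}

// Usage on three small snippets.
const hits = [
  "node.innerHTML = title;",
  "eval(gesture);",
  "console.log('hello');",
].map(mentionsCandidateSink);
```

Because the match is textual, it also fires on identifiers such as evaluate or on occurrences inside comments, which is why it serves only to narrow the candidate set and not to establish a flow.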
The first column of Figure 5 indicates the exact source-to-sink flow pattern checked by VEX. The second column indicates the number of extensions on which VEX reports an alert with corresponding flows. On average, VEX took 11.5 s per extension. It took about a week to analyze all the extensions with flows from untrusted sources to
To look for potential attacks, we manually analyzed the extensions with suspect flows found by VEX, spending about 20 min per extension on average. The next column reports the number of extensions on which we could engineer an attack based on the flows reported by VEX. We were able to attack nine extensions, of which only two extensions (FIZZLE VERSION 0.5 and BEATNIK V-1.0) were already known to be vulnerable. The rest of the attacks are new.
The next column shows the extensions where the source is provided either by the extension user or the extension developer or computed from the system parameters by the extension. The values are either stored in the preferences or in a local file. Since we trust the users and extension developers in our trust model, these extensions are considered to be non-vulnerable. However, if the preferences file or the local file system is corrupted in any way, these extensions can be attacked.
The fifth column shows the extensions where the source is code from a Web site, and where an attack is possible provided the Web site can be attacked. In other words, these extensions rely on a trusted Web site assumption (e.g., that the code on the Facebook Web site is safe). We think that these are valid warnings that users of an extension (and Mozilla) should be aware of; trusted Web sites can after all be compromised, and the code on these sites can be changed leading to an attack on all users of such an extension.
New vulnerabilities discovered: The number of security vulnerabilities discovered is shown in column 3 in Figure 5, of which 7 are new. WIKIPEDIA TOOLBAR versions V-0.5.7 and V-0.5.9 have flows from
eval, which leads to attacks. MOUSE GESTURES REDOX V-2.0.3 has flows from
eval, which also led to an attack. BEATNIK V-1.2, FIZZLE V-0.5.1, and FIZZLE V-0.5.2 are also attackable, and have flows from
innerHTML. KAIZOU V-0.5.8 has a flow from
innerHTML which leads to attacks. Section 5.3 gives some details about the flows and the attacks in some of the vulnerable extensions. Details about FIZZLE (and BEATNIK) vulnerabilities can be found in the previous version of this article.2
Known vulnerabilities detected: Apart from the new vulnerabilities found by VEX, there are several extensions that have been reported to be vulnerable in the past. In the course of our research, we found 18 unique extensions that were reported to be vulnerable in various databases like CVE, Secunia, etc. Of these 18, we did not find the source code for 5 extensions (GREASEMONKEY v 0.3.5, WIZZ RSS v < 220.127.116.11, SKYPE v 18.104.22.168, MOUSEOVERDICTIONARY v < 0.6.2, POW v < 0.0.9), so we did not analyze them. Of the remaining 13 extensions, we found that 10 of them can potentially be found using explicit information flow analysis techniques, like VEX.
Currently, VEX can detect 5 of the above 10 known extensions that have flow-based vulnerabilities: FIZZLE V-0.5, BEATNIK V-1.0, COOLPREVIEWS V-2.7, 2.7.2, INFORSS V-<=22.214.171.124, and SAGE V- < 1.3.9, <=1.4.3. COOLPREVIEWS has flows from
appendChild. INFORSS has flows from
appendChild. SAGE has flows from
BookmarksUtils to an object accessing the local file system using the
Finally, there were three extension vulnerabilities (for which we had the source) that cannot be found by VEX because they are not flow vulnerabilities. These vulnerabilities include attacks on a file server (e.g., FIREFTP V < 0.97.2, < 1.04), and directory traversal attacks (e.g., NAVIGATIONAL SOUNDS version-1.0.2, AJAX YAHOO MAIL VIAMATIC WEBMAIL version-0.9) when a chrome package is "flat" rather than contained in a jar. In both of these cases, an attacker can escape from the extension's directory and read files in a predictable location on the disk. Such attacks are not related to chrome privilege escalation, so VEX does not handle them.
5.3. Successful attacks
Attack scripts: All our attack scenarios involve a user who has installed a vulnerable extension and visits a malicious page, and who, either automatically or by invoking the extension, triggers a script written on the malicious page to execute in the chrome context. Figure 6 illustrates an attack payload that can be used in such attacks: this script displays the folders and files in the root directory.
The attack payloads could be much more dangerous: the attacker could gain complete control of the affected computer using XPCOM API functions. More examples of such payloads are enumerated in the white paper by Freeman and Liverani.7 In this section, we illustrate a few attacks on extensions with previously unknown vulnerabilities.
Wikipedia Toolbar, up to version 0.5.9: If a user visits a Web page with the directory display attack script in its <head> tag, and clicks on one of the Wikipedia toolbar buttons (unwatch, purge, etc.), the script executes in the chrome context. The attack works because the extension has the code given in Figure 7 in its toolbar.js file.
The first line gets the first <script> element from the Web page and executes it using eval. The extension developer assumes the user only clicks the buttons when a Wikipedia page is open, in which case <script> may not be malicious. But the user might be fooled by a malicious Wikipedia spoof page, or accidentally press the button on some other page. VEX led us to this previously unknown attack, which we reported to the developers, who acknowledged it, patched it, and released a new version. This resulted in a new CVE vulnerability (CVE-2009-4127). The fix involved inserting a conditional in the program to check if the URL of the page is in Wikipedia's domain and evaluating the script only if this is true.
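A domain check of the kind described could be sketched as follows. This is not the developers' actual patch: the function name isWikipediaPage is hypothetical, and the choice of wikipedia.org plus its subdomains as "Wikipedia's domain" is an assumption made for this example. The sketch uses the standard URL class.

```javascript
// Illustrative sketch only, not the actual fix shipped by the developers.
// Returns true only for http(s) URLs whose host is wikipedia.org or one
// of its subdomains (an assumption about what "Wikipedia's domain" means).
function isWikipediaPage(pageUrl) {
  let url;
  try {
    url = new URL(pageUrl);
  } catch (e) {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return false;
  }
  const host = url.hostname;
  // Compare the whole host or a dot-delimited suffix, so that look-alike
  // hosts such as wikipedia.org.evil.example are rejected.
  return host === "wikipedia.org" || host.endsWith(".wikipedia.org");
}
```

Matching on a dot-delimited host suffix rather than on a substring of the URL is the important design choice here: a substring test would accept a spoof page that merely embeds "wikipedia.org" somewhere in its address.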
Kaizou v-0.5.8: Kaizou is a Web development extension that allows users to open the source of any Web page in a separate window, modify the contents, and render it again in the current window by pressing a button. However, this separate window has chrome privileges, and when the user saves the changes made to the page source, the scripts in the page are executed with chrome privileges. A malicious Web page can contain an attack script, which would execute with chrome privileges when the page is modified using KAIZOU.
Mouse Gestures Redox v-2.0.3: The MOUSE GESTURES REDOX extension allows users to create shortcuts for frequently used commands without using keyboard, menu, or toolbars. The users can either create new gestures or download them from an online source. The new gestures are scripts, which are stored in the browser's preferences file. When the gestures are enabled, they are retrieved from the prefs.js file and sent as arguments to the eval() function, thereby activating the gestures. If any of the gestures downloaded from the internet contain attack scripts, they would be executed in the chrome context when eval is called.
5.4. Flows that do not result in attacks
Figure 8 gives several examples of the suspect flows that we manually analyzed and for which either trusted sources were assumed by the extension or we could not find attacks.
The first set has extensions accessing values from Web sites or sources it trusts, and the values flow to
innerHTML. Of course, if the trusted sources are compromised, then the extensions may become vulnerable. The second set illustrates examples where the input was sanitized between the source and the sink. We do not know for sure that the sanitization is adequate, but we were unable to attack it. The third set of extensions had non-chrome sinks. The last set has two examples that show false positives where the flows reported by VEX do not exist in the code.
Louw et al.12 highlight some of the potential security risks posed by browser extensions, and propose runtime support for restricting the interactions between browsers and extensions. Our analysis technique is complementary to their restrictions since even restricted interfaces can still be susceptible to security vulnerabilities.
We have presented VEX, a tool for detecting potential security vulnerabilities in browser extensions using static analysis. VEX helps in automating the difficult manual process of analyzing browser extensions, by identifying and reasoning about subtle and potentially malicious flows. Experiments on thousands of extensions indicate that VEX is successful at identifying flows that indicate potential vulnerabilities and greatly reducing the number of flows that must be vetted manually. Using VEX, we identified seven previously unknown security vulnerabilities and five known vulnerabilities, together with a variety of instances of unsafe programming practices.
An interesting future direction is to develop automatic ways to synthesize attacks that exploit flows reported by VEX. A technique based on constraint solving to generate attack inputs that satisfy the path constraints in the flow seems appropriate.
In the broader context, there is an increasing number of settings where small software teams (consisting of even one or two people) write software that is downloaded and used by hundreds of thousands of people. Browser extensions fall in this category, but several others have emerged, including mobile phone applications (for iPhone/Android/Windows) and Facebook applications. The teams writing this software do not always think about security carefully, leaving their users with potential privacy and integrity risks. We believe that precise static analysis tools, such as the one presented in this paper, combined with more precise and adaptable access control policies, can help address this security concern.
We thank Chris Grier and Mike Perry who directed us to the Firefox extension vulnerabilities. This research was funded in part by NSF CAREER award #0747041, NSF grant CNS #0917229, NSF grant CNS #0831212, grant N0014-09-1-0743 from the Office of Naval Research, and AFOSR MURI grant FA9550-09-01-0539.
1. ANTLR Parser Generator. http://www.antlr.org, 2008.
2. Bandhakavi, S., King, S.T., Madhusudan, P., Winslett, M. VEX: Vetting browser extensions for security vulnerabilities. In Proceedings of the 19th USENIX Conference on Security, USENIX Security '10 (Berkeley, CA, 2010), USENIX Association, 339–354.
3. Boodman, A. The Greasemonkey Firefox extension. https://addons.mozilla.org/en-US/firefox/addon/748, 2005.
6. Djeric, V., Goel, A. Securing script-based extensibility in web browsers. In Proceedings of the 19th USENIX Conference on Security, USENIX Security '10 (Berkeley, CA, 2010), USENIX Association, 355–370.
7. Freeman, N., Liverani, R.S. Exploiting cross context scripting vulnerabilities in Firefox (April 2010). http://www.security-assessment.com/files/whitepapers/Exploiting_Cross_Context_Scripting_vulnerabilities_in_Firefox.pdf
15. Maone, G. NoScript Firefox extension. http://noscript.net/
16. Ramalingam, G., ed. Programming Languages and Systems. In Proceedings of the 6th Asian Symposium, APLAS 2008 (Bangalore, India, December 9–11, 2008), volume 5356 of Lecture Notes in Computer Science. Springer, 2008.
17. Waterson, C. RDF in fifty words or less. https://developer.mozilla.org/en/RDF_in_Fifty_Words_or_Less (June 9, 2008).
A previous version of this paper was published in the USENIX Security Symposium, Aug. 2010.
©2011 ACM 0001-0782/11/0900 $10.00