Most current and historic problems in computer and network security boil down to a single observation: letting other people control our devices is bad for us. At another time, I will explain what I mean by “other people” and “bad.” For the purpose of this article, I will focus entirely on what I mean by control. One way we lose control of our devices is to external distributed denial-of-service (DDoS) attacks, which fill a network with unwanted traffic, leaving no room for real (“wanted”) traffic. Other forms of DDoS are similar—an attack by the Low Orbit Ion Cannon (LOIC), for example, might not totally fill up a network, but it can keep a Web server so busy answering useless attack requests the server cannot answer any useful customer requests. Either way, DDoS means outsiders are controlling our devices, and that is bad for us.
Surveillance, exfiltration, and other forms of privacy loss often take the form of malicious software or hardware (so, “malware”) that somehow gets into your devices, adding features like reading your address book or monitoring your keystrokes and reporting that information to outsiders. Malware providers often know more about our devices than we as users (or makers) do, especially if they have poisoned our supply chain. This means we sometimes use devices we do not consider to be programmable, but which actually are programmable by an outsider who knows of some vulnerability or secret handshake. Surveillance and exfiltration are merely examples of a device doing things its owner does not know about, would not like, and cannot control.
Because the Internet is a distributed system, it involves sending messages between devices such as computers and smartphones, each containing some hardware and some software. By far the most common way that malware is injected into these devices is by sending a message that is malformed in some deliberate way to exploit a bug or vulnerability in the receiving device’s hardware or software, such that something we thought of as data becomes code. Most defense mechanisms in devices that can receive messages from other devices exist to prevent data, which is expected to contain text or graphics or maybe a spreadsheet, from being promoted to code, meaning instructions to the device telling it how to behave (or defining its features). The failed promise of anti-virus software was that malware could be detected using pattern matching. Today we use anti-virus tools to clean up infected systems, but we know we cannot count on detecting malware in time to prevent infection.
So we harden our devices, to try to keep data outside from becoming code inside. We shut off unnecessary services, we patch and update our operating systems, we use firewalls to control who can reach the services we cannot shut off, we cryptographically sign and verify our code, and we randomize the placement of objects in program memory so if data from the outside does somehow become code inside our devices, that code will guess wrong about where it landed and so fail to hurt us. We make some parts of our device memory data-only and other parts code-only, so a successful attack will put the outsider’s data into a part of our device memory where code is not allowed to be executed.
We log accesses to our systems, hits on our firewalls, and flows on our networks, trying to calibrate the normal so as to highlight the abnormal. We buy subscriptions to network reputation systems so other devices known to be infected with malware cannot reach our services. We add CAPTCHAs to customer registration systems to keep botnets from creating fake accounts from which to attack us from inside our perimeters. We put every Internet-facing service into its own virtual machine so a successful attack will reach only a tiny subset of our enterprise.
Inviting the Trojan Horse Inside
And then, after all that spending on all that complexity for defense, some of us go on to install a Dynamic Content Management System (DCMS) as our public-facing Web server. This approach is like building a mighty walled city and then inviting the Trojan horse inside, or making Achilles invulnerable to harm except for his heel. WordPress and Drupal are examples of DCMSs, among hundreds. DCMSs have a good and necessary place in website management, but that place is not on the front lines where our code is exposed to data from the outside.
The attraction of a DCMS is that nontechnical editors can make changes or additions to a website, and those changes become visible to the public or to customers almost instantly. In the early days of the World Wide Web, websites were written in raw HTML using text editors on UNIX servers, which means all publication involved technical users who could cope with raw HTML inside a UNIX text editor. While I personally think of those as “the good old days,” I also confess the Web was, when controlled entirely by technical users, both less interesting and less productive than it is today. A DCMS is what enables the Web to fulfill the promise of the printing press: to make every person a potential publisher. Human society fails to thrive when the ability to speak to the public is restricted to the wealthy, to the powerful, or to the highly technical.
And yet, a DCMS is extremely dangerous to the operators who use it. This is because of the incredible power and elasticity of the computer languages used to program DCMSs, and the power and elasticity of the DCMSs themselves. A DCMS gives us a chance to re-fight and often re-lose the war between data on the outside and code on the inside. Most of the computer languages used to write Web applications such as DCMSs contain a feature called eval, where programming instructions can be deliberately promoted from data to code at runtime. I realize that sounds insane, and it sort of is insane, but eval is merely another example of how all power tools can kill. In the right skilled hands, eval is a success-maker, but when it is left accessible to unskilled or malicious users, eval is a recipe for disaster. If you want to know how excited and pleased an attacker will be when they find a new way to get your code to eval their data, search the Web for “Little Bobby Tables.”
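The “Little Bobby Tables” problem can be sketched in a few lines. This is only an illustration under stated assumptions: the students table, the helper names, and the use of Python's sqlite3 module are all hypothetical and not taken from any particular DCMS. The unsafe helper pastes outsider data into the text of the SQL program, while the safe helper passes it as a bound parameter that the database treats purely as data.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory database with a hypothetical students table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT)")
    conn.execute("INSERT INTO students (name) VALUES ('Alice')")
    conn.commit()
    return conn

def add_student_unsafe(conn, name):
    # The name is concatenated into the SQL text, so it becomes part of the code.
    # executescript is used because it runs every statement found in the string.
    conn.executescript("INSERT INTO students (name) VALUES ('" + name + "');")

def add_student_safe(conn, name):
    # The name is bound to a placeholder, so it stays data no matter what it contains.
    conn.execute("INSERT INTO students (name) VALUES (?)", (name,))
    conn.commit()

def table_exists(conn, table):
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row is not None

# Outsider-supplied "name" that smuggles in a second SQL statement.
HOSTILE_NAME = "Robert'); DROP TABLE students;--"
```

With the safe helper, the hostile string is stored as an odd-looking name. With the unsafe helper, the same string ends the INSERT statement early and the table is dropped.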
But even without eval in the underlying computer language used to program a DCMS or in the database used to manage that program’s long-term data, such as student records, most DCMSs are internally data driven, meaning that DCMS software is often built like a robot that treats the website’s content as a set of instructions to follow. To attack a DCMS by getting it to promote data to code, sometimes all that is needed is to add a specially formatted blog post or even a comment on an existing blog post. To defend a DCMS against this kind of attack, what is needed is to audit every scrap of software used to program the DCMS, including the computer language interpreter; all code libraries, especially OpenSSL; the operating system including its kernel, utilities, and compilers; the Web server software; and any third-party apps that have been installed alongside the DCMS. (Hint: this is ridiculous.)
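To make “data driven” concrete, here is a toy sketch of a renderer that treats directives embedded in content as instructions. Everything in it is an assumption made for illustration: the [[name:argument]] syntax, the two harmless directives, and the function names are invented and do not correspond to any real DCMS. The naive path expands directives found in visitor comments, which is exactly the promotion of outsider data to instructions described above. The stricter path escapes comments and leaves them inert.

```python
import html
import re

# Hypothetical directive syntax: [[name:argument]]
DIRECTIVE = re.compile(r"\[\[(\w+):([^\]]*)\]\]")

def run_directive(match):
    """Carry out one directive; unknown directives are left as plain text."""
    name, arg = match.group(1), match.group(2)
    if name == "upper":
        return arg.upper()
    if name == "repeat":
        return arg * 3
    return match.group(0)

def render_trusted(text):
    """Expand directives. Only appropriate for content written by trusted editors."""
    return DIRECTIVE.sub(run_directive, text)

def render_untrusted(text):
    """Treat visitor text as inert data: escape it and never expand directives."""
    return html.escape(text)

def render_page(body, comments, naive=False):
    if naive:
        # Mistake: comments are merged into the body before expansion,
        # so any visitor can issue directives to the renderer.
        return render_trusted("\n".join([body] + list(comments)))
    parts = [render_trusted(body)] + [render_untrusted(c) for c in comments]
    return "\n".join(parts)
```

The directives here are harmless on purpose. In a real system the equivalent feature might include files, run queries, or call plugins, which is why the set of places where content is interpreted has to be audited.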
Distributed Denial of Service
Let’s rewind from remote code execution vulnerability (the promotion of outsider data into executable code) back to DDoS for a moment. Even if your DCMS is completely non-interactive, such that it never offers its users a chance to enter any input, the input data path for URLs and request environment variables has been carefully audited, and there is nothing like Bash installed on the same computer as the Web server, a DCMS is still a “kick me” sign for DDoS attacks. This is because every DCMS page view involves running a few tiny bits of software on your Web server, rather than just returning the contents of some files that were generated earlier. Executing code is quite fast on modern computers, but still far slower than returning the contents of pre-generated files. If someone is attacking a Web service with LOIC or any similar tool, they will need 1,000 times fewer attackers to exhaust a DCMS than to exhaust a static or file-based service.
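The arithmetic behind the “1,000 times fewer attackers” claim can be worked through with made-up numbers. Only the factor of 1,000 comes from the text above. The capacity and per-attacker request rates below are hypothetical placeholders, not measurements of any real server or tool.

```python
import math

def attackers_needed(server_capacity_rps, attacker_rate_rps):
    """Smallest number of attackers whose combined request rate reaches server capacity."""
    return math.ceil(server_capacity_rps / attacker_rate_rps)

# Hypothetical figures, chosen only to make the ratio visible.
STATIC_CAPACITY_RPS = 50_000                         # requests/second served from files
DYNAMIC_CAPACITY_RPS = STATIC_CAPACITY_RPS // 1000   # the article's factor of 1,000
ATTACKER_RATE_RPS = 10                               # requests/second per attacking host

static_attackers = attackers_needed(STATIC_CAPACITY_RPS, ATTACKER_RATE_RPS)
dynamic_attackers = attackers_needed(DYNAMIC_CAPACITY_RPS, ATTACKER_RATE_RPS)
```

Under these assumptions the static service holds out until 5,000 hosts attack at once, while the dynamic one is saturated by 5.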
Astute readers will note that my personal website is a DCMS. Instead of some lame defense like “the cobbler’s children go shoeless,” I will point out the attractions of a DCMS are so obvious that even I can see them: I do not like working on raw HTML using UNIX text editors when I do not have to, and my personal Web server is not a revenue source and contains no sensitive data. I do get DDoS’d from time to time, and I have to go in periodically and delete a lot of comment spam. The total cost of ownership is pretty low, and if your enterprise website is as unimportant as my personal website, then you should feel free to run a DCMS like I do. (Hint: wearing a “kick me” sign on your enterprise website may be bad for business.)
At work, our public-facing website is completely static. There is a Content Management System (CMS), but it is extremely technical: it requires the use of UNIX text editors, a version control utility called Git, and knowledge of a language called Markdown. This frustrates our non-technical employees, including some members of our business team, but it means our Web server runs no code to render a Web object; it just returns files that were pre-generated using the Ikiwiki CMS. Bricolage is another example of a non-dynamic CMS, but it is friendlier to non-technical WYSIWYG users than something like Ikiwiki. Please note that nobody is DDoS-proof, no matter what their marketing literature or their annual report may say. We all live on an Internet that lacks any kind of admission control, so most low-investment attackers can trivially take out most high-investment defenders. However, we do have a choice about whether our website wears a “kick me” sign.
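The pre-generation idea can be sketched as a tiny build step. This is not how Ikiwiki or Bricolage work internally; it is a minimal illustration in which the page dictionary, the template, and the function names are all assumptions. Plain text stands in for Markdown so that the sketch needs only the standard library. The point is that all rendering happens once, at build time, and the Web server later returns the resulting files without running any code.

```python
import html
from pathlib import Path

# Hypothetical page template; a real generator would be far richer.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
<h1>{title}</h1>
{body}
</body>
</html>
"""

def render_html(title, text):
    """Turn blank-line-separated plain text into escaped HTML paragraphs."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join("<p>{}</p>".format(html.escape(p)) for p in paragraphs)
    return PAGE_TEMPLATE.format(title=html.escape(title), body=body)

def build_site(pages, out_dir):
    """Write one .html file per page and return the paths that were written.

    pages maps a slug such as "index" to a (title, text) pair.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, (title, text) in sorted(pages.items()):
        target = out / (slug + ".html")
        target.write_text(render_html(title, text), encoding="utf-8")
        written.append(target)
    return written
```

A call such as build_site({"index": ("Home", "Welcome.")}, "public") would be run by an editor or a version control hook, never in response to a visitor's request.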
There is a hybrid model, which I will call mostly static, where all the style sheets, graphics, menus, and other objects that do not change between views and can be shared by many viewers are pre-generated and are served as files. The Web server executes no code on behalf of a viewer until that viewer has logged in, and even after that, most of the objects returned on each page view are static (from files). This is a little bit less safe than a completely static website, but it is a realistic compromise for many Web service operators. I say “less safe,” because an attacker can register some accounts within the service in order to make their later attacks more effective. Mass account creation is a common task for botnets, and so most Web service operators who allow online registration try to protect their service using CAPTCHAs.
The mostly static model also works with Content Distribution Networks (CDNs), where the actual front-end server that your viewers’ Web browsers are connecting to is out in the cloud somewhere, operated by experts, and massively overprovisioned to cope with all but the highest-grade DDoS attacks. To make this possible, a website has to signal that static objects such as graphics, style sheets, and JavaScript files are cacheable. This tells the CDN provider that it can distribute those files across its network and return them many times to many viewers and, in case of a DDoS, many times to many attackers. Of course, once a user logs into the site, there will be some dynamic content, which is when the CDN will pass requests to the real Web server, and the DCMS will be exposed to outsider data again. This must never cease to be a cause for concern, vigilance, caution, and contingency planning.
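One common way to signal cacheability is the HTTP Cache-Control response header. The sketch below shows one possible policy. The extension list, the lifetimes, and the function name are assumptions picked for illustration, and a real site would tune them and check what its CDN provider actually honors.

```python
from pathlib import PurePosixPath

# Hypothetical set of file types treated as shared static objects.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".svg", ".woff2"}

def cache_headers(url_path, logged_in):
    """Choose a Cache-Control header for a response.

    Static objects are marked as cacheable by shared caches such as a CDN.
    Pages rendered for a logged-in viewer are marked as not to be stored.
    """
    suffix = PurePosixPath(url_path).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        return {"Cache-Control": "public, max-age=86400"}   # one day
    if logged_in:
        return {"Cache-Control": "private, no-store"}
    # Anonymous page views: cacheable, but only briefly, so edits show up soon.
    return {"Cache-Control": "public, max-age=300"}
```

Anything marked private and no-store still reaches the origin server on every request, which is the exposure to outsider data that remains in the mostly static model.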
As a hybrid almost-CDN model, a mostly static DCMS might be put behind a front-end Web proxy such as Squid or the mod_proxy feature of Apache. This will not protect your network against DDoS attacks as well as outsourcing to a CDN would do, but it can protect your DCMS’s resources from exhaustion. Just note that any mostly static model (CDN or no CDN) will still fail to protect your DCMS code from exposure to outsider data. What this means for most of us in the security industry is that static is better than mostly static if the business purpose of the Web service can be met using a static publication model.
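The resource-protection effect of a caching front end can be illustrated with a toy in-process cache. This is not a model of how Squid or mod_proxy are configured or implemented. The class name, the time-to-live value, and the backend callable are hypothetical stand-ins for “the proxy” and “the DCMS.”

```python
import time

class CachingFrontEnd:
    """Answer repeated requests from memory so the backend renders each path rarely."""

    def __init__(self, backend, ttl_seconds=60.0, clock=time.monotonic):
        self._backend = backend      # callable taking a path, returning a page body
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the expiry logic can be tested
        self._cache = {}             # path -> (expiry time, body)

    def get(self, path):
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: the backend is not touched
        body = self._backend(path)   # expired or missing: render once and remember
        self._cache[path] = (now + self._ttl, body)
        return body
```

A flood of requests for one page costs the backend a single render per time-to-live window. The sketch is deliberately naive: the cache is unbounded and keyed on the raw path, so an attacker who varies the path on every request would bypass it.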
So if you are serious about running a Web-based service, don’t put a “kick me” sign on it. Go static, or go home.