Well-funded groups in China are gathering sensitive information by breaking into U.S. government networks [7]. The extent of these intrusions and the nature of data exposed are not fully known, and are raising national security concerns. At the same time, well-organized criminals are targeting credit card numbers and other sensitive data via the Internet, creating major security and privacy concerns. For instance, in 2005, intruders gained unauthorized access to 40 million credit card numbers from CardSystems [10]. The increase in organized criminals, foreign governments, and non-state actors breaking into computer systems is raising the stakes of computer crime, and is compelling organizations to treat security breaches more seriously.
Network intrusions are among the most challenging kinds of computer crime to investigate, especially when dealing with sophisticated, highly motivated intruders. Given the dynamic nature of networks, investigators must act quickly to locate and preserve potential evidence before it is lost or altered, all without disrupting operations of the organization. Investigators also have a very compressed timeframe to answer complex questions, including what sensitive information was exposed, how and when the intruders gained access, and where they can be apprehended. To answer these questions, it is necessary to sift and correlate large amounts of data quickly in various formats from systems in multiple time zones.
Locating and preserving evidence is even more difficult when intruders are actively attempting to conceal or destroy evidence. In addition, when intruders use customized toolsets, the response and investigation may only be formulated as evidence of the intrusion is uncovered. Consequently, investigations of these intrusions are highly reactive, and out-of-the-box forensic products are generally insufficient—we must combine various existing tools and methods, and develop custom tools and solutions for the specific case. For instance, investigators may need to perform advanced program analysis and create antidotes—specialized scanning and monitoring tools that detect and counteract the intruder’s tools.
This article describes how computer security professionals and digital investigators can work together to respond more effectively to major security breaches, and focuses on current challenges, recent advances, and future needs.
How Sophisticated Intruders Operate
Following the path of least resistance, even sophisticated intruders gain entry to networks through widely known vulnerabilities. Generally they only need to exercise their technical sophistication to maintain a foothold in the compromised network, conceal their presence, and pilfer valuable data.
Careful intruders attempt to hide or remove evidence of an intrusion by deleting logs, altering date-time stamps, and installing their own utilities to subvert the operating system. Programs like Hacker Defender (hxdef.czweb.org) alter the kernel and return false information to system calls, rendering useless most tools that incident responders have traditionally used to examine a live system for signs of compromise. In addition to hiding suspicious files, processes, listening ports, and other signs of compromise from trusted utilities that incident responders run from a compact disk, these newer kernel rootkits can even subvert tools specifically designed to detect earlier generations of rootkits. In addition, tools are being developed specifically to make forensic examination more difficult (www.metasploit.com/projects/antiforensics/).
Increasingly, intruders are using strong encryption to cloak their activities by encrypting data before stealing it, encoding communications between compromised hosts, and obfuscating executables. More sophisticated intruders use covert channel techniques to conceal their malicious activities within legitimate-looking network activities, such as DNS or HTTP traffic.
Recovering evidence from compromised hosts is only half the battle. Locating the intruders is also becoming more challenging. Sophisticated intruders hide their location and work around firewall restrictions using time-activated backdoors that periodically “phone home,” initiating a connection from inside the victim network to a remote host that the intruder controls. Some of these backdoors create a tunnel through firewalls that the intruder can use to communicate with compromised hosts, even establishing a Windows Terminal Service session when this protocol is blocked by a firewall.
More sophisticated systems use encrypted packets and phone home techniques to maintain control over compromised hosts as shown in Figure 1. In this example, the backdoor program runs at regular intervals and obtains instructions from remote computers that are under the intruder’s control. Using one host (Repository 1) to store a configuration file that simply contains the IP address of the command and control (C&C) host enables the intruder to change the C&C host at any time. The intruder periodically updates the C&C host with new instructions and waits for the compromised host to phone home, obtain the instructions, and execute them. For instance, in one case the intruder first instructed the backdoor to deliver a full directory listing of the compromised host. Later, after reviewing the directory listing, the intruder instructed the backdoor to deliver certain files to another compromised host (Repository 2).
These layers of separation eliminate the need for the intruder to connect with compromised hosts directly, reducing the intruder’s risk of apprehension. Uploading instructions to the C&C host via a proxy can make it even more difficult for investigators to locate the intruder, but the intruder may connect to some systems directly, providing investigators with a solid lead.
Consider the difficulty of detecting these malicious network activities when they are designed to blend in with normal traffic and have encoded payloads. Even after investigators have detected and decoded the malicious traffic, it can be challenging to obtain all of the components of a C&C system that are spread across dozens of computers in dispersed jurisdictions. In addition, the purpose of each C&C system component may not be obvious, and reverse engineering is generally required to determine how the entire system functions.
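One way to look for the phone-home behavior described above is to test whether connections between a given pair of hosts recur at nearly regular intervals. The following is a minimal sketch under stated assumptions: flow records have already been reduced to (timestamp, source, destination) tuples, and the function name, record layout, and thresholds are illustrative rather than taken from any tool named in this article.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_beacons(flows, min_events=5, max_jitter_ratio=0.1):
    """Return (src, dst, mean_interval) for host pairs with near-regular timing.

    flows: iterable of (timestamp_seconds, src_ip, dst_ip) tuples (assumed layout).
    """
    times = defaultdict(list)
    for ts, src, dst in flows:
        times[(src, dst)].append(ts)

    suspects = []
    for (src, dst), stamps in times.items():
        if len(stamps) < min_events:
            continue  # too few connections to judge regularity
        stamps.sort()
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        # Low spread relative to the mean interval suggests a timer, not a person.
        if avg > 0 and pstdev(gaps) / avg <= max_jitter_ratio:
            suspects.append((src, dst, avg))
    return suspects
```

A backdoor that randomizes its schedule would evade so simple a test, so a result like this is a lead to investigate rather than proof of compromise.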
Conducting High-Stakes Digital Investigations
A multidisciplinary investigative team with a range of skills is usually needed to apprehend sophisticated intruders. The ideal investigative team has expertise in information security, digital forensics, penetration testing, reverse engineering, programming, and behavioral profiling.
A capable and experienced team will select the best approach for each given task, sometimes developing custom programs. In addition to technical expertise, it is important to involve people who have experience interacting with law enforcement and intelligence agencies in multiple jurisdictions and managing digital investigations.
Because of the complexity and fast pace of high-stakes investigations, record-keeping and case management are critical to keep track of information flowing in, and to update team members with new discoveries that may help them reach a clearer understanding of the intrusion. There is a need for better tools to manage investigative leads, tasks, and associated documentation, and to help the investigative team share information and generate reports as needed.
Evidence Preservation: The Need for Speed
A successful digital investigation is heavily dependent on the logging and backup systems an organization has in place, and how quickly sources of evidence are located and preserved. The process of identifying and preserving potential sources of evidence on a compromised network includes acquiring the contents of hard drives and physical memory on hosts, freezing various logs, and capturing network traffic.
Since organizations are often unprepared for forensic investigations, and investigators rarely have full knowledge of a victim network, the speed and completeness of this preservation effort is heavily dependent on information gleaned from interviews with the IT and security staffs of an organization, including the actual end users or administrators of the systems or monitoring devices in question. All of this collection is performed in a forensically sound manner to ensure that complete and accurate copies are obtained, and the authenticity of the evidence is documented for future reference. Proper documentation enables anyone to verify the evidence originated from its purported source, and the contents have not been altered.
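Documenting that a copy is complete and unaltered is commonly done by comparing cryptographic hashes of the source and the copy. The sketch below illustrates the idea only; the function names and the choice of SHA-256 are assumptions, and a real acquisition would also record who collected the evidence, when, and with what tool.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large disk images need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_copy(source_path, copy_path, algorithm="sha256"):
    """Return a small record documenting whether the copy matches the source."""
    src = file_digest(source_path, algorithm)
    dst = file_digest(copy_path, algorithm)
    return {"algorithm": algorithm, "source": src, "copy": dst, "match": src == dst}
```

Keeping the returned record with the case notes lets anyone recompute the hash later and confirm the evidence has not changed since acquisition.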
Although there is an increasing awareness of the need for digital forensics when dealing with critical security breaches, even some professional computer security consultants are ill-prepared from a forensic standpoint. In one high-stakes network intrusion case, the victim organization hired a firm with a solid reputation to help them respond to a security breach that had potentially exposed sensitive and proprietary defense industry data. The security consultants located systems that had been compromised, but failed to handle them in a forensically sound manner. For instance, rather than creating a forensic copy of a server the intruders had used as a C&C center on the victim network, the consultants booted the system and installed programs, including an unlicensed version of a forensic software application. These irresponsible actions cost the victim organization valuable time and resources, and caused untold loss of valuable evidence.
Specialized tools exist to preserve certain types of digital evidence, including hard drives, physical memory, and network traffic. To facilitate evidence gathering from live hosts on a network, remote forensic examination tools can provide access to many machines simultaneously from a central console [5]. These tools effectively bypass the operating system on live systems, and can search for and capture data from their hard drives and memory without shutting them down. One limitation of these remote examination tools is that they do not currently enable examiners to inspect the memory contents of a process or acquire a complete copy of memory. Because these tools only acquire disk contents, any data that has been added to a file currently open for editing may not be preserved. Therefore, when examiners need process memory or the full contents of memory, they must use other tools such as Helix (www.e-fense.com).
Preserving logs that may contain information relating to the intrusion can be quite challenging from a forensic standpoint because few logging systems are designed with evidentiary value in mind. Forensic specialists must apply the principles of evidence preservation creatively to each source of log data that an organization maintains, such as those from intrusion detection systems, routers, firewalls, authentication servers, and Web servers. Incomplete or inaccurate logs can be more harmful than helpful, diverting the attention of digital investigators without holding any relevant information. Such diversions increase the duration and raise the cost of such investigations, which is why it is important to prepare a network as a source of evidence [6, 12].
As the evidentiary value of log files becomes more important to organizations, more vendors will design their products with integrated forensic principles. Even as tools improve, organizations must identify their most valuable assets and develop a strategy to prepare the underlying systems from a forensic viewpoint. This type of knowledge management enables digital investigators to focus their initial efforts on the most important systems and quickly determine whether intruders gained access to critical systems.
Capturing network traffic can give investigators one of the most vivid forms of evidence, a live recording of a crime in progress [3]. This compelling form of digital evidence can be correlated with other evidence to build an airtight case, demonstrating a clear and direct link between the intruder and compromised hosts.
A common dilemma arises between the need for operational security and for monitoring network traffic and other evidence of ongoing intruder activities. When blocking intruder access to sensitive data eliminates a source of valuable evidence, investigators may decide to install a honeypot, enabling them to monitor intruder activities without exposing the target network to additional risk.
Advanced Analysis of Digital Evidence
Once a potential source of evidence has been preserved, digital investigators immediately begin dissecting it for information pertaining to the intrusion. A full description of the forensic examination of computers is beyond the scope of this article and is covered in various publications [2, 8, 11]. Forensic analysis methods relevant to high-stakes investigations are described here to give examples of challenges that currently exist.
In the traditional cat-and-mouse fashion, as intruders become more adept at concealing evidence, investigators are developing new techniques and tools to recover more evidence from computers and networks. Recently, tools for extracting information from physical memory dumps on Windows systems were developed in response to the Memory Challenge for the Digital Forensic Research Workshop (www.dfrws.org). By capturing the full contents of computer memory, investigators can now bypass rootkits and examine processes and their memory contents, including processes that would otherwise be hidden, as shown in the accompanying table.
Virtual memory can also be a rich source of information on compromised hosts, providing passphrases and remnants of data that was viewed and pilfered by intruders. Passphrases discovered in virtual memory may be useful for recovering data encrypted by the intruders or accessing their backdoors. Currently, no tools exist for interpreting the data structures in virtual memory files but the process of reconstructing memory structures as shown in the table may be expanded to extract additional information from the virtual memory on a system.
Names, timestamps, and MD5 hash values of files left by the intruder can be useful for locating similar signs of intrusion on other hosts that were not known to be compromised. Investigators also analyze these files to determine an intruder’s method of operation (MO). Complex systems such as those outlined in Figure 1 require in-depth analysis, including reverse engineering, to discover how they work and to uncover details that investigators can use to attribute the crime to an individual such as a unique nickname or passphrase. However, more sophisticated intruders take precautions to obfuscate distinctive characteristics in their tools, creating a need for enhanced “antidote” methods and tools to search a computer for customized intruder tools with minimal false positives. On a case-by-case basis, investigators develop specialized utilities to search in memory and on disk for distinctive signatures of the intruder’s tools, which is increasingly challenging as the C&C systems become more distributed and the footprint they leave on the compromised host shrinks.
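Searching other hosts for files with known MD5 values can be sketched in a few lines. This is an illustration under assumptions, not one of the case-specific utilities described above: the function names are hypothetical, and the set of known hashes would come from files already attributed to the intruder.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_known_files(root, known_md5):
    """Walk root and return paths whose MD5 is in known_md5 (a set of hex strings)."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                if md5_of(path) in known_md5:
                    hits.append(path)
            except OSError:
                continue  # unreadable file; a real tool would log this
    return sorted(hits)
```

Because a kernel rootkit can hide files from the operating system, a scan like this is more trustworthy when run against a forensic copy than on the live host. An exact hash match also fails as soon as the intruder changes a single byte of a tool, which is one reason more flexible signatures are needed.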
In addition to extracting evidence from compromised hosts, digital investigators correlate and search application and network-level logs and traffic for intruder activities. These data sources are searched repeatedly as the investigation progresses and investigators find new information, such as IP addresses, non-standard ports, or communication methods used by the intruders. Network-level logs are useful for a variety of purposes, including locating additional compromised computers, developing a timeline of events, and discerning patterns of behavior. Individual pieces of digital data may not be useful on their own, but patterns of behavior can emerge when the pieces of digital evidence are combined. An intruder might always strike at specific times or in particular ways, or may become more sophisticated over time. Discerning these patterns can be very challenging when digital data is involved because there is often a massive quantity of information, and the logs themselves may be unreliable. When dealing with a more skilled adversary, investigators should always examine the logs for signs of subterfuge such as hardware clock tampering, missing entries, or time gaps.
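A simple check for two of these signs, time gaps and timestamps that run backward, can be sketched as follows. The function name and the ten-minute threshold are assumptions chosen for illustration; an appropriate threshold depends on how busy the logging source normally is.

```python
from datetime import datetime, timedelta


def find_time_anomalies(timestamps, max_gap=timedelta(minutes=10)):
    """Flag backward jumps and unusually long silences in a log's timestamps.

    timestamps: datetime objects in the order the entries appear in the log.
    """
    anomalies = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur < prev:
            # Entries should not go back in time; the clock may have been reset.
            anomalies.append(("clock_went_backward", prev, cur))
        elif cur - prev > max_gap:
            # A long silence may mean entries were deleted or logging stopped.
            anomalies.append(("gap", prev, cur))
    return anomalies
```

Neither finding proves tampering, since a quiet period or a legitimate clock correction produces the same pattern, but each marks a place in the log that deserves a closer look.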
Investigators use a variety of tools to facilitate correlation, rapid searching, data reduction, and pattern detection. Some tools are specifically designed to combine a variety of log file formats into a single database that can be queried for specific time periods, IP addresses, and other characteristics. Although these applications are predominantly Security Information Management solutions, such as CS-MARS from Cisco and nFX from NetForensics, and are not designed with forensic principles in mind, they have some features that support digital investigations. In addition to normalizing a wide variety of logs, applications like nFX have codified expert knowledge that can automatically condense multiple related log entries into a single event, simultaneously performing data correlation and reduction functions. This type of automation and data reduction saves investigators valuable time, helping them identify suspicious activities and assess the significance by drilling down into the details of underlying logs.
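The general idea of combining logs into one queryable database can be illustrated with a small sketch. It does not describe how CS-MARS or nFX work internally; the table layout and function names are assumptions, and each input record is assumed to carry an ISO 8601 timestamp with an explicit UTC offset so that logs from different time zones can be aligned.

```python
import sqlite3
from datetime import datetime, timezone


def build_event_db(records):
    """Load (source, iso_timestamp, src_ip, dst_ip, message) records into SQLite.

    Timestamps are converted to UTC so logs from different time zones line up.
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE events "
        "(source TEXT, ts_utc TEXT, src_ip TEXT, dst_ip TEXT, message TEXT)"
    )
    for source, ts, src_ip, dst_ip, message in records:
        utc = datetime.fromisoformat(ts).astimezone(timezone.utc)
        db.execute(
            "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
            (source, utc.strftime("%Y-%m-%dT%H:%M:%SZ"), src_ip, dst_ip, message),
        )
    return db


def events_for_ip(db, ip, start_utc, end_utc):
    """Return events involving ip between two UTC timestamps, oldest first."""
    return db.execute(
        "SELECT ts_utc, source, src_ip, dst_ip, message FROM events "
        "WHERE (src_ip = ? OR dst_ip = ?) AND ts_utc BETWEEN ? AND ? "
        "ORDER BY ts_utc",
        (ip, ip, start_utc, end_utc),
    ).fetchall()
```

Storing every timestamp in one fixed-width UTC format means a plain string comparison sorts events chronologically, which keeps repeated searches simple as new IP addresses and time windows come to light.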
These tools also have some visualization capabilities such as link diagrams that can help investigators analyze large amounts of data and recognize patterns. Figure 2 demonstrates how a diagram of network-level logs relating to compromised hosts can help identify the attacker. The figure shows that all of the compromised hosts in this network intrusion were targeted by a single computer in the upper left of the link diagram. The links also show that some compromised hosts communicated with other remote computers around the time of the intrusion, providing investigators with leads to other systems that may have been involved.
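The question a link diagram like this answers visually, namely which remote computer touched every compromised host, can also be asked of the logs directly. The sketch below assumes flow records reduced to (source, destination) pairs and a known set of compromised addresses; the function name and data layout are hypothetical.

```python
from collections import defaultdict


def common_contacts(flows, compromised):
    """Return remote hosts that communicated with every compromised host.

    flows: iterable of (src_ip, dst_ip) pairs (assumed layout).
    compromised: set of IP addresses known to be compromised.
    """
    peers = defaultdict(set)  # remote host -> compromised hosts it touched
    for src, dst in flows:
        if dst in compromised and src not in compromised:
            peers[src].add(dst)
        elif src in compromised and dst not in compromised:
            peers[dst].add(src)
    return sorted(h for h, touched in peers.items() if touched == set(compromised))
```

Shared infrastructure such as a DNS server or mail relay will also appear in the result, so investigators would normally exclude known internal services before treating a common contact as the likely attacker.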
More sophisticated analysis capabilities useful for analyzing log files include data modeling and 3D visualization features to facilitate pattern recognition (for example, Starlight), and analysis algorithms such as n-gram analysis that can be useful for isolating patterns or anomalies in large datasets (for example, eTrust Network Forensics).
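As a rough illustration of n-gram analysis in the general sense, the sketch below builds a profile of character n-grams from strings that represent normal activity and scores new strings by how many of their n-grams were never seen. It is not a description of how eTrust Network Forensics implements the technique; the function names and the choice of three-character n-grams are assumptions.

```python
from collections import Counter


def ngrams(text, n=3):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_profile(baseline, n=3):
    """Count n-grams across strings that represent normal activity."""
    profile = Counter()
    for item in baseline:
        profile.update(ngrams(item, n))
    return profile


def novelty(text, profile, n=3):
    """Fraction of the text's n-grams never seen in the baseline (0.0 to 1.0)."""
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    unseen = sum(1 for g in grams if profile[g] == 0)
    return unseen / len(grams)
```

Entries that score close to 1.0 against a baseline of ordinary requests, such as encoded payloads, stand out for manual review, although the usefulness of the score depends heavily on how representative the baseline is.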
Digital investigators are continually seeking more effective ways to process and visualize network logs to identify suspicious activities and recover data from covert, encoded network traffic.
Apprehending Sophisticated Intruders
Understanding the intruder’s MO can provide investigators with useful information, such as the location of additional sources of evidence, the skill level of the intruders, their knowledge of the victim network, and their motives. Knowing that an intruder is highly skilled may lead investigators to search for concealed evidence or covert channels that would otherwise be overlooked. An assessment of the types of systems being targeted by an intruder and the types of information being stolen can help investigators differentiate between someone who knows exactly what they are looking for and where to find it, versus someone who is looking for anything valuable they can find. For example, network logs and file system activities may show the intruder poking around many systems for items of interest. This exploratory behavior implies that the individual does not have much prior knowledge of the network and may not even know what he or she is looking for but is simply prospecting. Conversely, when a thief only targets the financial systems on a network, this directness suggests the intruder is interested in the organization’s financial information and knows where it is located [4]. When investigators determine an intrusion required inside knowledge of the victim network they may be able to narrow the suspect pool to a certain group of people within the organization.
Although “hacking back” may enable investigators to identify the intruders, this activity is risky because it may miss or alter evidence, alert the intruders, and break the law. Hacking back involves connecting to compromised hosts in other regions that the intruders are using. Entering a remote system without authorized access can be problematic from a legal standpoint since U.S. law bars intentional unauthorized access, no matter the motive. The Russian government initiated criminal proceedings against an FBI agent who was building a case against two hackers—Vasily Gorshkov and Alexey Ivanov—because the agent logged into the suspects’ Russian computers from the U.S. [1]. If the investigators do not use adequate remote forensics tools, there is also a risk a rootkit could be running on the remote system, hiding critical information.
It is important to realize that digital evidence is usually just one component of a solid investigation. Private investigators can help build a case by conducting surveillance, pretext communications, covert online research, and interviews. Behavioral profilers can help develop leads and assess dangers, and can instruct digital investigators to look for specific intruder behaviors. For instance, the perpetrator in one online extortion case was difficult to apprehend because he connected to the Internet via unsecured Wi-Fi wireless networks. In addition, he detected and disabled a Web bug that investigators set up to track him.
Ultimately, forensic specialists collaborated with a behavioral profiler to narrow the suspect pool and subsequently involved private investigators to perform surveillance of the suspects until the offender was caught in the act [9]. There is a need for tools that codify specific patterns of behavior unique to an intruder, and automatically find those patterns in large volumes of correlated log files and network traffic.
Conclusion
The ability to apprehend sophisticated intruders depends in large part on the ability of investigators to follow the cybertrail left by the culprits. Currently, intruders are exploiting the general lack of forensic readiness of our networks by relaying their traffic through multiple networks to obfuscate their cybertrails, effectively hiding behind a cloak of ill-prepared networks.
Integration of forensic principles into security tools will improve our ability to conduct network investigations. In addition, there is a need to improve training, tools, techniques, and liaison and intelligence gathering to help investigators determine when the intruders gained access, their method of operation, what sensitive information was exposed, their intent, and where they can be apprehended or at least mitigated.
Figures

Figure 1. Example of intruder command, control, communications, and concealment system depicted using Analyst’s Notebook (provided by Jessica Reust).

Figure 2. Link diagram generated with eTrust Network Forensics from NetFlow logs, showing that all compromised hosts in this network intrusion were attacked by the computer in the upper left (provided by Mark Winston).

Table. Process list extracted with memparser from a physical memory dump of a compromised Windows system. Processes hidden by the Hacker Defender rootkit are highlighted in red for illustrative purposes.


