Communications of the ACM

Digital village

Wading Into Alternate Data Streams


The open-ended nature of ADSs makes them an extremely powerful Windows resource worthy of deeper exploration.

The concept of non-monolithic file systems is not new. "File forks" were an integral part of the original Macintosh Hierarchical File System. Apple separated file information into data and resource forks, where the resource fork contains a resource map followed by the resources, or links to resources, needed to use the data. Typical resources would include program code, icons, font specifications, last location of application window, and other file-related metadata that is not stored in the disk catalog (for example, file signatures). File forks may be null, and many files contain both forks, the accidental separation of which produces the dreaded "error 39." The Macintosh Resource Manager could control up to 2,700 resources per file, though the linear search, single-access control strategy makes managing more than a few hundred resources cumbersome and slow. File forks remained in use until Apple introduced OS X. Built on a Unix-based kernel, OS X stores resources in separate .rsrc files.

Microsoft introduced non-monolithic file structures in its New Technology File System (NTFS) along with the Windows NT operating system. Since the loss of a non-null resource fork renders the data fork useless, Microsoft had to accommodate non-monolithic files for compatibility with Macintosh files and AppleTalk. Among the many innovations in NT, Microsoft ported non-monolithic file structures from the Macintosh world over to the PC. In Microsoft parlance, non-monolithic file structures are called file streams.

Alternate Data Streams

One may make reasonable comparisons between the Macintosh data and resource forks and the Microsoft primary and alternate data streams, respectively. In NTFS the primary data stream (aka default data stream or unnamed data stream or, simply, file) is called $DATA. A large number of alternate data streams (ADSs) may be associated with a single primary data stream (PDS). We emphasize that ADSs are associated with, and not attached to, primary data streams. The associations are maintained in the Master File Table (MFT), and managed by a variety of application program interfaces (APIs). As a simple illustration, a right mouse click on any NTFS file and subsequent selection of the properties tab will recover ADS metadata through a standard Windows MFT API.

Microsoft's approach to non-monolithic file structures has some unusual properties:

  • Anything digital can become an ADS: thumbnails and icons, metadata, passwords, permissions, encryption hashes, checksums, links, movies, sound files, binaries, whatever. We'll return to the "whatever" in a moment.
  • NTFS supports a very large number of ADSs. The Purdue COAST project estimated an upper bound of 6,748 ADSs per primary data stream on Windows NT 4 (see www.cerias.purdue.edu/coast/ms_penetration_testing/v50.html).
  • Since ADSs are separate files, they appear hidden from applications that call routine Windows APIs. The use of ADS is completely transparent to standard Windows file management tools; it's as if they don't exist. This includes such key system applications as Windows Explorer and Task Manager, and core command line executables like DIR.
  • The normal data stream ADS API syntax in the Windows NT/2000/XP series is <filename>:<ADSname>:<ADStype> (where ADStype is optional).
  • ADSs form a tree structure that has a maximum depth of 1. The decision to prevent nesting of ADSs seems to us to be an unnecessarily arbitrary and shortsighted limitation.
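To make the <filename>:<ADSname>:<ADStype> syntax concrete, here is a minimal Python sketch that splits such a specifier into its three parts. The names StreamSpec and parse_stream_spec are our own illustrative inventions, not part of any Windows API, and the parsing rules are a simplification that assumes stream separators appear only in the last path component.

```python
from typing import NamedTuple, Optional


class StreamSpec(NamedTuple):
    filename: str                # path of the primary data stream
    stream: str                  # ADS name; "" means the unnamed (primary) stream
    stream_type: Optional[str]   # e.g. "$DATA"; None when omitted


def parse_stream_spec(spec: str) -> StreamSpec:
    """Split '<filename>:<ADSname>:<ADStype>' into its parts (illustrative only)."""
    # Skip a leading drive designator such as 'C:' so that its colon is not
    # mistaken for a stream separator. A one-letter file name is ambiguous
    # here, which is why the examples in this article write '.\test.txt'.
    start = 2 if len(spec) >= 2 and spec[0].isalpha() and spec[1] == ":" else 0
    # Only look for stream separators after the last directory separator.
    last_sep = max(spec.rfind("\\"), spec.rfind("/"))
    start = max(start, last_sep + 1)
    head, tail = spec[:start], spec[start:]
    parts = tail.split(":")
    if len(parts) > 3:
        raise ValueError(f"too many ':' separators in {spec!r}")
    stream = parts[1] if len(parts) > 1 else ""
    stream_type = parts[2] if len(parts) > 2 else None
    return StreamSpec(head + parts[0], stream, stream_type)
```

Under these assumptions, parse_stream_spec(r".\test.txt:d.exe") separates the primary file name .\test.txt from the stream name d.exe and leaves the optional type unset.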

The best way to get a feel for ADS is with some hands-on experience. Our example shown in the sidebar here may seem clumsier than necessary at first, but it has the benefit of simplicity because the entire demonstration can be completed within a single DOS command prompt window.

Phishing and Executable Streams

As previously stated, ADSs may contain anything: text, images, sound and video files, anything. The most interesting type of "anything" is the binary executable. Presuming readers of this column have completed the example in our sidebar and are up for the challenge, we'll now attach an executable to an empty text file. In this case we're assuming the Windows XP system calculator is stored in the location C:\windows\system32\calc.exe. If not, create a path to the harmless executable of choice.

We now rename <calc.exe> as the ADS, <d.exe>, and associate it with the empty text file <test.txt>:

  • C:\...\test>type c:\windows\system32\calc.exe > .\test.txt:d.exe

and execute the ADS directly:

  • C:\...\test>start .\test.txt:d.exe

You should see the Windows calculator appear on your screen. A directory listing still reveals only primary data streams. The executable is entirely hidden, but very much there, and obviously active, as can be confirmed by looking at the active processes listing under Windows Task Manager by pressing the <CTRL><ALT><DEL> keys simultaneously (see Figure 2).

It is apparent that with minimal effort one can sufficiently mask the hidden executable so that its function is obscured. Of course, masking an executable by just changing the filename is neither clever nor particularly deceptive, so a hacker might add the requisite stealth by invoking the native Windows Scripting Host (WSH) with the control parameters set to execute files with non-executable file extensions. In this way one could rename <trojan.exe> as something innocuous like <help.fil>, and execute it with WSH.

It is interesting to note that prior to Windows XP, the ADS didn't even appear in the process listing. Had we hidden the ADS behind something innocuous like <cmd.exe> or <notepad.exe>, the execution of the ADS would be undetected.

The hiding of the function behind an innocuous-appearing executable is akin to Internet scams where the unsuspecting are lured to spoofed Web sites that appear harmless while actually harvesting personal or private information, a technique called "phishing." For lack of a better term, we may think of planting hostile executables in ADS as "file phishing": creating an environment in which things are not as they appear.

Before we proceed, let's clean up the data streams, directories, and files on your computer. ADSs can only be deleted if their associated primary data stream is deleted, so once you're done experimenting, erase all of the files in your test directory, go up one directory and remove the directory itself, or simply erase the entire <test> directory with Windows Explorer.

At this point you're back where you started, no worse for wear.

NTFS Master File Tables

To understand ADS, one must investigate the way the Windows MFT record works. The MFT is a relational database in which the records correspond to files and the columns to file attributes. Following 16 records of metadata files, there is at least one MFT record for each file and folder (subdirectory) on all hosted disk volumes. All records are 1KB in size, allowing the data and attributes of very small files or folders to be stored in an MFT record itself. Larger files and folders are referenced by pointers within their records. Folders are externally organized as B-trees.
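The resident-versus-external distinction can be pictured with a toy model. The Python sketch below is not the real MFT layout: ToyRecord, store, the 256-byte overhead default, and the integer "pointer" are all hypothetical stand-ins chosen for illustration. Only the 1KB record size comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

RECORD_SIZE = 1024  # bytes per MFT record, as stated in the text


@dataclass
class ToyRecord:
    name: str
    resident: Optional[bytes] = None  # data kept inside the record itself
    pointer: Optional[int] = None     # hypothetical reference to external storage


def store(name: str, data: bytes, overhead: int = 256, next_free: int = 0) -> ToyRecord:
    """Keep small data in the record; otherwise record only a pointer.

    'overhead' is an assumed number of bytes taken up by the record's other
    attributes (names, time stamps, and so on); the real figure varies.
    """
    room = RECORD_SIZE - overhead
    if len(data) <= room:
        return ToyRecord(name, resident=data)
    # Too large for the record: the data would live elsewhere on the volume.
    return ToyRecord(name, pointer=next_free)
```

In this model a 100-byte file ends up entirely inside its record, while a 5,000-byte file is represented only by a pointer, which is the behavior the paragraph above describes.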

Each record contains basic file attributes relating to date and time stamps, permissions, filename and extension, security descriptors, and so forth. The utility of the MFT results from the fact that it is set up to automatically use external pointers for all data that cannot fit within the 1KB record. Having this in place allows virtually unrestricted, scalable control over file management. All data streams (regardless of whether they're primary or alternate) maintain all of the necessary information for the Windows APIs to manipulate them: for example, allocation size, actual data length, whether they're compressed or encrypted, and so forth. Given this situation, the proliferation of data streams involves little more than the proliferation of pointers within the MFT. The only mitigating factor is that the ADSs cannot be nested, which means any ADS file organization beyond a one-level deep tree would have to be accomplished at the applications layer.
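One way an application might layer a deeper hierarchy on top of the flat, one-level ADS namespace is to encode a path in each stream name. The following Python sketch assumes a "." separator and the helper names flatten, unflatten, and build_tree; all of these are our own illustrative choices, not a Windows convention.

```python
from typing import Dict, Iterable, List, Sequence

SEP = "."  # hypothetical separator chosen for this sketch


def flatten(path: Sequence[str]) -> str:
    """Encode a nested path as a single flat stream name."""
    for part in path:
        # ':' is the stream delimiter itself, so it cannot appear in a name.
        if not part or SEP in part or ":" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return SEP.join(path)


def unflatten(stream_name: str) -> List[str]:
    """Recover the nested path from a flat stream name."""
    return stream_name.split(SEP)


def build_tree(stream_names: Iterable[str]) -> Dict[str, dict]:
    """Rebuild the application-level tree from a flat list of stream names."""
    tree: Dict[str, dict] = {}
    for name in stream_names:
        node = tree
        for part in unflatten(name):
            node = node.setdefault(part, {})
    return tree
```

For example, flatten(["meta", "thumbs", "small"]) yields the single stream name meta.thumbs.small, and build_tree reassembles the hierarchy from whatever stream names the application finds on a file. NTFS itself still sees only a flat list.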

As we indicated, the low-level Windows APIs (such as CreateFile, DeleteFile, ReadFile, WriteFile) were designed to treat all data streams the same, ADSs or PDSs. Under NTFS, ADS support is completely transparent at that level. The weakness is that the higher-level utilities (DIR, Windows Explorer, Task Manager) were not intended to support ADS. This is where the truly baroque nature of Microsoft's ADS design logic makes itself known. One can use the low-level APIs to manipulate ADSs easily, but the higher-level utilities conceal their presence. From the end user's perspective, it's hard to delete a data stream without first being able to find it! Fortunately there are third-party utilities such as Lads, ScanADS, Streams, and Crucial that help locate ADS by working directly with the low-level APIs (especially, the "backup_" functions). Figure 3 illustrates their use on our <test> directory after we completed the experiment described previously. Note that Streams requires a separate test for folder ADS (remove the "-s" parameter). Crucial has a GUI and scans only entire drives, so it is not shown here.
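Because the low-level file APIs accept stream names directly, any language whose file routines pass names through to them can read and write an ADS. The Python sketch below relies on that: on Windows, Python's built-in open() hands the name to the underlying file-creation call, so "file:stream" addresses a stream on an NTFS volume. The helper names are hypothetical. Note the assumption about platform: on a non-NTFS file system the same code merely creates an ordinary file whose name contains a colon (or fails), so the hiding behavior described in this article will not be observed there.

```python
import os
import tempfile


def stream_path(path: str, stream: str) -> str:
    """Build the '<filename>:<ADSname>' form used throughout this article."""
    return f"{path}:{stream}"


def write_stream(path: str, stream: str, data: bytes) -> None:
    """Write data to the named stream associated with 'path'."""
    with open(stream_path(path, stream), "wb") as f:
        f.write(data)


def read_stream(path: str, stream: str) -> bytes:
    """Read back the contents of the named stream."""
    with open(stream_path(path, stream), "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        primary = os.path.join(folder, "test.txt")
        with open(primary, "wb") as f:
            f.write(b"primary data")
        write_stream(primary, "note", b"hidden text")
        print(read_stream(primary, "note"))
        # On NTFS, these two calls are expected to report only the primary
        # data stream; elsewhere the colon-named file shows up separately.
        print(os.path.getsize(primary))
        print(os.listdir(folder))
```

Run on an NTFS volume, the directory listing and file size printed at the end should reflect only the primary data stream, mirroring what DIR shows in the command-prompt experiment.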

Security Implications of ADSs

A Google search for the phrase "alternate data streams" will yield several thousand hits, most of an alarmist nature. This is unfortunate in many ways, because the power of ADSs has yet to be realized. While it is true that there is malware that takes advantage of ADSs (W2k.stream is one example), that malware has not proven to be as destructive as the more mainstream varieties that rely on buffer overflows, NetBIOS and RPC vulnerabilities, session hijacking, or address spoofing. As a data point, all W2k.stream threat vectors were assessed "low" by Symantec (see www.sarc.com/avcenter/venc/data/w2k.stream.html).

What created most of the alarm was the "hidden" nature of ADS combined with the absence of Microsoft utilities that supported direct access and control within native file utilities. But then, that wasn't why Microsoft built ADS into NTFS in the first place. The mere mention of a hidden feature to anyone with even a slight anti-Microsoft bias is guaranteed to produce an animated response. Unfortunately, Microsoft added fuel to the fire by failing to include a "display ADS" checkbox as a Windows Explorer configuration option and direct ADS control at the utility level. Most users wouldn't be bothered with ADS management, but full file disclosure would have been comforting to those prone to anxiety attacks.

The facts are less alarming than the several thousand Google hits might lead us to believe. While it is true that ADS could be used as a conduit for malware executables, so can email attachments. Further, modern anti-virus scanners routinely scan for malware in all Windows data streams, including ADS, so the risk of intrusion should be no greater with ADS than PDS.

The same is true for covert channeling. ADS could be used for that purpose, but so could the options field in a normal ICMP packet. With the ease that malware such as Loki conducts encrypted covert data channeling at the IP level, why would a hacker become involved with the applications layer?

The claim that ADSs are difficult to delete is equally misleading. ADS file space gets reallocated in just the same way that PDS and directory space does. File-wiping products such as Cyberscrub (www.cyberscrub.com) even include ADS "scramblers" for extra safety.

By any reasonable measure, ADS vulnerability has been overstated.

Conclusion

Alternate Data Streams have demonstrated considerable potential in object-oriented OSs and application environments, or those that involve complex file and data structures. While Microsoft is officially committed only to the Object Linking and Embedding (OLE) 2.0 model of structured storage, ADS will likely remain with us as long as Windows OSs continue to use NTFS. To quote Microsoft:

"Alternate data streams are strictly a feature of the NTFS file system and may not be supported in future file systems. However, NTFS will be supported in future versions of Windows NT. [including Windows 2000 and XP] Future file systems will support a model based on OLE 2.0 structured storage (IStream and IStorage). By using OLE 2.0, an application can support multiple streams on any file system and all supported operating systems (Windows, Macintosh, Windows NT, and Win32s), not just Windows NT." (See the Microsoft Knowledge Base article "HOWTO: Use NTFS Alternate Data Streams" (number 105763), available at support.microsoft.com/default.aspx?scid=kb;en-us;105763.)

There is no question that ADSs are underutilized in Windows. Like previous major software houses, Microsoft felt compelled to opt in favor of backward compatibility. For example, using <desktop.ini> files to parse the contents of a directory and <.tmp> files to hold transitory data seems retrogressive at best when ADS could have accomplished the same thing in a far more straightforward manner. After all, neither file type has any meaning apart from the associated directory or primary data stream anyway, so using ADS is the natural way to handle them. But that would have meant that such information could not be shared with Windows platforms with FAT16 and FAT32 file systems. To hobble the OS is less costly than dealing with 40 million additional hits at the help desk.

But the most unfortunate aspect of ADS is that the negative press and exaggerated claims of vulnerability have polluted the waters to such a degree that the true potential of ADS may never be realized.

Authors

Hal Berghel (www.acm.org/hlb) is a professor and director of the School of Computer Science and director of the Center for Cybermedia Research at the University of Nevada, Las Vegas.

Natasa Brajkovska (natasa@crlmail.i2.nscee.edu) is a research assistant at the Center for Cybermedia Research at the University of Nevada, Las Vegas.

Figures

Figure 1. Recovering hidden Alternate Data Streams.

Figure 2. Windows Task Manager's report of the Windows calculator executing as the Alternate Data Stream <test.txt:d.exe>.

Figure 3. Typical ADS reporting utilities at work.


©2004 ACM  0002-0782/04/0400  $5.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2004 ACM, Inc.


 
