Offline Management in Virtualized Environments

How to run virtual machines together with physical machines, especially when sharing computational resources.

Virtualization is prolific and heterogeneous, but, despite delivering unprecedented efficiency and dynamism, is also a challenge for traditional IT management tools, techniques, and processes. The problem is multifaceted; we want a common comprehensive management environment that leverages existing tools, applications, and IT management processes in both physical and virtual environments. However, IT management processes do not always work as expected in virtualized environments due to fundamental differences in the operation of physical and virtual machines.

Key Insights

  • Every offline virtual machine left unpatched is a potential threat when brought online for patching.
  • Virtual machines do not have to be online to be managed.
  • Using offline management techniques can improve management of virtual environments significantly.

Though all IT management functions are affected by virtualization, operations management deserves special attention, as it bears most of the load. Operations management must address both physical and virtual environments, and management of the virtualized environment must be as effective as management of the physical environment. IT management needs information, including device discovery, device inventory, software inventory, and operating system inventory. Together, this data defines the information perimeter, within which IT management principles, processes, and tools are applied, forming the IT management perimeter. This article examines these processes and proposes a new complementary IT management model.

Traditional physical IT management tools depend on close alignment between the management perimeter and the information perimeter; for example, in the physical world, device inventory information (such as MAC address, RAM, and HDD configuration) resides in the machine itself, deep in the operating system, and is typically read only through agent-based techniques that understand operating system interfaces. Remote technologies (such as Windows Management Instrumentation, or WMI) still need a service to run on the target. Some technologies (such as Intel Active Management Technology) use a management stack that is part of the hardware platform. The information perimeter and IT management perimeter overlap in physical environments (see Figure 1). In this arrangement, the information perimeter acts more like a barrier: an agent is needed on the inside to get management data to the outside and to manipulate machine data on the inside. That is, IT management tools must be able to run on physical machines.

Traditional IT management tools follow two implicit assumptions:

Confined to hardware. The information required for IT management is confined to the computer hardware on which it runs; and

Only when running. The information is available only when the computer is running.

These assumptions force IT management tools to be agent-based; even technologies purporting to be agentless require some kind of service to be running on the managed computer (such as WMI).

In typical environments such systems follow a client-server-based architecture in which a management server does back-end work (such as storing and analyzing inventory information and correlating with the services provided to enterprise IT users). Agents run on managed systems to do the real work of inventory collection, deployment of software and patches, scanning systems for viruses, and backing-up data.

The model is successful for two reasons:

Rarely available. The information collected by the agent is rarely available outside the physical boundaries of machines; and

Distributed. Processing is distributed to individual machines, reducing load on servers and promoting scalability.

Applying Traditional Tools

Many IT management vendors take a sensible approach in which physical and virtual computers are managed together in a similar way, extending their IT management products to understand and exploit virtualization while allowing their customers to apply the same or similar policies and operations to physical and virtual resources; for example, CA Client Automation extends its physical software deployment capabilities to address the virtual environment (such as managing virtual applications and virtual desktop infrastructures). This means new virtual-delivery options can be adopted while still maintaining the management paradigm already understood in the physical environment. CA Server Automation takes a similar approach regarding machine provisioning, bare-metal provisioning of physical servers, virtualization servers, and provisioning of virtual machines all under one roof.

Some enhancements are based on low-level hypervisor APIs that collect information (such as the virtual machine instance and the host machine that runs it), but for all other operations management activities (such as software asset management, security, and compliance) the traditional physical model is reused within the virtualized environment, usually with little or no modification. The middle component in Figure 1 labeled “Virtualized Environment” shows the information perimeter surrounds the virtualization server/host, though IT management tools are forced to run on the virtual machine instance. The advantage for vendors is they require no extensive effort to make their tools virtualization ready.

Not having to develop anything further represents a short-term advantage but exposes (in the long term) gaps that could be critical from the perspective of IT management; Table 1 lists aspects of operations management and their complications in virtualized environments.

Traditional patching, antivirus, and vulnerability management tools are agent-based and require managed machines to be running for proper maintenance. This can be a problem in virtualized environments where virtual machines are not expected to be online all the time. Moreover, patching a running virtual machine does little good unless the stored image of the virtual machine is also patched, which is not always the case with current technologies (such as nonpersistent virtual machines and linked clones).

Traditional patching of offline virtual machines includes virtual machine power on and snapshot commits. The resources needed for this activity can be nontrivial, especially when a large number of virtual machines must be patched, since much of the flexibility of virtualized environments derives from deploying many lightweight virtual machines with specific purposes that run when needed, rather than fewer physical machines that run constantly [7].

The few commercially available products include the Virtual Machine Servicing Tool [10] and VMware vCenter Protect Essentials Plus, formerly Shavlik NetChk Protect [12], which service offline virtual machine images. However, neither works in a truly offline fashion; for example, the Microsoft Offline VM Servicing Tool wakes up an offline virtual machine in a guarded environment, applies patches, and services the virtual machine, then returns the updated image to the library. VMware’s offering is quasi-offline; patch files are inserted into the offline image, while the actual patching is done when the virtual machine is powered up. In dynamic cloud environments, where virtual machines are provisioned in response to disaster recovery or load balancing, the startup delay is unacceptable.

Virtualization is a major enabler of cloud computing, and administrators must address how to update thousands of virtual machine images.

Given the advantages and ease of deploying virtual machines, the rate of virtual machine provisioning could outpace the rate at which security patches are applied (see Figure 2). As a result, offline virtual machines may lag behind on security updates unless updated tools are deployed specifically to address offline virtual machines.

Virtualized Environment is Different

Virtualized environments add architectural complexity, and virtual systems allocate and use resources differently, more dynamically and less transparently. They have different operational and performance profiles, as well as more fluid configurations. IT management must operate both virtually and physically, and when virtual machines are able to move between physical machines in response to load and resource requirements, IT management based on physical boundaries is awkward, if not impossible.

The fundamental difference between virtual and physical machines is that while a physical computer and a virtual computer are both digital objects, the data comprising them is far less accessible and far more heterogeneous when working with a physical computer. Physical computer data is stored in volatile memory, as well as in BIOS ROMs, on peripheral cards, and on various vendors’ hard drive products. Most physical computer configuration data (such as software, accounts, users, groups, patches, services, packages, registry keys, MD5s, and configuration files) is accessible only through code executing within the address space of the machine itself, and almost all physical computer data is available only when the computer is running, since it is buried deep inside platform-specific configuration files and structures. Meanwhile, all virtual computer data is stored in a single file or multiple linked files; in cloud environments these files are typically stored in storage fabric. Nevertheless, virtual machines are not simply data objects to manipulate but actual computers with real workloads and to a large extent the same management requirements as physical systems.

Unlike physical systems, virtualized environments change dynamically and for the most part are server-centric, or composed of digital objects residing and executing on physical servers. The information perimeter may not necessarily be tightly associated with the hardware resource on which it runs. So, for a virtualized environment (unlike a physical environment) IT management does not have to be executed within the running machine.

Offline Operations Management

Virtualization technology is file-based, with files typically available at a central location for ready access by IT management tools. One aspect of file-based technology is that if the file format is known to an IT management tool, the tool should be able to interpret, extract, and update the “data” that is the virtual machine.

Virtualization vendors agree for the most part on interoperable and/or open virtual file formats. The Distributed Management Task Force (DMTF; http://dmtf.org/) released the Open Virtualization Format, or OVF [3], specification now being adopted by major virtualization vendors (such as Citrix, Microsoft, and VMware) and accepted in August 2010 as an American National Standards Institute (ANSI; http://ansi.org/) standard. In addition, other proprietary file formats (such as the Virtual Machine Disk Format, or VMDK [11], and Virtual Hard Disk Image Format, or VHD [9]) store virtual machines as monolithic files or in multiple layers. These files may be mounted and accessed by external tools and utilities that understand them. That is, the information perimeter once viewed as a barrier becomes a standardized enabler.
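To make the point about known file formats concrete, the following minimal sketch reads an OVF descriptor, which is an XML document, without starting any virtual machine. The sample descriptor is a simplified, hypothetical one written for this illustration (real descriptors carry many more sections), and the function name `summarize_ovf` is an assumption, not part of any vendor tool.

```python
import xml.etree.ElementTree as ET

# Namespace of the OVF 1.x envelope schema.
OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"

# Simplified, hypothetical descriptor used only for illustration.
SAMPLE_OVF = f"""<Envelope xmlns="{OVF_NS}" xmlns:ovf="{OVF_NS}">
  <References>
    <File ovf:id="file1" ovf:href="web-server-disk1.vmdk"/>
  </References>
  <DiskSection>
    <Info>Virtual disks</Info>
    <Disk ovf:diskId="vmdisk1" ovf:fileRef="file1" ovf:capacity="17179869184"/>
  </DiskSection>
  <VirtualSystem ovf:id="web-server">
    <Info>A virtual machine</Info>
  </VirtualSystem>
</Envelope>"""


def q(name):
    """Return the namespace-qualified name ElementTree uses for OVF tags and attributes."""
    return f"{{{OVF_NS}}}{name}"


def summarize_ovf(xml_text):
    """Extract virtual system IDs and disk references from an OVF descriptor."""
    root = ET.fromstring(xml_text)
    # Map each referenced file ID to the file name it points at.
    files = {f.get(q("id")): f.get(q("href")) for f in root.iter(q("File"))}
    disks = [
        {
            "disk_id": d.get(q("diskId")),
            "file": files.get(d.get(q("fileRef"))),
            # Capacity exactly as stated in the descriptor.
            "capacity": int(d.get(q("capacity"))),
        }
        for d in root.iter(q("Disk"))
    ]
    systems = [vs.get(q("id")) for vs in root.iter(q("VirtualSystem"))]
    return {"systems": systems, "disks": disks}


summary = summarize_ovf(SAMPLE_OVF)
```

The same idea extends to disk formats such as VMDK and VHD, although those are binary and need a format-aware reader or a mount step rather than an XML parser.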

Virtual machine data does not have to be executing for its state to be managed but can instead be managed and manipulated offline. This enables a new model for IT operations management in virtualized environments—offline IT management. As explored here, managing a computer in an offline state offers many advantages.

Offline IT Management Model

The offline model for IT management of virtualized environments is straightforward, allowing information access or retrieval from virtual machine data, in addition to IT management executed directly on virtual machine data (see Table 2). Information access can be used for IT-asset-management activities (such as hardware asset inventory, software inventory, and software-license auditing and security compliance). IT management can include, but is not limited to, security patches, virus-definition updates, and software upgrades. The following section covers a model that enables offline information access and servicing.
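As one illustration of offline information access, the sketch below builds a software inventory for a Debian guest by reading the dpkg status database from a mounted image. It assumes the image's root file system has already been mounted read-only on the host at a path passed in as `mounted_root`; the function names are hypothetical and the parsing is deliberately minimal.

```python
from pathlib import Path


def parse_dpkg_status(text):
    """Return {package: version} for packages whose status is 'installed'."""
    packages = {}
    # The status file is a sequence of stanzas separated by blank lines.
    for stanza in text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines (they start with whitespace).
            if line[:1] in (" ", "\t") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        status_words = fields.get("Status", "").split()
        if "Package" in fields and status_words[-1:] == ["installed"]:
            packages[fields["Package"]] = fields.get("Version", "")
    return packages


def installed_debian_packages(mounted_root):
    """Read the dpkg database inside a mounted (offline) image."""
    status_file = Path(mounted_root) / "var" / "lib" / "dpkg" / "status"
    return parse_dpkg_status(
        status_file.read_text(encoding="utf-8", errors="replace")
    )
```

Because only files are read, no agent has to run inside the guest and the virtual machine never has to be powered on to answer the inventory question.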

Offline Management Functional Model

Figure 3 shows a functional system model for offline IT management in virtualized environments. The virtual file interpreter is the core of the system, responsible for mounting the virtual machine file and providing a common, abstracted, programmatic interface to various kinds of virtual machine data (such as hardware attributes, end-user licensing information, files, and configuration registry). The interoperable file format ensures a virtual machine is accessible irrespective of host and guest platforms. The file format should be well documented and accessible on external media; for example, the OVF, VMDK, VHD, and Kernel-based Virtual Machine (KVM) file formats can be accessed by mounting externally. Virtual machine files that are encrypted or password-protected must be decrypted or given proper credentials for the virtual file interpreter to be able to mount and access the virtual machines. The action engine is responsible for carrying out platform-specific actions directed by action scripts (such as remediating vulnerabilities on a Windows system and applying patches to a Debian server).
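The division of labor between the virtual file interpreter and the action engine can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `OfflineImage` stands in for the abstracted view an interpreter might expose, in-memory dictionaries stand in for the mounted file system and registry, and the two action scripts and their keys are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class OfflineImage:
    """Hypothetical abstracted view of one offline virtual machine."""
    platform: str  # e.g. "windows" or "debian"
    files: Dict[str, bytes] = field(default_factory=dict)
    registry: Dict[str, str] = field(default_factory=dict)


# An action script takes an image, changes it, and reports what it did.
ActionScript = Callable[[OfflineImage], str]


class ActionEngine:
    """Runs only the action scripts registered for the image's platform."""

    def __init__(self) -> None:
        self._scripts: Dict[str, List[ActionScript]] = {}

    def register(self, platform: str, script: ActionScript) -> None:
        self._scripts.setdefault(platform, []).append(script)

    def run(self, image: OfflineImage) -> List[str]:
        return [script(image) for script in self._scripts.get(image.platform, [])]


def set_update_flag(image: OfflineImage) -> str:
    # Hypothetical registry key, used only for illustration.
    image.registry[r"HKLM\Software\Example\Patched"] = "1"
    return "registry flag set"


def write_package_marker(image: OfflineImage) -> str:
    # Hypothetical marker file, used only for illustration.
    image.files["/var/lib/example/refreshed"] = b"ok"
    return "package marker written"


engine = ActionEngine()
engine.register("windows", set_update_flag)
engine.register("debian", write_package_marker)
```

Keeping platform knowledge in the scripts rather than in the engine is one way the same engine could serve both Windows and Debian images.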

The model allows for scheduling updates in the background without having to power on virtual machines, possibly resulting in significant resource savings, especially in cloud environments. One obvious overhead that could thus be avoided is generation of action logic. However, in some cases the action engine and action logic may be repurposed traditional agent technology. Techniques like those used for virtualized-application-streaming package creation can also be used to automate action logic creation [8].

Virtual Machine Patching Framework

Here, we discuss a novel system for offline virtual machine patching on Windows platforms based on offline management for virtualized environments covered earlier (see Figure 4). Each virtual machine is stored as a file when offline, with its image rendered to file-system data by mounting the virtual machine’s virtual-hard-disk drive. The rendering engine uses the registry hive files and virtual hard disks to provide a computing platform that can be used by the offline virtual machine image update engine, in collaboration with the scripting engine. The scripting engine runs the patch script and updates the virtual machine image with patch files; the patch script records the registry and file system location to be updated, along with relevant metadata.

Virtual machine image mount engine. The virtual machine image mount engine is responsible for mounting the virtual machine image as a virtual disk device on the host operating system. It also generates drive-mapping information (such as that host drive X: was drive C: on the virtual machine), which is provided to the virtual machine image rendering engine.

Virtual machine image rendering engine. This engine uses the drive-mapping information and the mounted drive to locate registry hives and identify the operating system, system files, and data files. It loads the user hives and the system hives into the host operating system environment and maintains the registry mapping information. It uses drive-mapping information so the patch-script instructions are executed in the correct context; for example, the patch script may ask to update C:\Windows\System32\avifile.dll, which, in the context of the host operating system, is X:\Windows\System32\avifile.dll. Likewise, all registry update instructions from the patch script may be redirected to the correct locations (such as when the virtual machine’s HKLM\Software\CA update instruction is redirected to HKLM\VirtualHKLM\Software\CA).
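The redirection described above amounts to two small lookups, sketched here. The function names and the dictionary-based maps are assumptions made for illustration; the mapping values mirror the X: drive and VirtualHKLM examples in the text, read with the path separators restored.

```python
def to_host_path(guest_path, drive_map):
    """Translate a guest path such as C:\\... to its location on the host."""
    drive, sep, rest = guest_path.partition(":")
    key = drive.upper() + ":"
    if not sep or key not in drive_map:
        raise ValueError(f"no drive mapping for {guest_path!r}")
    return drive_map[key] + rest


def to_host_registry_key(guest_key, hive_map):
    """Redirect a guest registry key to where its hive is loaded on the host."""
    hive, sep, rest = guest_key.partition("\\")
    if hive not in hive_map:
        raise ValueError(f"no hive mapping for {guest_key!r}")
    return hive_map[hive] + sep + rest


# Mapping produced by the mount engine: guest C: is mounted as host X:.
drive_map = {"C:": "X:"}
# Mapping maintained by the rendering engine for loaded hives.
hive_map = {"HKLM": r"HKLM\VirtualHKLM"}

host_file = to_host_path(r"C:\Windows\System32\avifile.dll", drive_map)
host_key = to_host_registry_key(r"HKLM\Software\CA", hive_map)
```

Raising an error for an unmapped drive or hive, rather than passing the path through unchanged, keeps a patch script from accidentally touching the host's own files or registry.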

Offline virtual machine image-update engine. The image-update engine works closely with the scripting engine and the virtual machine image-rendering engine, executing the patch script and updating the files and registry using the patch files; Figure 5 outlines the steps required to apply the patch bundle on an offline virtual machine.
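The file-update half of this step might look like the sketch below. It does not reproduce the steps of Figure 5; it assumes a patch script reduced to a list of records with hypothetical `source` and `target` keys, a directory of patch files, and a host directory standing in for the mounted image. Registry updates are omitted because they require the Windows registry API.

```python
import shutil
from pathlib import Path, PureWindowsPath


def apply_file_updates(patch_script, patch_dir, mount_root):
    """Copy each patch file to its target location under the mounted image."""
    updated = []
    for entry in patch_script:
        guest = PureWindowsPath(entry["target"])
        # Drop the guest drive (e.g. "C:\") and re-root under the mount point.
        parts = guest.parts[1:] if guest.anchor else guest.parts
        destination = Path(mount_root).joinpath(*parts)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(Path(patch_dir) / entry["source"], destination)
        updated.append(destination)
    return updated
```

A call would pass something like `[{"source": "avifile.dll", "target": r"C:\Windows\System32\avifile.dll"}]` together with the patch directory and the mount point; the returned list records which files in the image were replaced.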

Advantages and Disadvantages of Offline Management

Why manage computers this way? Alternatively, why not manage them this way?

*  Advantages

  • The computer does not have to be started:
    • Less demand on resources, including physical computing resources (such as CPU and memory); power; and network bandwidth (if loaded from SAN or network);
  • More security:
    • A noncompliant machine does not have to execute before being brought into compliance;
    • A noncompliant machine may be inhibited from executing;
    • The runtime external attack surface is reduced since there is less need for an internal agent; and
    • A fundamental limitation of traditional host-based anti-malware systems is they run inside the very hosts they protect (called “in the box”), leaving them vulnerable to counter-detection and subversion by malware [5].
  • Vendors are regularly developing new virtualization layers (such as a user-persona layer, user-application layer, corporate-security layer, business-application layer, corporate-application layer, and operating-system layer), particularly for virtual desktops; the offline management model allows for interrogating and manipulating data at each one;
  • Speedier application of changes to computers;
  • Management code does not have to be compiled for specific platforms; there is no runtime agent so no requirement for different compilation targets and installers;
  • Potential for being able to repurpose existing agent code (such as file scanning); and
  • Discovery as file scanning.

*  Disadvantages

  • No runtime monitoring; certain management operations need access to the managed computer during execution;
  • The cost of code for new platforms (such as action logic); development costs are incurred since programmatic interfaces used today by agents are likely different from those available when using a virtualization API; and
  • No real-time update; virtual machines exist in order to execute, and once they are executing the offline model becomes less effective; it is not possible to apply changes to a running virtual machine through access to its files, though this may change.
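The "discovery as file scanning" advantage listed above can be illustrated with a short sketch that walks a storage location and groups virtual machine files by format. The suffix list and the function name are assumptions chosen for the example; a real deployment would likely inspect file contents as well, since a file extension alone is not proof of format.

```python
from pathlib import Path

# File suffixes treated as virtual machine descriptors or disks (illustrative).
IMAGE_SUFFIXES = {".ovf", ".vmdk", ".vhd"}


def discover_images(storage_root):
    """Return {suffix: [paths]} for candidate virtual machine files."""
    found = {}
    for path in sorted(Path(storage_root).rglob("*")):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in IMAGE_SUFFIXES:
            found.setdefault(suffix, []).append(path)
    return found
```

Discovery here needs neither a network probe nor a running guest, which is the contrast with the agent-based device discovery described earlier.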

Conclusion

The offline model is a promising operations-management model for virtualized environments. Managing virtualized environments through agent-based tools is convenient and seamless but does not take advantage of the unique characteristics of virtual systems. The offline model augments traditional management models, promising to be more effective and less resource intensive, even though challenges complicate creation of a seamless management environment in which both physical and virtual systems are managed identically; some aspects of runtime management may require other approaches. The model, which is scalable, is based on constructing a logical view based on interoperable virtual machine file formats and can be applied to a range of operations-management tasks. Keeping in mind that all nodes share resources in virtualized environments, moving to an offline model would help administrators manage these shared resources more efficiently.

Figures

Figure 1. Information perimeter and IT operations management perimeter.

Figure 2. Security vulnerability when applying traditional tools in virtualized environments.

Figure 3. Offline management functional model.

Figure 4. Offline virtual machine patching framework.

Figure 5. Offline patching process.

Tables

Table 1. Operations management challenges in virtual machines.

Table 2. IT management use cases for offline virtual machine management.
