How virtual machines work

From Samuel T. King, Peter M. Chen, Yi-Min Wang, Chad Verbowski, Helen J. Wang, & Jacob R. Lorch’s “SubVirt: Implementing malware with virtual machines
” [PDF] (: ):

A virtual-machine monitor (VMM) manages the resources of the underlying hardware and provides an abstraction of one or more virtual machines [20]. Each virtual machine can run a complete operating system and its applications. Figure 1 shows the architecture used by two modern VMMs (VMware and VirtualPC). Software running within a virtual machine is called guest software (i.e., guest operating systems and guest applications). All guest software (including the guest OS) runs in user mode; only the VMM runs in the most privileged level (kernel mode). The host OS in Figure 1 is used to provide portable access to a wide variety of I/O devices [44].

VMMs export hardware-level abstractions to guest software using emulated hardware. The guest OS interacts with the virtual hardware in the same manner as it would with real hardware (e.g., in/out instructions, DMA), and these interactions are trapped by the VMM and emulated in software. This emulation allows the guest OS to run without modification while maintaining control over the system at the VMM layer.

A VMM can support multiple OSes on one computer by multiplexing that computer’s hardware and providing the illusion of multiple, distinct virtual computers, each of which can run a separate operating system and its applications. The VMM isolates all resources of each virtual computer through redirection. For example, the VMM can map two virtual disks to different sectors of a shared physical disk, and the VMM can map the physical memory space of each virtual machine to different pages in the real machine’s memory. In addition to multiplexing a computer’s hardware, VMMs also provide a powerful platform for adding services to an existing system. For example, VMMs have been used to debug operating systems and system configurations [30, 49], migrate live machines [40], detect or prevent intrusions [18, 27, 8], and attest for code integrity [17]. These VM services are typically implemented outside the guest they are serving in order to avoid perturbing the guest.

One problem faced by VM services is the difficulty in understanding the states and events inside the guest they are serving; VM services operate at a different level of abstraction from guest software. Software running outside of a virtual machine views lowlevel virtual-machine state such as disk blocks, network packets, and memory. Software inside the virtual machine interprets this state as high-level abstractions such as files, TCP connections, and variables. This gap between the VMM’s view of data/events and guest software’s view of data/events is called the semantic gap [13].

Virtual-machine introspection (VMI) [18, 27] describes a family of techniques that enables a VM service to understand and modify states and events within the guest. VMI translates variables and guest memory addresses by reading the guest OS and applications’ symbol tables and page tables. VMI uses hardware or software breakpoints to enable a VM service to gain control at specific instruction addresses. Finally, VMI allows a VM service to invoke guest OS or application code. Invoking guest OS code allows the VM service to leverage existing, complex guest code to carry out general-purpose functionality such as reading a guest file from the file cache/disk system. VM services can protect themselves from guest code by disallowing external I/O. They can protect the guest data from perturbation by checkpointing it before changing its state and rolling the guest back later.