

2.11 Virtual Machine Applications

The vm universe supports a Condor job that is matched with an execute machine within a Condor pool and then lands a disk image on that machine. This disk image is intended to be a virtual machine. In this manner, the virtual machine is the job to be executed.

This section describes this Condor job. See section 3.3.28 for details of configuration variables.


2.11.1 The Submit Description File

Unlike jobs in all other universes, a vm universe job specifies a disk image, not an executable. Therefore, the submit commands input, output, and error do not apply. If any of them is specified, condor_submit rejects the job with an error. The executable command takes on a different meaning within a vm universe job. It no longer specifies an executable file, but instead provides a string that identifies the job for tools such as condor_q. Other commands, specific to the type of virtual machine software, identify the disk image.

Use of the args command creates a file named condor.arg, which is added to the set of CD-ROM files. The contents of this file are the arguments specified.

VMware and Xen virtual machine software are supported. As the two differ from each other, the submit description file specifies either

  vm_type = vmware
or
  vm_type = xen

The job specifies the memory needed by the virtual machine with vm_memory, which is given in Mbytes. Condor uses this number to ensure a match with a machine that can provide the needed memory.

A CD-ROM for the virtual machine is composed of a set of files. These files are specified in the submit description file with a comma-separated list of file names.

  vm_cdrom_files = a.txt,b.txt,c.txt
Condor must also be told to transfer these files from the submit machine to the machine that will run the vm universe job with
  vm_should_transfer_cdrom_files = YES

Creating a checkpoint is straightforward for a virtual machine, as a checkpoint is a set of files that represents a snapshot of both the disk image and memory. The checkpoint is created, and all of its files are transferred back to the $(SPOOL) directory on the machine from which the job was submitted. vm universe jobs cannot use a checkpoint server. The submit command to create checkpoints is

  vm_checkpoint = true
By default, when this command is not given, no checkpoints are created.

Virtual machine networking is enabled with the command

  vm_networking = true
When networking is enabled, a definition of vm_networking_type as bridge matches the job only with a machine that is configured to use bridge networking. A definition of vm_networking_type as nat matches the job only with a machine that is configured to use NAT networking. When no definition of vm_networking_type is given, Condor may match the job with any machine that enables networking, and the choice of bridge or NAT networking is determined by that machine's configuration.
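
As an illustration, a job that should run only on machines configured for NAT networking could combine the two commands as follows. This is a minimal fragment built from the commands described above, not a complete submit description file:

  vm_networking = true
  vm_networking_type = nat
Replacing nat with bridge would instead restrict matches to machines configured for bridge networking.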

As a current limitation, networking is available only to vm universe jobs that do not create checkpoints for the purpose of migrating to another machine.

When both checkpoints and networking are enabled, the job further specifies

  when_to_transfer_output = ON_EXIT_OR_EVICT

Modified disk images are transferred back to the machine from which the job was submitted as the vm universe job completes. Job completion for a vm universe job occurs when the virtual machine is shut down, and Condor notices (as the result of a periodic check on the state of the virtual machine). Should the job not want any files transferred back (modified or not), for example because the job explicitly transferred its own files, the submit command to prevent the transfer is

  vm_no_output_vm = true

Further commands specify information that is specific to the virtual machine type targeted.


2.11.1.1 VMware-Specific Submit Commands

Specific to VMware, the submit description file command vmware_dir gives the path to the directory (on the machine from which the job is submitted) where VMware-specific files and applications reside. One example of VMware-specific files is the set of VMDK files, which form a virtual hard drive (disk image) for the virtual machine. VMX files, which contain the primary configuration for the virtual machine, would also be in this directory.

Condor must be told whether or not the contents of the vmware_dir directory must be transferred to the machine where the job is to be executed. This required information is given with the submit command vmware_should_transfer_files. With a value of True, Condor does transfer the contents of the directory. With a value of False, Condor does not transfer the contents of the directory, and instead presumes that access to this directory is available through a shared file system.

By default, Condor uses a snapshot disk for new and modified files. A snapshot disk may also be used for checkpoints. The snapshot disk is initially quite small, growing only as new files are created or existing files are modified. When vmware_should_transfer_files is True, a job may specify that a snapshot disk is not to be used with the command

  vmware_snapshot_disk = False
In this case, Condor will utilize original disk files in producing checkpoints. Note that condor_submit issues an error message and does not submit the job if both vmware_should_transfer_files and vmware_snapshot_disk are False.

Note that if snapshot disks are requested and file transfer is not being used, the vmware_dir setting given in the submit description file should not contain any symlink path components. This works around the issue discussed in the FAQ entry in section 7.3.

Here is a sample submit description file for a VMware virtual machine:

universe                     = vm
executable                   = vmware_sample_job
log                          = simple.vm.log.txt
vm_type                      = vmware
vm_memory                    = 64
vmware_dir                   = C:\condor-test
vmware_should_transfer_files = True
queue
This sample uses the vmware_dir command to identify the location of the disk image to be executed as a Condor job. The contents of this directory are transferred to the machine assigned to execute the Condor job.
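
A variant of the sample above may help show the shared file system case. The following is only a sketch: it assumes that the directory /shared/vmware/condor-test (a hypothetical path, chosen to contain no symlink components) is reachable from both the submit machine and the execute machine, and the job name given in executable is likewise made up for illustration.

universe                     = vm
executable                   = vmware_shared_fs_job
log                          = simple.vm.log.txt
vm_type                      = vmware
vm_memory                    = 64
vmware_dir                   = /shared/vmware/condor-test
vmware_should_transfer_files = False
vmware_snapshot_disk         = True
queue
Because vmware_should_transfer_files is False here, vmware_snapshot_disk must not also be False, or condor_submit refuses the job. The sketch states the default value of True explicitly to make that dependency visible.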


2.11.1.2 Xen-Specific Submit Commands

The required disk image must be identified for a Xen virtual machine. The xen_disk command specifies a comma-separated list of disk files. Each disk file is specified by three colon-separated fields. The first field is the path and file name of the disk file. The second field specifies the device, and the third field specifies permissions. Here is an example that identifies two files:

  xen_disk = /myxen/diskfile.img:sda1:w,/myxen/swap.img:sda2:w

If any files need to be transferred from the submit machine to the machine where the vm universe job will execute, Condor must be explicitly told to do so with the xen_transfer_files command:

  xen_transfer_files = /myxen/diskfile.img,/myxen/swap.img
Any and all needed files on a system without a shared file system (between the submit machine and the machine where the job will execute) must be listed.

A Xen vm universe job requires specification of the guest kernel. The xen_kernel command accomplishes this, utilizing one of the following definitions.

  1. xen_kernel = any tells Condor that the kernel is pre-staged, and its location is specified by configuration of the condor_vm-gahp.

  2. xen_kernel = included implies that the kernel is to be found in the disk image given by the single file specified in xen_disk.

  3. xen_kernel = path-to-kernel gives the full path and file name of the required kernel. If this kernel must be transferred to the machine on which the vm universe job will execute, it must also be included in the xen_transfer_files command.

    This form of the xen_kernel command also requires further definition of the xen_root command. xen_root defines the device containing files needed by root.

Transfer of CD-ROM files under Xen requires the definition of the associated device in addition to the specification of the files. The submit description file contains

  vm_cdrom_files = a.txt,b.txt,c.txt
  vm_should_transfer_cdrom_files = YES
  xen_cdrom_device = device-name
where the last line of this example defines the device.
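
Putting the Xen-specific commands together, a complete submit description file might look like the following sketch. It is assembled only from commands described in this section and mirrors the VMware sample; the job name, log file name, and paths are illustrative assumptions. It also assumes there is no shared file system, so the disk image is listed in xen_transfer_files, and that the kernel resides inside the single disk image, so xen_kernel = included applies.

universe           = vm
executable         = xen_sample_job
log                = simple.vm.log.txt
vm_type            = xen
vm_memory          = 64
xen_disk           = /myxen/diskfile.img:sda1:w
xen_transfer_files = /myxen/diskfile.img
xen_kernel         = included
queue
Only one file appears in xen_disk because xen_kernel = included refers to the single file specified there.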


2.11.2 Checkpoints

This section has not yet been written.


2.11.3 Disk Images


2.11.3.1 VMware on Windows and Linux

To create a VMware disk image, follow the platform-specific guest OS installation instructions found at http://pubs.vmware.com/guestnotes.


2.11.3.2 Xen

This section has not yet been written.


2.11.4 Job Completion in the vm Universe

Job completion for a vm universe job occurs when the virtual machine is shut down, and Condor notices (as the result of a periodic check on the state of the virtual machine). This is different from jobs executed under the environment of other universes.

Shutdown of a virtual machine is initiated from within the virtual machine environment. Under a Windows 2000, Windows XP, or Vista virtual machine, an administrator issues the command

  shutdown -s -t 01
For older versions of Windows operating systems, directions are given at http://www.aumha.org/win4/a/shutcut.php.

Under a Linux virtual machine, the root user executes

  /sbin/poweroff
The command /sbin/halt will not completely shut down some Linux distributions, and instead causes the job to hang.

Since the successful completion of a vm universe job requires the successful shutdown of the virtual machine, it is advisable to test the shutdown procedure outside of Condor before submitting a vm universe job.


condor-admin@cs.wisc.edu