4.3 The Condor Perl Module

The Condor Perl module facilitates automatic submission and monitoring of Condor jobs, along with automated administration of Condor. Its most common use is monitoring Condor jobs, which it does by reading the user log of each job.

The Condor Perl module is made up of several subroutines. Many of them take other subroutines as arguments; these are callbacks, which the module calls when interesting events happen.
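To make the callback idea concrete, here is a minimal pure-Perl sketch of registering a subroutine and having it called later. It does not use the Condor module: register_handler, fire_event, and the %handlers table are hypothetical names chosen for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a module that stores callbacks.
# This is NOT part of the Condor module.
my %handlers;

# Store a code reference under an event name.
sub register_handler {
    my ($event, $callback) = @_;
    $handlers{$event} = $callback;
}

# Call the stored callback for an event, if one was registered,
# and return whatever the callback returns.
sub fire_event {
    my ($event, @args) = @_;
    my $callback = $handlers{$event} or return;
    return $callback->(@args);
}

# An anonymous subroutine held in a scalar, ready to be passed around.
my $on_exit = sub {
    my ($cluster, $job) = @_;
    return "Job $cluster.$job exited";
};

register_handler('exit', $on_exit);
print fire_event('exit', 12, 0), "\n";   # prints "Job 12.0 exited"
```

The Register subroutines described in the next section are used in the same way: the script passes a code reference, and the module decides when to call it.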

4.3.1 Subroutines

1.
Submit(command_file)
The Submit subroutine takes the name of a command file as an argument and submits it to Condor. The condor_submit program should be in the user's path. To monitor the job, the user must specify a log file in the command file. Submit returns the number of the cluster that was submitted. For more information see the condor_submit man page.
2.
Vacate(machine)
Vacate the specified machine. The machine may be specified either by hostname or by sinful string. For more information see the condor_vacate man page.

3.
Reschedule(machine)
Reschedule the specified machine. The machine may be specified either by hostname or by sinful string. For more information see the condor_reschedule man page.

4.
RegisterEvicted(sub)
Register an eviction handler that is called any time a job from the cluster being monitored is evicted. The eviction handler is called with two arguments: cluster and job. These are the cluster number and process number of the job that was evicted.
5.
RegisterEvictedWithCheckpoint(sub)
Same as RegisterEvicted, except that the handler is called only when the evicted job was checkpointed.

6.
RegisterEvictedWithoutCheckpoint(sub)
Same as RegisterEvicted, except that the handler is called only when the evicted job was not checkpointed.

7.
RegisterExit(sub)
Register a termination handler that is called when a job exits. The termination handler is called with two arguments: cluster and job. These are the cluster and process numbers of the exiting job.
8.
RegisterExitSuccess(sub)
Register a termination handler that is called when a job exits without errors. The termination handler is called with two arguments: cluster and job. These are the cluster and process numbers of the exiting job.

9.
RegisterExitFailure(sub)
Register a termination handler that is called when a job exits with errors. The termination handler is called with three arguments: cluster, job, and retval. The cluster and job are the cluster and process numbers of the exiting job, and retval is the exit code of the job.

10.
RegisterExitAbnormal(sub)
Register a termination handler that is called when a job exits abnormally (segmentation fault, bus error, ...). The termination handler is called with four arguments: cluster, job, signal, and core. The cluster and job are the cluster and process numbers of the exiting job. The signal is the signal that the job died with, and core indicates whether a core file was created and, if so, gives the full path to the core file.

11.
RegisterAbort(sub)
Register a handler that is called when a job is aborted by a user.

12.
RegisterJobErr(sub)
Register a handler that is called when a job is not executable.

13.
RegisterExecute(sub)
Register an execution handler that is called whenever a job starts running on a given host. The handler is called with four arguments: cluster, job, host, and sinful. Cluster and job are the cluster and process numbers for the job, host is the Internet address of the machine running the job, and sinful is the Internet address and command port of the condor_starter supervising the job.

14.
RegisterSubmit(sub)
Register a submit handler that is called whenever a job is submitted with the given cluster. The handler is called with four arguments: cluster, job, host, and sinful. Cluster and job are the cluster and process numbers for the job, host is the Internet address of the machine running the job, and sinful is the Internet address and command port of the condor_schedd responsible for the job.

15.
Monitor(cluster)
Begin monitoring the specified cluster. Monitor starts a sub process to do the monitoring, so other actions may proceed in the main loop of the Perl script. However, handlers cannot rely on being able to communicate back to the main script simply by changing variables.
16.
Wait()
Wait until all monitors finish and exit.

17.
DebugOn()
Turn debug messages on. This may be useful if you don't understand what your script is doing.

18.
DebugOff()
Turn debug messages off.
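As a sketch of how the exit handlers above might be combined, the following script reports failed and abnormal exits separately. It follows the calling convention used by the example in section 4.3.2, where handler arguments arrive as a key/value list. The key names retval, signal, and core are assumptions taken from the argument names listed above, the command file name is borrowed from that example, and describe_failure and describe_abnormal are hypothetical helpers. The calls to Condor are guarded so that the script still loads on a machine where the module is not installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a one-line report for a job that exited with errors.
# ASSUMPTION: the handler receives keys named cluster, job and retval.
sub describe_failure {
    my %info = @_;
    return "Job $info{cluster}.$info{job} failed with exit code $info{retval}";
}

# Build a one-line report for a job killed by a signal.
# ASSUMPTION: the handler receives keys named cluster, job, signal and core.
sub describe_abnormal {
    my %info = @_;
    my $core = $info{core} ? "core file: $info{core}" : "no core file";
    return "Job $info{cluster}.$info{job} died on signal $info{signal} ($core)";
}

# Only talk to Condor if the module can actually be loaded.
if (eval { require Condor; 1 }) {
    my $cluster = Condor::Submit('mycmdfile.cmd');
    Condor::RegisterExitFailure(sub { print describe_failure(@_), "\n" });
    Condor::RegisterExitAbnormal(sub { print describe_abnormal(@_), "\n" });
    Condor::Monitor($cluster);
    Condor::Wait();
} else {
    print "Condor module not available; handlers were not registered.\n";
}
```

Keeping the message formatting in plain subroutines means the reports can be checked by hand, without submitting a job.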

4.3.2 An Example

The following is a simple example of using the Condor Perl module.
#!/usr/bin/perl
use strict;
use Condor;

my $CMD_FILE = 'mycmdfile.cmd';
my $evicts = 0;
my $vacates = 0;

# Callback for a job that exits without errors: print a summary and quit.
my $normal = sub
{
    my %parameters = @_;
    my $cluster = $parameters{'cluster'};
    my $job = $parameters{'job'};

    print "Job $cluster.$job exited normally without errors.\n";
    print "Job was vacated $vacates times and evicted $evicts times\n";
    exit(0);
};

# Callback for an eviction: count it and ask Condor to reschedule.
my $evicted = sub
{
    my %parameters = @_;
    my $cluster = $parameters{'cluster'};
    my $job = $parameters{'job'};

    print "Job $cluster.$job was evicted.\n";
    $evicts++;
    &Condor::Reschedule();
};

# Callback for a job that starts running: vacate the machine it landed on.
my $execute = sub
{
    my %parameters = @_;
    my $cluster = $parameters{'cluster'};
    my $job = $parameters{'job'};
    my $host = $parameters{'host'};
    my $sinful = $parameters{'sinful'};

    print "Job running on $sinful, vacating...\n";
    &Condor::Vacate($sinful);
    $vacates++;
};

# Submit the job, register the callbacks, and monitor until the job exits.
my $cluster = Condor::Submit($CMD_FILE);
&Condor::RegisterExitSuccess($normal);
&Condor::RegisterEvicted($evicted);
&Condor::RegisterExecute($execute);
&Condor::Monitor($cluster);
&Condor::Wait();

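This example program submits the command file 'mycmdfile.cmd' and attempts to vacate any machine that the job runs on. Each eviction is counted and followed by a reschedule request. When the job finally exits without errors, the termination handler prints a summary of how many times the job was vacated and evicted.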
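The counters in the example are read and written only inside handlers. The description of Monitor warns that handlers cannot pass information back to the main script just by changing variables, because monitoring happens in a sub process. The following pure-Perl sketch, which does not use the Condor module and assumes a Unix-like system, shows that effect with fork. The pipe is one possible way to report a value back and is an illustration only, not something the module provides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $evicts = 0;

# A pipe gives the child a channel back to the parent.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: stands in for the monitoring sub process running a handler.
    close $reader;
    $evicts++;                      # changes only the child's copy
    print {$writer} "$evicts\n";    # report the new value explicitly
    close $writer;
    exit(0);
}

# Parent: stands in for the main script.
close $writer;
my $reported = <$reader>;
chomp $reported;
close $reader;
waitpid($pid, 0);

print "parent's own copy: $evicts\n";          # prints 0
print "value reported by child: $reported\n";  # prints 1
```

The parent's variable stays at 0 even though the child incremented its own copy, which is why the example prints its summary from inside a handler rather than from the main script.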


condor-admin@cs.wisc.edu