A newer version can be found here

Status and State Numbers

This is just my working collection of notes for various magic numbers in the bits of software I deal with on a regular basis. These are mostly extracted from the source files for the software in question. As it was originally written to be a set of notes for myself, I can't make any promises that it's accurate or up to date.

HTCondor

Universe

JobUniverse in job ClassAds

Value  Name           Description
0      Min            A placeholder, not a universe
1      Standard       Single process relinked jobs
2      Pipe           A placeholder, no longer used
3      Linda          A placeholder, no longer used
4      PVM            Parallel Virtual Machine apps
5      Vanilla        Single process non-relinked jobs
6      PVMD           PVM daemon process
7      Scheduler      A job run under the schedd
8      MPI            Message Passing Interface jobs
9      Grid / Globus  Jobs managed by condor_gridmanager (V6.6: always Globus, V6.7: grid_type=gt2, gt3, gt5, condor, oracle, nordugrid...)
10     Java           Jobs for the Java Virtual Machine
11     Parallel       Generalized parallel jobs
12     Local          A job run under the schedd using a starter (advanced form of Scheduler)
13     Max            A placeholder, not a universe
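
When a script reads JobUniverse out of a job ClassAd, the table above can be carried around as an enum. This is only a sketch in Python: the class name JobUniverse, the member spellings, and the helper universe_name are my own choices for illustration, not anything taken from the HTCondor source. The numbers are just the ones listed above.

```python
from enum import IntEnum


class JobUniverse(IntEnum):
    """JobUniverse numbers from the table above (member names are illustrative)."""
    MIN = 0        # placeholder, not a universe
    STANDARD = 1
    PIPE = 2       # placeholder, no longer used
    LINDA = 3      # placeholder, no longer used
    PVM = 4
    VANILLA = 5
    PVMD = 6
    SCHEDULER = 7
    MPI = 8
    GRID = 9       # "Grid / Globus" in the table
    JAVA = 10
    PARALLEL = 11
    LOCAL = 12
    MAX = 13       # placeholder, not a universe


def universe_name(value):
    """Return the enum member name for a JobUniverse number, or a fallback string."""
    try:
        return JobUniverse(value).name
    except ValueError:
        return "Unknown ({})".format(value)


print(universe_name(5))
print(universe_name(42))
```

Numbers outside the table fall through to the "Unknown" string rather than raising, which is usually what a log-reading script wants.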

Job Status

JobStatus in job ClassAds

Value  Name            Code
0      Unexpanded      U
1      Idle            I
2      Running         R
3      Removed         X
4      Completed       C
5      Held            H
6      Submission_err  E
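
A quick way to use the JobStatus table is to turn a list of status numbers into a per-letter count. The sketch below assumes the numbers have already been pulled out of the job ads somehow. JOB_STATUS, status_letter, and summarize are hypothetical names I made up, and the "?" fallback for unlisted values is my own convention.

```python
# JobStatus number -> (name, one-letter code), copied from the table above.
JOB_STATUS = {
    0: ("Unexpanded", "U"),
    1: ("Idle", "I"),
    2: ("Running", "R"),
    3: ("Removed", "X"),
    4: ("Completed", "C"),
    5: ("Held", "H"),
    6: ("Submission_err", "E"),
}


def status_letter(job_status):
    """Return the one-letter code for a JobStatus value, or '?' if it is not listed."""
    return JOB_STATUS.get(job_status, ("Unknown", "?"))[1]


def summarize(statuses):
    """Count how many jobs fall under each status letter."""
    counts = {}
    for status in statuses:
        letter = status_letter(status)
        counts[letter] = counts.get(letter, 0) + 1
    return counts


print(summarize([1, 1, 2, 5]))
```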

Notification

JobNotification in job ClassAds

Value  Name
0      Never
1      Always
2      Complete
3      Error

Shadow exit status

(Source: h/exit.h)

Value Name Description
4 JOB_EXCEPTION The job exited with an exception
44 DPRINTF_ERROR There is a fatal error with dprintf()
100 JOB_EXITED The job exited (not killed)
101 JOB_CKPTED The job was checkpointed
102 JOB_KILLED The job was killed
103 JOB_COREDUMPED The job was killed and a core file produced
105 JOB_NO_MEM Not enough memory to start the shadow
106 JOB_SHADOW_USAGE Incorrect arguments to condor_shadow
107 JOB_NOT_CKPTED The job was kicked off without a checkpoint
107 JOB_SHOULD_REQUEUE (!) We define this to the same number, since we want the same behavior. However, "JOB_NOT_CKPTED" doesn't mean much if we're not a standard universe job. The effect of this exit code is that we want the job to be put back in the job queue and run again.
108 JOB_NOT_STARTED Can't connect to startd or request refused
109 JOB_BAD_STATUS Job status != RUNNING on startup
110 JOB_EXEC_FAILED Exec failed for some reason other than ENOMEM
111 JOB_NO_CKPT_FILE There is no checkpoint file (lost)
112 JOB_SHOULD_HOLD The job should be put on hold
113 JOB_SHOULD_REMOVE The job should be removed
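
Because 107 is shared by two names, a lookup keyed on the number has to return a list. The following is a rough sketch in Python: SHADOW_EXIT, exit_names, and queue_action are hypothetical names, and queue_action encodes only my reading of the descriptions above (requeue, hold, remove), not logic copied from the schedd.

```python
# Shadow exit value -> list of names from h/exit.h as listed above.
SHADOW_EXIT = {
    4: ["JOB_EXCEPTION"],
    44: ["DPRINTF_ERROR"],
    100: ["JOB_EXITED"],
    101: ["JOB_CKPTED"],
    102: ["JOB_KILLED"],
    103: ["JOB_COREDUMPED"],
    105: ["JOB_NO_MEM"],
    106: ["JOB_SHADOW_USAGE"],
    107: ["JOB_NOT_CKPTED", "JOB_SHOULD_REQUEUE"],  # same number on purpose
    108: ["JOB_NOT_STARTED"],
    109: ["JOB_BAD_STATUS"],
    110: ["JOB_EXEC_FAILED"],
    111: ["JOB_NO_CKPT_FILE"],
    112: ["JOB_SHOULD_HOLD"],
    113: ["JOB_SHOULD_REMOVE"],
}


def exit_names(code):
    """Return every name that maps to this exit value (empty list if unlisted)."""
    return SHADOW_EXIT.get(code, [])


def queue_action(code):
    """Guess what happens to the job in the queue, based only on the notes above."""
    if code == 107:
        return "requeue"
    if code == 112:
        return "hold"
    if code == 113:
        return "remove"
    return None  # the notes above do not say


print(exit_names(107), queue_action(107))
```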

Log event codes

Value  Event
0      Submit
1      Execute
2      Executable error
3      Checkpointed
4      Job evicted
5      Job terminated
6      Image size
7      Shadow exception
8      Generic
9      Job aborted
10     Job suspended
11     Job unsuspended
12     Job held
13     Job released
14     Node execute
15     Node terminated
16     Post script terminated
17     Globus submit
18     Globus submit failed
19     Globus resource up
20     Globus resource down
21     Remote error
Remote error21

Starters and Shadows

Shadow  Starter  Universe
jim     jim      PVM
V6      V5       Standard
V6.1    V6.1     Everything else

Globus

GRAM Error Codes

From Globus 2.2.4, globus_gram_protocol-5.0/globus_gram_protocol_constants.h and globus_gram_protocol_error.c

Value GLOBUS_GRAM_PROTOCOL_ERROR_... the job failed
0 Success
1 PARAMETER_NOT_SUPPORTED one of the RSL parameters is not supported
2 INVALID_REQUEST the RSL length is greater than the maximum allowed
3 NO_RESOURCES an I/O operation failed
4 BAD_DIRECTORY jobmanager unable to set default to the directory requested
5 EXECUTABLE_NOT_FOUND the executable does not exist
6 INSUFFICIENT_FUNDS of an unused INSUFFICIENT_FUNDS
7 AUTHORIZATION authentication with the remote server failed
8 USER_CANCELLED the user cancelled the job
9 SYSTEM_CANCELLED the system cancelled the job
10 PROTOCOL_FAILED data transfer to the server failed
11 STDIN_NOT_FOUND the stdin file does not exist
12 CONNECTION_FAILED the connection to the server failed (check host and port)
13 INVALID_MAXTIME the provided RSL 'maxtime' value is not an integer
14 INVALID_COUNT the provided RSL 'count' value is not an integer
15 NULL_SPECIFICATION_TREE the job manager received an invalid RSL
16 JM_FAILED_ALLOW_ATTACH the job manager failed in allowing others to make contact
17 JOB_EXECUTION_FAILED the job failed when the job manager attempted to run it
18 INVALID_PARADYN an invalid paradyn was specified
19 INVALID_JOBTYPE the provided RSL 'jobtype' value is invalid
20 INVALID_GRAM_MYJOB the provided RSL 'myjob' value is invalid
21 BAD_SCRIPT_ARG_FILE the job manager failed to locate an internal script argument file
22 ARG_FILE_CREATION_FAILED the job manager failed to create an internal script argument file
23 INVALID_JOBSTATE the job manager detected an invalid job state
24 INVALID_SCRIPT_REPLY the job manager detected an invalid script response
25 INVALID_SCRIPT_STATUS the job manager detected an invalid script status
26 JOBTYPE_NOT_SUPPORTED the provided RSL 'jobtype' value is not supported by this job manager
27 UNIMPLEMENTED unused ERROR_UNIMPLEMENTED
28 TEMP_SCRIPT_FILE_FAILED the job manager failed to create an internal script submission file
29 USER_PROXY_NOT_FOUND the job manager cannot find the user proxy
30 OPENING_USER_PROXY the job manager failed to open the user proxy
31 JOB_CANCEL_FAILED the job manager failed to cancel the job as requested
32 MALLOC_FAILED system memory allocation failed
33 DUCT_INIT_FAILED the interprocess job communication initialization failed
34 DUCT_LSP_FAILED the interprocess job communication setup failed
35 INVALID_HOST_COUNT the provided RSL 'host count' value is invalid
36 UNSUPPORTED_PARAMETER one of the provided RSL parameters is unsupported
37 INVALID_QUEUE the provided RSL 'queue' parameter is invalid
38 INVALID_PROJECT the provided RSL 'project' parameter is invalid
39 RSL_EVALUATION_FAILED the provided RSL string includes variables that could not be identified
40 BAD_RSL_ENVIRONMENT the provided RSL 'environment' parameter is invalid
41 DRYRUN the provided RSL 'dryrun' parameter is invalid
42 ZERO_LENGTH_RSL the provided RSL is invalid (an empty string)
43 STAGING_EXECUTABLE the job manager failed to stage the executable
44 STAGING_STDIN the job manager failed to stage the stdin file
45 INVALID_JOB_MANAGER_TYPE the requested job manager type is invalid
46 BAD_ARGUMENTS the provided RSL 'arguments' parameter is invalid
47 GATEKEEPER_MISCONFIGURED the gatekeeper failed to run the job manager
48 BAD_RSL the provided RSL could not be properly parsed
49 VERSION_MISMATCH there is a version mismatch between GRAM components
50 RSL_ARGUMENTS the provided RSL 'arguments' parameter is invalid
51 RSL_COUNT the provided RSL 'count' parameter is invalid
52 RSL_DIRECTORY the provided RSL 'directory' parameter is invalid
53 RSL_DRYRUN the provided RSL 'dryrun' parameter is invalid
54 RSL_ENVIRONMENT the provided RSL 'environment' parameter is invalid
55 RSL_EXECUTABLE the provided RSL 'executable' parameter is invalid
56 RSL_HOST_COUNT the provided RSL 'host_count' parameter is invalid
57 RSL_JOBTYPE the provided RSL 'jobtype' parameter is invalid
58 RSL_MAXTIME the provided RSL 'maxtime' parameter is invalid
59 RSL_MYJOB the provided RSL 'myjob' parameter is invalid
60 RSL_PARADYN the provided RSL 'paradyn' parameter is invalid
61 RSL_PROJECT the provided RSL 'project' parameter is invalid
62 RSL_QUEUE the provided RSL 'queue' parameter is invalid
63 RSL_STDERR the provided RSL 'stderr' parameter is invalid
64 RSL_STDIN the provided RSL 'stdin' parameter is invalid
65 RSL_STDOUT the provided RSL 'stdout' parameter is invalid
66 OPENING_JOBMANAGER_SCRIPT the job manager failed to locate an internal script
67 CREATING_PIPE the job manager failed on the system call pipe()
68 FCNTL_FAILED the job manager failed on the system call fcntl()
69 STDOUT_FILENAME_FAILED the job manager failed to create the temporary stdout filename
70 STDERR_FILENAME_FAILED the job manager failed to create the temporary stderr filename
71 FORKING_EXECUTABLE the job manager failed on the system call fork()
72 EXECUTABLE_PERMISSIONS the executable file permissions do not allow execution
73 OPENING_STDOUT the job manager failed to open stdout
74 OPENING_STDERR the job manager failed to open stderr
75 OPENING_CACHE_USER_PROXY the cache file could not be opened in order to relocate the user proxy
76 OPENING_CACHE cannot access cache files in ~/.globus/.gass_cache, check permissions, quota, and disk space
77 INSERTING_CLIENT_CONTACT the job manager failed to insert the contact in the client contact list
78 CLIENT_CONTACT_NOT_FOUND the contact was not found in the job manager's client contact list
79 CONTACTING_JOB_MANAGER connecting to the job manager failed. Possible reasons: job terminated, invalid job contact, network problems, ...
80 INVALID_JOB_CONTACT the syntax of the job contact is invalid
81 UNDEFINED_EXE the executable parameter in the RSL is undefined
82 CONDOR_ARCH the job manager service is misconfigured. condor arch undefined
83 CONDOR_OS the job manager service is misconfigured. condor os undefined
84 RSL_MIN_MEMORY the provided RSL 'min_memory' parameter is invalid
85 RSL_MAX_MEMORY the provided RSL 'max_memory' parameter is invalid
86 INVALID_MIN_MEMORY the RSL 'min_memory' value is not zero or greater
87 INVALID_MAX_MEMORY the RSL 'max_memory' value is not zero or greater
88 HTTP_FRAME_FAILED the creation of a HTTP message failed
89 HTTP_UNFRAME_FAILED parsing incoming HTTP message failed
90 HTTP_PACK_FAILED the packing of information into a HTTP message failed
91 HTTP_UNPACK_FAILED an incoming HTTP message did not contain the expected information
92 INVALID_JOB_QUERY the job manager does not support the service that the client requested
93 SERVICE_NOT_FOUND the gatekeeper failed to find the requested service
94 JOB_QUERY_DENIAL the jobmanager does not accept any new requests (shutting down)
95 CALLBACK_NOT_FOUND the client failed to close the listener associated with the callback URL
96 BAD_GATEKEEPER_CONTACT the gatekeeper contact cannot be parsed
97 POE_NOT_FOUND the job manager could not find the 'poe' command
98 MPIRUN_NOT_FOUND the job manager could not find the 'mpirun' command
99 RSL_START_TIME the provided RSL 'start_time' parameter is invalid
100 RSL_RESERVATION_HANDLE the provided RSL 'reservation_handle' parameter is invalid
101 RSL_MAX_WALL_TIME the provided RSL 'max_wall_time' parameter is invalid
102 INVALID_MAX_WALL_TIME the RSL 'max_wall_time' value is not zero or greater
103 RSL_MAX_CPU_TIME the provided RSL 'max_cpu_time' parameter is invalid
104 INVALID_MAX_CPU_TIME the RSL 'max_cpu_time' value is not zero or greater
105 JM_SCRIPT_NOT_FOUND the job manager is misconfigured, a scheduler script is missing
106 JM_SCRIPT_PERMISSIONS the job manager is misconfigured, a scheduler script has invalid permissions
107 SIGNALING_JOB the job manager failed to signal the job
108 UNKNOWN_SIGNAL_TYPE the job manager did not recognize/support the signal type
109 GETTING_JOBID the job manager failed to get the job id from the local scheduler
110 WAITING_FOR_COMMIT the job manager is waiting for a commit signal
111 COMMIT_TIMED_OUT the job manager timed out while waiting for a commit signal
112 RSL_SAVE_STATE the provided RSL 'save_state' parameter is invalid
113 RSL_RESTART the provided RSL 'restart' parameter is invalid
114 RSL_TWO_PHASE_COMMIT the provided RSL 'two_phase' parameter is invalid
115 INVALID_TWO_PHASE_COMMIT the RSL 'two_phase' value is not zero or greater
116 RSL_STDOUT_POSITION the provided RSL 'stdout_position' parameter is invalid
117 INVALID_STDOUT_POSITION the RSL 'stdout_position' value is not zero or greater
118 RSL_STDERR_POSITION the provided RSL 'stderr_position' parameter is invalid
119 INVALID_STDERR_POSITION the RSL 'stderr_position' value is not zero or greater
120 RESTART_FAILED the job manager restart attempt failed
121 NO_STATE_FILE the job state file doesn't exist
122 READING_STATE_FILE could not read the job state file
123 WRITING_STATE_FILE could not write the job state file
124 OLD_JM_ALIVE old job manager is still alive
125 TTL_EXPIRED job manager state file TTL expired
126 SUBMIT_UNKNOWN it is unknown if the job was submitted
127 RSL_REMOTE_IO_URL the provided RSL 'remote_io_url' parameter is invalid
128 WRITING_REMOTE_IO_URL could not write the remote io url file
129 STDIO_SIZE the standard output/error size is different
130 JM_STOPPED the job manager was sent a stop signal (job is still running)
131 USER_PROXY_EXPIRED the user proxy expired (job is still running)
132 JOB_UNSUBMITTED the job was not submitted by original jobmanager
133 INVALID_COMMIT the job manager is not waiting for that commit signal
134 RSL_SCHEDULER_SPECIFIC the provided RSL scheduler specific parameter is invalid
135 STAGE_IN_FAILED the job manager could not stage in a file
136 INVALID_SCRATCH the scratch directory could not be created
137 RSL_CACHE the provided 'gass_cache' parameter is invalid
138 INVALID_SUBMIT_ATTRIBUTE the RSL contains attributes which are not valid for job submission
139 INVALID_STDIO_UPDATE_ATTRIBUTE the RSL contains attributes which are not valid for stdio update
140 INVALID_RESTART_ATTRIBUTE the RSL contains attributes which are not valid for job restart
141 RSL_FILE_STAGE_IN the provided RSL 'file_stage_in' parameter is invalid
142 RSL_FILE_STAGE_IN_SHARED the provided RSL 'file_stage_in_shared' parameter is invalid
143 RSL_FILE_STAGE_OUT the provided RSL 'file_stage_out' parameter is invalid
144 RSL_GASS_CACHE the provided RSL 'gass_cache' parameter is invalid
145 RSL_FILE_CLEANUP the provided RSL 'file_cleanup' parameter is invalid
146 RSL_SCRATCH the provided RSL 'scratch_dir' parameter is invalid
147 INVALID_SCHEDULER_SPECIFIC the provided scheduler-specific RSL parameter is invalid
148 UNDEFINED_ATTRIBUTE a required RSL attribute was not defined in the RSL spec
149 INVALID_CACHE the gass_cache attribute points to an invalid cache directory
150 INVALID_SAVE_STATE the provided RSL 'save_state' parameter has an invalid value
151 OPENING_VALIDATION_FILE the job manager could not open the RSL attribute validation file
152 READING_VALIDATION_FILE the job manager could not read the RSL attribute validation file
153 RSL_PROXY_TIMEOUT the provided RSL 'proxy_timeout' is invalid
154 INVALID_PROXY_TIMEOUT the RSL 'proxy_timeout' value is not greater than zero
155 STAGE_OUT_FAILED the job manager could not stage out a file
156 JOB_CONTACT_NOT_FOUND the job contact string does not match any which the job manager is handling
157 DELEGATION_FAILED proxy delegation failed
158 LOCKING_STATE_LOCK_FILE the job manager could not lock the state lock file
159 INVALID_ATTR an invalid globus_io_clientattr_t was used.
160 NULL_PARAMETER a null parameter was passed to the gram library
161 STILL_STREAMING the job manager is still streaming output
162 LAST

GRAM Job States

Value GLOBUS_GRAM_PROTOCOL_JOB_STATE_... Description
1 PENDING The job is waiting for resources to become available to run.
2 ACTIVE The job has received resources and the application is executing.
4 FAILED The job terminated before completion because of an error, user-triggered cancel, or system-triggered cancel.
8 DONE The job completed successfully.
16 SUSPENDED The job has been suspended. Resources which were allocated for this job may have been released due to some scheduler-specific reason.
32 UNSUBMITTED The job has not been submitted to the scheduler yet, pending the reception of the GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_COMMIT_REQUEST signal from a client.
64 STAGE_IN The job manager is staging in files to run the job.
128 STAGE_OUT The job manager is staging out files generated by the job.
0xFFFFF ALL A mask of all job states.
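
The state values are distinct powers of two and ALL is described as a mask, which suggests they are meant to be OR-ed together. A small Python sketch for pulling a mask apart again. GRAM_JOB_STATES and states_in_mask are names I invented for this note, not part of any Globus API.

```python
# GRAM job state bit -> name, from the table above.
GRAM_JOB_STATES = {
    1: "PENDING",
    2: "ACTIVE",
    4: "FAILED",
    8: "DONE",
    16: "SUSPENDED",
    32: "UNSUBMITTED",
    64: "STAGE_IN",
    128: "STAGE_OUT",
}
GRAM_JOB_STATE_ALL = 0xFFFFF


def states_in_mask(mask):
    """Return the names of all listed states whose bit is set in mask, lowest bit first."""
    return [name for bit, name in sorted(GRAM_JOB_STATES.items()) if mask & bit]


# A mask covering the two terminal-looking states.
print(states_in_mask(4 | 8))
# ALL sets every listed bit (plus some bits the table does not name).
print(states_in_mask(GRAM_JOB_STATE_ALL))
```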

GRAM Signals

globus_gram_protocol_job_signal_t
Value GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_... Description
1 CANCEL Cancel a job
2 SUSPEND Suspend a job
3 RESUME Resume a previously suspended job
4 PRIORITY Change the priority of a job
5 COMMIT_REQUEST Signal the job manager to commence with a job submission if the job request was accompanied by the (two_phase=yes) RSL attribute.
6 COMMIT_EXTEND Signal the job manager to wait an additional number of seconds (specified by an integer value string as the signal's argument) before timing out a two-phase job commit.
7 STDIO_UPDATE Signal the job manager to change the way it is currently handling standard output and/or standard error. The argument for this signal is an RSL containing new stdout, stderr, stdout_position, stderr_position, or remote_io_url relations.
8 STDIO_SIZE Signal the job manager to verify that streamed I/O has been completely received. The argument to this signal contains the number of bytes of stdout and stderr received, separated by a space. The reply to this signal will be a SUCCESS message if these matched the amount sent by the job manager. Otherwise, an error reply indicating GLOBUS_GRAM_PROTOCOL_ERROR_STDIO_SIZE is returned. If standard output and standard error are merged, only one number should be sent as an argument to this signal. An argument of -1 for either stream size indicates that the client is not interested in the size of that stream.
9 STOP_MANAGER Signal the job manager to stop managing the current job and terminate. The job continues to run as normal. The job manager will send a state change callback with the job status being FAILED and the error GLOBUS_GRAM_PROTOCOL_ERROR_JM_STOPPED.
10 COMMIT_END Signal the job manager to clean up after the completion of the job if the job RSL contained the (two_phase=yes) relation.
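
The STDIO_SIZE argument format is fiddly enough to be worth writing down as code. This is only a sketch of the string described above (two byte counts separated by a space, one count if the streams are merged, -1 for "don't care"). The function name and its parameters are hypothetical, and it does not send anything to a job manager.

```python
def stdio_size_argument(stdout_bytes, stderr_bytes=None, merged=False):
    """Build the STDIO_SIZE signal argument string as described in the table above.

    Pass None for a stream whose size the client does not care about; it is
    sent as -1. With merged=True only a single number is produced.
    """
    def fmt(count):
        return "-1" if count is None else str(int(count))

    if merged:
        return fmt(stdout_bytes)
    return "{} {}".format(fmt(stdout_bytes), fmt(stderr_bytes))


print(stdio_size_argument(1024, 0))
print(stdio_size_argument(1024))
print(stdio_size_argument(2048, merged=True))
```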

Misc

Well Known Ports

Service            Port
condor negotiator  9614 (obsolete, dynamic in 6.7.x)
condor collector   9618
GT2 gatekeeper     2119
gridftp            2811
GT4 web services   8443