next up previous contents index
Next: 3.12 Setting Up for Up: 3. Administrators' Manual Previous: 3.10 The High Availability   Contents   Index

Subsections


3.11 Quill

Quill builds and maintains a mirror database of a Condor job queue. The condor_ quill daemon implements it, and the condor_ q and condor_ history tools use it.


3.11.1 Installation and Configuration

Quill uses the PostgreSQL database management system. Quill uses the PostgreSQL server as its back end and client library, libpq to talk to the server. Quill works with PostgreSQL version 8.0; it has also been tested with version 7.4.

Obtain PostgreSQL from

http://www.postgresql.org/ftp/source/

Installation instructions are detailed in: http://www.postgresql.org/docs/8.0/static/installation.html

Configure PostgreSQL after installation:

  1. The condor_ quill daemon and client tools connect to the database as users ``quillreader'' and ``quillwriter''. These are database users, not operating system users. The two types of users are quite different from each other. If these data base users do not exist, add them using the createuser command supplied with the installation. Assign them with appropriate passwords; these passwords will be used by the Quill tools to connect to the database in a secure way. User ``quillreader'' should not be allowed to create more databases nor create more users. User ``quillwriter'' should not be allowed to create more users, however it should be allowed to create more databases. The following commands create the two users with the appropriate permissions, and be ready to enter the corresponding passwords when prompted.

    /path/to/postgreSQL/bin/directory/createuser quillreader \
    	--no-createdb --no-adduser --pwprompt
    
    /path/to/postgreSQL/bin/directory/createuser quillwriter \
    	--createdb --no-adduser --pwprompt
    

  2. Configure to accept TCP/IP connections. For PostgreSQL version 8, use the listen_addresses variable in postgresql.conf file as a guide. For example, listen_addresses = '*' means listen on any IP interface. In PostgreSQL version 7, this was accomplished by setting tcpip_socket=true in the postgresql.conf file.

  3. Configure PostgreSQL to accept TCP/IP connections from specific hosts. Modify the pg_hba.conf file (which usually resides in the PostgreSQL server's data directory). Access is required by the condor_ quill daemon, as well as the database users ``quillreader'' and ``quillwriter''. For example, to give database users ``quillreader'' and ``quillwriter'' password-enabled access to all databases on current machine from any other machine in the network, add the following:

    host all quillreader 128.105.0.0 255.255.0.0 password
    host all quillwriter 128.105.0.0 255.255.0.0 password

    Note that in addition to the database specified by the configuration variable QUILL_DB_NAME, the condor_ quill daemon also needs access to the database "template1". In order to create the database in the first place, the condor_ quill daemon needs to connect to the database.

  4. The condor_ quill daemon needs read and write access to the database. It connects as user ``quillwriter'', who has owner privileges to the database. Since this gives all access to the ``quillwriter'' user, this password cannot be stored in a public place (such as the condor_ collector). For this reason, the ``quillwriter'' password is stored in a file named .quillwritepassword in the Condor spool directory. Appropriate protections on this file guarantee secure access to the database. This file must be created and protected by the site administrator; if this file does not exist as and where expected, the condor_ quill daemon logs an error and exits.

Condor must also be configured to use Quill.

Add QUILL to the variable DAEMON_LIST.
Add the file .quillwritepassword to the VALID_SPOOL_FILES variable, since condor_ preen must be told not to delete this file.
Tell Condor where Quill resides, as well as specify Quill's start-up arguments:
QUILL = $(SBIN)/condor_quill
QUILL_ARGS = -f
Identify Quill's log file:
QUILL_LOG = $(LOG)/QuillLog
Set up all other configuration variables that are specific to the installation.
QUILL_ENABLED           = TRUE
QUILL_NAME              = some-unique-quill-name.cs.wisc.edu
QUILL_DB_NAME           = database-for-some-unique-quill-name
QUILL_DB_IP_ADDR        = databaseipaddress:port
# the following parameter's units is in seconds
QUILL_POLLING_PERIOD    = 10
# the following parameter's units is in hours
QUILL_HISTORY_CLEANING_INTERVAL = 24
# the following parameter's units is in days
QUILL_HISTORY_DURATION 	= 30
QUILL_IS_REMOTELY_QUERYABLE = TRUE
QUILL_DB_QUERY_PASSWORD =  password-for-database-user-quillreader
QUILL_ADDRESS_FILE      = $(LOG)/.quill_address

Descriptions of these and other configuration variables are in section 3.3.23. Here are further brief details:


3.11.2 Four Usage Examples

  1. Query a remote Quill daemon on regular.cs.wisc.edu for all the jobs in the queue
    	condor_q -name quill@regular.cs.wisc.edu
    	condor_q -name schedd@regular.cs.wisc.edu
    
    There are two ways to get to a Quill daemon: directly using its name as specified in the QUILL_NAME configuration variable, or indirectly by querying the condor_ schedd daemon using its name. In the latter case, condor_ q will detect if that condor_ schedd daemon is being serviced by a database, and if so, directly query it. In both cases, the IP address and port of the database server hosting the data of this particular remote Quill daemon can be figured out by the QUILL_DB_IP_ADDR and QUILL_DB_NAME variables specified in the QUILL_AD sent by the quill daemon to the collector and in the SCHEDD_AD sent by the condor_ schedd daemon.

  2. Query a remote Quill daemon on regular.cs.wisc.edu for all historical jobs belonging to owner einstein.
    	condor_history -name quill@regular.cs.wisc.edu einstein
    

  3. Query the local Quill daemon for the average time spent in the queue for all non-completed jobs.
    	condor_q -avgqueuetime
    
    The averate queue time is defined as the average of (currenttime - jobsubmissiontime) over all jobs which are neither completed (JobStatus == 4) or removed (JobStatus == 3).

  4. Query the local Quill daemon for all historical jobs completed since Apr 1, 2005 at 13h 00m.
    	condor_history -completedsince '04/01/2005 13:00'
    
    It fetches all jobs which got into the 'Completed' state on or after the specified time stamp. It use the PostgreSQL date/time syntax rules, as it encompasses most format options. See http://www.postgresql.org/docs/8.0/static/datatype-datetime.html#AEN4516 for the various time stamp formats.


3.11.3 Quill and Its RDBMS Schema

With only 7 tables and 2 views, Quill uses a relatively simple database schema. These can be broadly divided into tables used to store job queue information and those used to store historical information.

The job queue part of the schema closely follows Condor's ClassAd data model. For example, each row in these tables describe an <attribute,value> pair of the classad. Additionally, just as how Condor distinguishes a ClusterAd from a ProcAd where the former stores attributes common to all jobs within a cluster whereas the latter stores attributes specific to each job, the schema also makes this distinction. Finally, numerical and string valued attributes are stored separately.

Thus, there are four tables:

In addition to the <attribute, value>, each row contains the cluster-id (cid) and in the case of the ProcAd tables, also the proc-id (pid).

Since each ClassAd would be split into potentially two tables (string and numeric), there are views that unify them into a single entity in order to simplify queries.

Here are the view definitions:

Finally, the job queue part of the schema also contains a table that stores metadata information related to the job_queue.log file.

At all times, there is only 1 row in this table and it describes information related to the last time Quill polled the job_queue.log file.

The historical information on the other hand is slightly differently designed. Instead of a purely vertical data model (each row is a <attribute,value> pair), we have two tables that together represent the complete job classad. Their schema is as follows:

  1. History_Horizontal (cid int, pid int, EnteredHistoryTable timestamp with time zone, Owner text, QDate int, RemoteWallClockTime int, RemoteUserCpu float, RemoteSysCpu float, ImageSize int, JobStatus int, JobPrio int, Cmd text, CompletionDate int, LastRemoteHost text, primary key(cid,pid))

  2. History_Vertical (cid int, pid int, attr text, val text, primary key (cid, pid, attr))

Each historical job ad is divided into its horizontal and vertical counterparts. This division was made because of query performance reasons. While its easier to store ClassAds in a vertical table, queries on vertical tables generally perform worse than those on horizontal tables since the latter has lot fewer records. However, in Condor, since job ads do not have a fixed schema (users can define their own attributes), a purely horizontal schema would end up having a lot of null values. As such, we have a hybrid schema where attributes on which queries are frequently performed (via condor_ history) are put in the History_Horizontal table and the other attributes are stored vertically (just as in the Cluster/Proc tables above) in the History_Vertical table. Also History_Horizontal contains all the attributes needed to service the short form of the condor_ history command (that is, without the -l option).

The resulting hybrid schema has proven to be the most efficient in servicing condor_ history queries. The job queue tables (Cluster and Proc) were not designed in this hybrid manner because job queues aren't as large as history; just a vertical schema worked great.


3.11.4 Quill and Security

There are several layers of security in Quill, some provided by Condor and others provided by the database. First, all accesses to the database are password-protected.

  1. The query tools, condor_ q and condor_ history connect to the database as user ``quillreader''. The password for this user can vary from one database to another and as such, each Quill daemon advertises this password to the collector. The query tools then obtain this password from the collector and connect successfully to the database. Access to the database by the ``quillreader'' user is read-only, as this is sufficient for the query tools. The condor_ quill daemon ensures this protected access using the sql GRANT command when it first creates the tables in the database. Note that access to the ``quillreader'' password itself can be blocked by blocking access to the collector, a feature already supported in Condor.

  2. The condor_ quill daemon, on the other hand, needs read and write access to the database. As such, it connects as user ``quillwriter'', who has owner privileges to the database. Since this gives all access to the ``quillwriter'' user, this password cannot be stored in a public place (such as the collector). For this reason, the ``quillwriter'' password is stored in a file called .quillwritepassword in the Condor spool directory. Appropriate protections on this file guarantee secure access to the database. This file must be created and protected by the site administrator; if this file does not exist as and where expected, the condor_ quill daemon logs an error and exits.

  3. The IsRemotelyQueryable attribute in the Quill ClassAd advertised by the Quill daemon to the collector can be used by site administrators to disallow the database from being read by all remote Condor query tools.


next up previous contents index
Next: 3.12 Setting Up for Up: 3. Administrators' Manual Previous: 3.10 The High Availability   Contents   Index
condor-admin@cs.wisc.edu