LIGO Support Ticket 2116
Ticket Information
Number: support 2116
User: anderson@ligo.caltech.edu
Email: espinoza_e__AT__ligo.caltech.edu, skoranda__AT__gravity.phys.uwm.edu
Status: resolved
Assigned To: gthain
Date: Tue, 2 Oct 2007 11:35:31 -0700
From: Stuart Anderson <anderson__AT__ligo.caltech.edu>
To: condor-support__AT__cs.wisc.edu
CC: Erik Espinoza <espinoza_e__AT__ligo.caltech.edu>
Subject: LIGO: problems migrating Quill from 6.8.5 to 6.9.4
X-Seen-BY: mailfromd 4.1 lava.cs.wisc.edu
We are trying to migrate the LIGO Condor pool today at Caltech from 6.8.5
running Quill to 6.9.4 running Quill++ and are running into a few problems.
We have a dedicated collector/negotiator machine that also runs postgres
for quill. However, when we try to start condor_dbmsd on that machine
(by adding it to the DAEMON_LIST--is this the right place to run this?)
we get the following error:
10/2 10:11:02 ******************************************************
10/2 10:11:02 ** condor_dbmsd (CONDOR_DBMSD) STARTING UP
10/2 10:11:02 ** /usr/sbin/condor_dbmsd
10/2 10:11:02 ** $CondorVersion: 6.9.4 Aug 30 2007 $
10/2 10:11:02 ** $CondorPlatform: X86_64-LINUX_RHEL3 $
10/2 10:11:02 ** PID = 32642
10/2 10:11:02 ** Log last touched 10/2 10:10:51
10/2 10:11:02 ******************************************************
10/2 10:11:02 Using config source: /usr1/condor/condor_config
10/2 10:11:02 Using local config sources:
10/2 10:11:02 /usr1/condor/condor_config.local
10/2 10:11:02 DaemonCore: Command Socket at <10.14.0.25:59849>
10/2 10:11:02 main_init() called
10/2 10:11:02 ERROR "Assertion ERROR on (jobQueueDBUser)" at line 137 in file dbms_utils.C
Looking at the source code it appears that this indicates that QUILL_DB_USER
is not set, however, we did not have that set in 6.8 and I can find no
mention of this variable in the 6.9 manual. Assuming we are starting
condor_dbmsd on the right machine, what should this variable be set to?
Here are our current QUILL settings on the central
collector/negotiator/postgres machine:
QUILL = $(SBIN)/condor_quill
QUILL_ARGS = -f
QUILL_LOG = $(LOG)/QuillLog
QUILL_ENABLED = TRUE
QUILL_NAME = citquill@ligo
QUILL_DB_NAME = citquill_db
QUILL_DB_IP_ADDR = 10.14.0.25:5432
QUILL_POLLING_PERIOD = 10
QUILL_HISTORY_CLEANING_INTERVAL = 1
QUILL_HISTORY_DURATION = 7
QUILL_MANAGE_VACUUM = FALSE
QUILL_IS_REMOTELY_QUERYABLE = TRUE
QUILL_DB_QUERY_PASSWORD = quillro
QUILL_ADDRESS_FILE = $(LOG)/.quill_address
DBMSD = $(SBIN)/condor_dbmsd
DBMSD_ARGS = -f
DBMSD_LOG = $(LOG)/DbmsdLog
Thanks.
--
Stuart Anderson anderson__AT__ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
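
The message above traces the assertion failure to QUILL_DB_USER being unset. As a rough illustration only, the sketch below scans Condor-style "NAME = value" text for variables that are missing or empty. The helper names (parse_config, missing_settings) and the sample text are hypothetical, and the parser is an assumption-laden simplification: it does not expand macros such as $(SBIN), follow local config files, or handle line continuations. It is not how condor_dbmsd itself reads its configuration.

```python
# Minimal sketch: report settings that are absent or empty in config text.
# parse_config and missing_settings are hypothetical helpers for illustration.

def parse_config(text):
    """Collect simple 'NAME = value' pairs, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:  # only keep lines that actually contain '='
            settings[name.strip()] = value.strip()
    return settings


def missing_settings(settings, required):
    """Return the required names that are undefined or set to an empty value."""
    return [name for name in required if not settings.get(name)]


# A few of the settings quoted in the ticket; QUILL_DB_USER is deliberately absent.
SAMPLE = """\
QUILL_NAME = citquill@ligo
QUILL_DB_NAME = citquill_db
QUILL_DB_IP_ADDR = 10.14.0.25:5432
DBMSD = $(SBIN)/condor_dbmsd
"""

# Only QUILL_DB_USER is tied to the assertion in the log; extend as needed.
REQUIRED = ["QUILL_DB_USER"]

if __name__ == "__main__":
    for name in missing_settings(parse_config(SAMPLE), REQUIRED):
        print("not set:", name)
```

Running a check like this against the real condor_config and condor_config.local before starting the daemon would flag the unset variable without waiting for the assertion in the log.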
===========================================================================
Date of creation: Tue Oct 2 13:35:51 2007 (1191350154)
Subject: Actions
Assigned to burnett by burnett
===========================================================================
Date of actions: Tue Oct 2 16:53:35 2007 (1191362015)
Subject: Actions
Assigned to gthain by burnett
===========================================================================
Date of actions: Tue Oct 2 17:01:10 2007 (1191362470)
Date: Tue, 02 Oct 2007 17:08:48 -0500
From: Greg Thain <gthain__AT__cs.wisc.edu>
To: condor-support__AT__cs.wisc.edu
Subject: Re: [condor-support #2116] LIGO: problems migrating Quill from
6.8.5 to 6.9.4
> We are trying to migrate the LIGO Condor pool today at Caltech from 6.8.5
> running Quill to 6.9.4 running Quill++ and are running into a few problems.
Stuart:
Unfortunately, the documentation is a little behind the code. I've
updated the documentation that's linked on the website. There's also a
paper which discusses installation at:
https://www.cs.wisc.edu/condordb/overview_05-02-2007.pdf
-Greg
===========================================================================
Date mail was appended: Tue Oct 2 17:09:05 2007 (1191362947)
Date: Wed, 3 Oct 2007 13:45:40 -0700
From: Stuart Anderson <anderson__AT__ligo.caltech.edu>
To: condor-support response tracking system <condor-support__AT__cs.wisc.edu>
CC: espinoza_e__AT__ligo.caltech.edu, Scott Koranda <skoranda__AT__gravity.phys.uwm.edu>
Subject: Re: [condor-support #2116] LIGO: problems migrating Quill from
6.8.5 to 6.9.4
X-Seen-BY: mailfromd 4.1 granite.cs.wisc.edu
This may now be moot since our initial observations of the 6.9.4 schedd
performance are such that it can easily handle the condor_q load,
which was our main reason for running Quill in the first place.
Please feel free to close this ticket, and we will re-open a new ticket
if we determine at a later date that we need to run Quill++ and the
referenced documentation is insufficient to figure it out.
Thanks.
On Tue, Oct 02, 2007 at 05:09:05PM -0500, condor-support response tracking system wrote:
>
> > We are trying to migrate the LIGO Condor pool today at Caltech from 6.8.5
> > running Quill to 6.9.4 running Quill++ and are running into a few problems.
>
> Stuart:
>
> Unfortunately, the documentation is a little behind the code. I've
> updated the documentation that's linked on the website. There's also a
> paper which discusses installation at:
>
>
> https://www.cs.wisc.edu/condordb/overview_05-02-2007.pdf
>
> -Greg
>
>
--
Stuart Anderson anderson__AT__ligo.caltech.edu
http://www.ligo.caltech.edu/~anderson
===========================================================================
Date mail was appended: Wed Oct 3 15:45:59 2007 (1191444359)
Subject: Actions
Ticket resolved by gthain
===========================================================================
Date of actions: Wed Oct 3 16:55:06 2007 (1191448507)
Subject: Actions
Ticket was reopened by mailnull
===========================================================================
Date of actions: Wed Oct 3 16:55:15 2007 (1191448516)
Date: Wed, 03 Oct 2007 16:54:58 -0500
From: Greg Thain <gthain__AT__cs.wisc.edu>
To: condor-support__AT__cs.wisc.edu
Subject: Re: [condor-support #2116] LIGO: problems migrating Quill from
6.8.5 to 6.9.4
condor-support response tracking system wrote:
> This may now be moot since our initial observations of the 6.9.4 schedd
> performance are such that it can easily handle the condor_q load,
> which was our main reason for running Quill in the first place.
Thanks. We intend to continue our work on performance improvements in
the schedd, so if there are particular areas you are concerned with, we
would be very interested in hearing what those are.
Thanks,
-Greg
===========================================================================
Date mail was appended: Wed Oct 3 16:55:15 2007 (1191448516)
Subject: Actions
Ticket resolved by gthain
===========================================================================
Date of actions: Mon Oct 15 16:12:23 2007 (1192482744)