LIGO Support Ticket 18026

Ticket Information
  Number:      admin 18026
  User:        skoranda@gravity.phys.uwm.edu
  Email:       anderson__AT__ligo.caltech.edu
  Status:      open
  Assigned To: tannenba
Date: Wed, 21 May 2008 18:07:01 -0500
From: Scott Koranda <skoranda__AT__gravity.phys.uwm.edu>
To: condor-admin__AT__cs.wisc.edu
CC: Stuart Anderson <anderson__AT__ligo.caltech.edu>
Subject: LIGO: condor_dagman overwrites X509_CERT_DIR
X-Seen-BY: mailfromd 4.1 granite.cs.wisc.edu

Hi,

We just determined by experiment that condor_dagman from
Condor 7.0.1 will take the value for

GSI_DAEMON_TRUSTED_CA_DIR

if set in the condor_config and assign that value to

X509_CERT_DIR in the environment of jobs that run in the local
universe, even if X509_CERT_DIR is already set in the
environment of the job, for example if the job uses

getenv = True

Question: Is this the expected/designed behavior? If so, why is this
a good thing?

Thanks,

Scott

P.S. Also, what is the expected/designed behavior for
condor_dagman and the X509_USER_PROXY environment variable? We
think that it might also be reset/removed when jobs are run
under DAGman.
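
One way to reproduce the observation above is to run a small diagnostic as the local universe job and compare its output with the value set in the submitting shell. The script below is a minimal sketch of such a job, not something taken from this report: the function name report_gsi_env is hypothetical, and the script only reads the two variables named in this message.

```python
import os

# Variables named in this report; the job only reads them and sets nothing.
GSI_VARS = ("X509_CERT_DIR", "X509_USER_PROXY")


def report_gsi_env(environ=None):
    """Return one line per variable: 'NAME=value' or 'NAME is unset'.

    report_gsi_env is a hypothetical helper name used for illustration.
    """
    if environ is None:
        environ = os.environ
    lines = []
    for name in GSI_VARS:
        if name in environ:
            lines.append(f"{name}={environ[name]}")
        else:
            lines.append(f"{name} is unset")
    return lines


if __name__ == "__main__":
    # Printed to the job's stdout, so it ends up in the job's output file.
    print("\n".join(report_gsi_env()))
```

Running the same script once directly from the shell and once as a job under DAGMan should show whether the two values differ.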

===========================================================================
Date of creation: Wed May 21 18:04:10 2008 (1211411052)
Date: Wed, 21 May 2008 18:15:39 -0500
From: Scott Koranda <skoranda__AT__gravity.phys.uwm.edu>
To: condor-admin__AT__cs.wisc.edu
Subject: Re: [condor-admin #18026] LIGO: condor_dagman overwrites
	X509_CERT_DIR
X-Seen-BY: mailfromd 4.1 obsidian.cs.wisc.edu

Hi,

I submitted this same report twice. The first time I forgot to
put "LIGO:" in the subject.

Please delete/close that one but keep this one (with LIGO: in
the subject) open.

Sorry for the trouble,

Scott


===========================================================================
Date mail was appended: Wed May 21 18:12:45 2008 (1211411565)
Subject: Actions

Assigned to tannenba by gthain
===========================================================================
Date of actions: Thu May 22 11:37:45 2008 (1211474265)
Date: Mon, 13 Oct 2008 15:15:10 -0500
From: Todd Tannenbaum <tannenba__AT__cs.wisc.edu>
To: condor-admin__AT__cs.wisc.edu
Subject: Re: [condor-admin #18026] LIGO: condor_dagman overwrites
 X509_CERT_DIR

> 
> Hi,
> 
> We just determined by experiment that condor_dagman from
> Condor 7.0.1 will take the value for
> 
> GSI_DAEMON_TRUSTED_CA_DIR
> 
> if set in the condor_config and assign that value to
> 
> X509_CERT_DIR in the environment of jobs that run in the local
> universe, even if X509_CERT_DIR is already set in the
> environment of the job, for example if the job uses
> 
> getenv = True
> 
> Question: Is this the expected/designed behavior? 

Unfortunately, yes, the above behavior is currently expected.

It is not how it really should be, however.

I propose the following:

1) In the immediate timeframe, you can disable the above behavior via 
the condor_config file.  I believe it is enough to put the following 
into your config file:
     DAGMAN.GSI_DAEMON_TRUSTED_CA_DIR =
Note there is nothing after the equals sign.  The idea here is that DAGMan 
will only overwrite GSI settings in the environment if it sees GSI 
settings in the condor_config file.  By putting a line like the above in 
condor_config for each GSI setting you may have specified, you are 
essentially "clearing" that setting out for DAGMan.

2) In the longer term, I want to change DAGMan so that it stores a 
snapshot of its environment on startup.  Then when DAGMan spawns 
condor_submit, it temporarily restores that environment snapshot for the 
duration of the condor_submit.  Since condor_submit_dag does 
"getenv=true", that means that if jobs submitted by DAGMan include 
getenv=true in their submit files, these jobs will see the same 
environment that condor_submit_dag had.
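
The snapshot-and-restore idea in item 2 can be illustrated with a short Python sketch. This is only a model of the approach under stated assumptions, not DAGMan's actual code: the names STARTUP_ENV, restored_environment, and spawn_with_snapshot are hypothetical, and the child process stands in for condor_submit.

```python
import os
import subprocess
import sys
from contextlib import contextmanager

# Hypothetical: captured once at startup, before any config-driven
# changes are applied to the process environment.
STARTUP_ENV = dict(os.environ)


@contextmanager
def restored_environment(snapshot):
    """Temporarily make os.environ equal to snapshot, then put it back."""
    current = dict(os.environ)
    os.environ.clear()
    os.environ.update(snapshot)
    try:
        yield
    finally:
        # Restore whatever the environment was before entering the block.
        os.environ.clear()
        os.environ.update(current)


def spawn_with_snapshot(argv, snapshot=STARTUP_ENV):
    """Run a child (standing in for condor_submit) under the snapshot."""
    with restored_environment(snapshot):
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Simulate a later overwrite of the variable inside the parent process.
    os.environ["X509_CERT_DIR"] = "/value/from/config"
    child = [sys.executable, "-c",
             "import os; print(os.environ.get('X509_CERT_DIR'))"]
    # The child sees the startup value (or None if it was unset at startup).
    print(spawn_with_snapshot(child).strip())
```

In this model the child inherits the startup environment only for the duration of the spawn, and the parent keeps its own modified settings afterwards.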

Does the above sound good?  Is item #1 an acceptable immediate 
workaround, and does item #2 sound good for the long-term from LIGO's 
perspective?  Any input on a timeline for #2 from LIGO's wants/needs 
perspective?

Many thanks for raising this issue with us; it is great having 
thoughtful and engaged users like LIGO,

Todd

-- 
Todd Tannenbaum                       University of Wisconsin-Madison
Condor Project Research               Department of Computer Sciences
tannenba__AT__cs.wisc.edu                  1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                 Madison, WI 53706-1685

===========================================================================
Date mail was appended: Mon Oct 13 15:15:20 2008 (1223928921)
Subject: Actions

Status changed from open to pending by tannenba
===========================================================================
Date of actions: Mon Oct 13 15:15:20 2008 (1223928922)
Date: Fri, 30 Jan 2009 13:05:58 -0600
From: Todd Tannenbaum <tannenba__AT__cs.wisc.edu>
To: condor-admin__AT__cs.wisc.edu
Subject: Re: [condor-admin #18026] LIGO: condor_dagman overwrites
 X509_CERT_DIR

> 
> 2) In the longer term, I want to change DAGMan so that it stores a 
> snapshot of its environment on startup.  Then when DAGMan spawns 
> condor_submit, it temporarily restores that environment snapshot for the 
> duration of the condor_submit.  Since condor_submit_dag does 
> "getenv=true", that means that if jobs submitted by DAGMan include 
> getenv=true in their submit files, these jobs will see the same 
> environment that condor_submit_dag had.
> 

Hi -  So the time for the longer term has come.  We implemented the 
above change, and it will be released starting with v7.3.0.

Thanks,
Todd


===========================================================================
Date mail was appended: Fri Jan 30 13:06:14 2009 (1233342375)
Subject: Actions

Ticket resolved by tannenba
===========================================================================
Date of actions: Fri Jan 30 13:06:14 2009 (1233342377)
Subject: Actions

Ticket reopened by wenger
===========================================================================
Date of actions: Fri Feb 20  9:18:45 2009 (1235143125)
Subject: Actions

Ticket was reopened by wenger
===========================================================================
Date of actions: Fri Feb 20  9:18:45 2009 (1235143125)
Subject: Comments added

When Scott comes in for his monthly visits, we should go over this on a
whiteboard and settle the semantics of it for real.

Comments added by psilord

===========================================================================
Date comments were added: Fri Sep 25 13:30:33 2009 (1253903433)