LIGO Support Ticket 15836

Ticket Information
  Number:      admin 15836
  User:        dbrown@ligo.caltech.edu
  Email:       anderson__AT__ligo.caltech.edu,espinoza__AT__ligo.caltech.edu,skoranda__AT__gravity.phys.uwm.edu,fairhurst_s__AT__ligo.caltech.edu
  Status:      feature
  Assigned To: wenger
CC: Stuart Anderson <anderson__AT__ligo.caltech.edu>,
 Erik Espinoza <espinoza__AT__ligo.caltech.edu>,
 Scott Koranda <skoranda__AT__gravity.phys.uwm.edu>,
 Steve Fairhurst <fairhurst_s__AT__ligo.caltech.edu>
From: Duncan Brown <dbrown__AT__ligo.caltech.edu>
Subject: LIGO: enhancement request to dagman
Date: Wed, 11 Jul 2007 15:37:46 -0700
To: condor-admin response tracking system <condor-admin__AT__cs.wisc.edu>

Hi,

After discussion with some of our heavy DAGman users in LIGO, we
would like to request the following feature enhancement to DAGman. We
would like the ability to merge dags at the node level, so that
subdags can be merged into a parent dag without needing to run
separate dagman processes for each subdag.

One suggestion for doing this is to add an additional keyword, DAG, to
the dag file syntax. It would act like the current JOB keyword, but
take a dag file rather than a condor submit file. When DAGman parses a
dag file containing a DAG keyword, it would replace that node with the
nodes in the dag that the keyword specifies. This would allow subdags
to be merged into a parent dag and have a single dagman process
execute the workflow.

For example, I create a small workflow called subdag.dag as follows:

JOB a1 true.sub
JOB a2 true.sub
PARENT a1 CHILD a2

and another workflow called mydag.dag which uses the DAG keyword:

JOB b1 true.sub
DAG b2 subdag.dag
JOB b3 true.sub
PARENT b1 CHILD b2
PARENT b2 CHILD b3

Execution of mydag.dag would be equivalent to running the single dag:

JOB b1 true.sub
JOB a1 true.sub
JOB a2 true.sub
JOB b3 true.sub
PARENT b1 CHILD a1
PARENT a1 CHILD a2
PARENT a2 CHILD b3

The subdag may have multiple 'root' nodes and several bottom-level
children. In this case, we suggest the behavior should be that all
the root nodes are made children of the nodes that the DAG keyword is
a child of. All nodes that are children of the DAG keyword should be
children of the bottom-level nodes in the inserted DAG.
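
The merge rule above, including the multiple-root case, can be sketched
in a few lines of Python. This is only an illustration of the requested
behavior, not DAGman code: parse_dag, merge_dag and the dag_files
mapping are hypothetical names, only JOB, DAG and PARENT ... CHILD lines
are handled, and node names are assumed to be unique across all dags.

```python
def parse_dag(text):
    """Parse JOB, DAG and PARENT ... CHILD lines into (nodes, edges)."""
    nodes = {}  # node name -> (keyword, file name), kept in file order
    edges = []  # (parent, child) pairs
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] in ("JOB", "DAG"):
            nodes[words[1]] = (words[0], words[2])
        elif words[0] == "PARENT":
            split = words.index("CHILD")
            edges += [(p, c) for p in words[1:split] for c in words[split + 1:]]
    return nodes, edges


def merge_dag(text, dag_files):
    """Expand every DAG node in `text` into the nodes of the dag it names.

    `dag_files` maps file name -> file contents (a stand-in for reading disk).
    Assumption: node names are unique across all of the dags involved.
    """
    nodes, edges = parse_dag(text)
    merged = {}
    for name, (keyword, path) in nodes.items():
        if keyword == "JOB":
            merged[name] = (keyword, path)
            continue
        # Recurse so that a subdag may itself contain DAG nodes.
        sub_nodes, sub_edges = merge_dag(dag_files[path], dag_files)
        children = {c for _, c in sub_edges}
        parents = {p for p, _ in sub_edges}
        roots = [n for n in sub_nodes if n not in children]
        leaves = [n for n in sub_nodes if n not in parents]
        merged.update(sub_nodes)
        rewired = []
        for parent, child in edges:
            if child == name:     # edge into the DAG node: attach every root
                rewired += [(parent, root) for root in roots]
            elif parent == name:  # edge out of the DAG node: attach every leaf
                rewired += [(leaf, child) for leaf in leaves]
            else:
                rewired.append((parent, child))
        edges = rewired + sub_edges
    return merged, edges


dag_files = {"subdag.dag": "JOB a1 true.sub\nJOB a2 true.sub\nPARENT a1 CHILD a2\n"}
mydag = (
    "JOB b1 true.sub\n"
    "DAG b2 subdag.dag\n"
    "JOB b3 true.sub\n"
    "PARENT b1 CHILD b2\n"
    "PARENT b2 CHILD b3\n"
)
nodes, edges = merge_dag(mydag, dag_files)
print(list(nodes))
print(sorted(edges))
```

For the mydag.dag example this produces the nodes b1, a1, a2, b3 and the
same three dependencies as the single merged dag listed above. A subdag
with two roots and two bottom-level nodes would get one edge from b1 to
each root and one edge from each bottom-level node to b3.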

If the dag fails to execute, the rescue dag written out should be the  
merged dag, for simplicity.
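
As a rough Python illustration of what such a rescue dag could contain,
the sketch below writes out an already-merged dag and marks finished
nodes. It assumes completed nodes are flagged with DONE on their JOB
line, as in the rescue dags DAGman already writes; write_rescue_dag and
its arguments are hypothetical names, not part of DAGman.

```python
def write_rescue_dag(nodes, edges, done):
    """Return rescue-dag text for a merged dag.

    nodes: node name -> submit file, in the order they should be written
    edges: (parent, child) pairs of the merged dag
    done:  names of nodes that had already completed successfully
    """
    lines = []
    for name, submit_file in nodes.items():
        # Assumption: a finished node is marked by appending DONE.
        suffix = " DONE" if name in done else ""
        lines.append(f"JOB {name} {submit_file}{suffix}")
    for parent, child in edges:
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


# The merged dag from the example above, with b1 and a1 finished
# and a2 having failed.
merged_nodes = {"b1": "true.sub", "a1": "true.sub", "a2": "true.sub", "b3": "true.sub"}
merged_edges = [("b1", "a1"), ("a1", "a2"), ("a2", "b3")]
print(write_rescue_dag(merged_nodes, merged_edges, done={"b1", "a1"}))
```

Because the rescue dag contains only JOB nodes, resubmitting it would
not need to re-expand any DAG keyword.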

We'd be happy to discuss this further at the ligo-condor meetings.

Cheers,
Duncan.

-- 

Duncan Brown                          California Institute of Technology
Tapir: (626) 395 8409                  MS 18-34, Pasadena, CA 91125, USA
LIGO:  (626) 395 8812          http://www.lsc-group.phys.uwm.edu/~duncan



===========================================================================
Date of creation: Wed Jul 11 17:37:28 2007 (1184193451)
Subject: Actions

Assigned to wenger by jfrey
===========================================================================
Date of actions: Thu Jul 12 11:16:51 2007 (1184257012)
Date: Tue, 17 Jul 2007 14:42:33 -0500 (CDT)
From: "R. Kent Wenger" <wenger__AT__cs.wisc.edu>
To: jfrey <condor-admin__AT__cs.wisc.edu>
CC: "R. Kent Wenger" <wenger__AT__cs.wisc.edu>
Subject: Re: [condor-admin #15836] LIGO: enhancement request to dagman

Duncan,

I just got back from vacation today, and found that this ticket has been
assigned to me.

I'll think about it and get back to you in a few days -- I'm still going
thru email checking for emergencies, etc.

Kent Wenger
Condor Team



===========================================================================
Date mail was appended: Tue Jul 17 14:42:38 2007 (1184701358)
CC: anderson__AT__ligo.caltech.edu, espinoza__AT__ligo.caltech.edu,
 skoranda__AT__gravity.phys.uwm.edu, fairhurst_s__AT__ligo.caltech.edu
From: Duncan Brown <dbrown__AT__ligo.caltech.edu>
Subject: Re: [condor-admin #15836] LIGO: enhancement request to dagman
Date: Wed, 18 Jul 2007 18:56:11 -0700
To: condor-admin__AT__cs.wisc.edu

Hi Kent,

Great, thanks. I'll be on vacation for two weeks starting on  
Saturday, so please copy Steve Fairhurst on your replies. He should  
be able to answer any questions that you have.

Cheers,
Duncan.

-- 

Duncan Brown                          California Institute of Technology
Tapir: (626) 395 8409                  MS 18-34, Pasadena, CA 91125, USA
LIGO:  (626) 395 8812          http://www.lsc-group.phys.uwm.edu/~duncan



===========================================================================
Date mail was appended: Wed Jul 18 20:55:47 2007 (1184810147)
Date: Fri, 27 Jul 2007 10:50:50 -0500 (CDT)
From: "R. Kent Wenger" <wenger__AT__cs.wisc.edu>
To: jfrey <condor-admin__AT__cs.wisc.edu>
CC: "R. Kent Wenger" <wenger__AT__cs.wisc.edu>,
 Steve Fairhurst <fairhurst_s__AT__ligo.caltech.edu>
Subject: Re: [condor-admin #15836] LIGO: enhancement request to dagman

Hmm -- this is an interesting idea.

It actually doesn't sound too hard, either.  About the only tricky part,
I think, is getting all of the dependencies right.

I've created an entry for this in our problem tracking system -- it's 
#867.

I guess we can talk about relative priorities in the next phone meeting.

Kent Wenger
Condor Team

===========================================================================
Date mail was appended: Fri Jul 27 10:50:52 2007 (1185551452)
Subject: Actions

Status changed from open to feature by wenger
===========================================================================
Date of actions: Thu Aug 23 11:13:42 2007 (1187885622)