LIGO Support Ticket 17983

Ticket Information
  Number:      admin 17983
  User:        dabrown@physics.syr.edu
  Email:       anderson__AT__ligo.caltech.edu,skoranda__AT__gravity.phys.uwm.edu
  Status:      resolved
  Assigned To: jfrey
CC: Stuart Anderson <anderson__AT__ligo.caltech.edu>,        Scott Koranda
 <skoranda__AT__gravity.phys.uwm.edu>
From: Duncan Brown <dabrown__AT__physics.syr.edu>
Subject: LIGO: stderr not captured in 7.0.1 standard universe
Date: Thu, 8 May 2008 09:56:51 -0400
To: condor-admin response tracking system <condor-admin__AT__cs.wisc.edu>
X-Scanner: InterScan AntiVirus for Sendmail
X-Seen-BY: mailfromd 4.1 gypsum.cs.wisc.edu

Hi all,

I have been running a DAG containing standard universe jobs which  
fail due to an error in the command line arguments. The jobs exit  
with return code 1 and dagman correctly captures them as failures.  
The stderr files are written, but are all empty. They should contain  
the error message from the code:

-rw-r--r-- 1 dbrown sufaculty  0 May  7 16:25 inspiral-817566800-817568848-29160-0.err
-rw-r--r-- 1 dbrown sufaculty  0 May  7 16:25 inspiral-817566800-817568848-29160-0.out
-rw-r--r-- 1 dbrown sufaculty  0 May  7 16:30 inspiral-817568720-817570768-29175-0.err
-rw-r--r-- 1 dbrown sufaculty  0 May  7 16:30 inspiral-817568720-817570768-29175-0.out
-rw-r--r-- 1 dbrown sufaculty  0 May  7 16:25 inspiral-817570640-817572688-29161-0.err
-rw-r--r-- 1 dbrown sufaculty  0 May  7 16:25 inspiral-817570640-817572688-29161-0.out
-rw-r--r-- 1 dbrown sufaculty  0 May  7 16:25 inspiral-817572560-817574608-29163-0.err

When I run the code on the command line, I see the error message  
correctly:

[dbrown@sugar tag]$ ./lalapps_inspiral --trig-end-time 0 \
    --cluster-method window --dynamic-range-exponent 69.0 \
    --disable-rsq-veto --bank-file H1-TMPLTBANK-817577671-2048.xml.gz \
    --high-pass-order 8 --strain-high-pass-order 8 --ifo-tag FIRST \
    --gps-end-time 817579719 --calibrated-data real_8 \
    --channel-name H1:LSC-STRAIN --snr-threshold 5.5 \
    --number-of-segments 15 --trig-start-time 0 --enable-high-pass 30.0 \
    --debug-level 33 --gps-start-time 817577671 --enable-filter-inj-only \
    --high-pass-attenuation 0.1 --chisq-bins 0 --inverse-spec-length 16 \
    --segment-length 1048576 --low-frequency-cutoff 40.0 --pad-data 8 \
    --cluster-window 16 --sample-rate 4096 --chisq-threshold 10.0 \
    --resample-filter ldas --strain-high-pass-atten 0.1 \
    --strain-high-pass-freq 30.0 --segment-overlap 524288 \
    --frame-cache cache/H-H1_RDS_C03_L2-817577663-817592553.cache \
    --chisq-delta 0.2 --bank-veto-subbank-size 1 \
    --approximant FindChirpSP --write-compress --enable-output \
    --order twoPN --spectrum-type median
Condor: Notice: Will checkpoint to ./lalapps_inspiral.ckpt
Condor: Notice: Remote system calls disabled.
./lalapps_inspiral: unrecognized option `--bank-veto-subbank-size'

Any idea why this is not being captured in the .err file? For
comparison, stdout from another job is captured:

-rw-r--r-- 1 dbrown sufaculty  0 May  7 21:15 tmpltbank-817745157-817747205-29812-0.err
-rw-r--r-- 1 dbrown sufaculty 31 May  7 22:23 tmpltbank-817745157-817747205-29812-0.out

Cheers,
Duncan.

-- 

Duncan Brown                          Room 263-1, Department of Physics,
Assistant Professor of Physics        Syracuse University, NY 13244, USA
Phone: (315) 443 5993             http://www.gravity.phy.syr.edu/~duncan



===========================================================================
Date of creation: Thu May  8  8:56:45 2008 (1210255008)
Subject: Actions

Assigned to jfrey by jfrey
===========================================================================
Date of actions: Fri May  9  9:39:59 2008 (1210343999)
From: Jaime Frey <jfrey__AT__cs.wisc.edu>
To: condor-admin__AT__cs.wisc.edu
Subject: Re: [condor-admin #17983] LIGO: stderr not captured in 7.0.1
 standard universe
Date: Fri, 9 May 2008 09:44:19 -0500


Can you try condor_compile'ing the following program with the same  
compiler as your application and submitting it to your Condor pool?

------------------
#include <stdio.h>
int main() {
     fprintf( stderr, "I am stderr!\n" );
     return 0;
}
------------------

Thanks and regards,
Jaime Frey
UW-Madison Condor Team



===========================================================================
Date mail was appended: Fri May  9  9:44:21 2008 (1210344262)
Subject: Actions

Status changed from open to pending by jfrey
===========================================================================
Date of actions: Fri May  9  9:44:21 2008 (1210344263)
CC: anderson__AT__ligo.caltech.edu, skoranda__AT__gravity.phys.uwm.edu
From: Duncan Brown <dabrown__AT__physics.syr.edu>
Subject: Re: [condor-admin #17983] LIGO: stderr not captured in 7.0.1
 standard universe
Date: Thu, 15 May 2008 17:50:16 -0400
To: condor-admin__AT__cs.wisc.edu
X-Scanner: InterScan AntiVirus for Sendmail
X-Seen-BY: mailfromd 4.1 granite.cs.wisc.edu

Hi Jaime,

[dbrown@sugar stderr]$ cat stderr_test.c
#include <stdio.h>
int main() {
   fprintf( stderr, "I am stderr!\n" );
   return 0;
}


[dbrown@sugar stderr]$ condor_compile gcc -g -o stderr_test stderr_test.c
LINKING FOR CONDOR : /usr/bin/ld -L/usr/lib64 -Bstatic --eh-frame-hdr -m elf_x86_64 --hash-style=gnu -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o stderr_test /usr/lib64/condor_rt0.o /usr/lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtbeginT.o -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 /tmp/cc0ez2X0.o /usr/lib64/libcondorsyscall.a /usr/lib64/libcondor_z.a /usr/lib64/libcomp_libstdc++.a /usr/lib64/libcomp_libgcc.a /usr/lib64/libcomp_libgcc_eh.a --as-needed --no-as-needed -lcondor_c -lcondor_nss_files -lcondor_nss_dns -lcondor_resolv -lcondor_c -lcondor_nss_files -lcondor_nss_dns -lcondor_resolv -lcondor_c /usr/lib64/libcomp_libgcc.a /usr/lib64/libcomp_libgcc_eh.a --as-needed --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtend.o /usr/lib64/crtn.o
/usr/lib64/libcondorsyscall.a(condor_file_agent.o): In function `CondorFileAgent::open(char const*, int, int)':
/home/condor/execute/dir_15919/userdir/src/condor_ckpt/condor_file_agent.C:106: warning: the use of `tmpnam' is dangerous, better use `mkstemp'
/usr/lib64/libcondorsyscall.a(special_stubs.o): In function `condor_gethostbyaddr':
/home/condor/execute/dir_15919/userdir/src/condor_syscall_lib/special_stubs.C:200: warning: Using 'gethostbyaddr' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib64/libcondorsyscall.a(special_stubs.o): In function `condor_gethostbyname':
/home/condor/execute/dir_15919/userdir/src/condor_syscall_lib/special_stubs.C:193: warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib64/libcondorsyscall.a(sock.o): In function `Sock::getportbyserv(char*)':
/home/condor/execute/dir_15919/userdir/src/condor_io/sock.C:208: warning: Using 'getservbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking


[dbrown@sugar stderr]$ ./stderr_test
Condor: Notice: Will checkpoint to ./stderr_test.ckpt
Condor: Notice: Remote system calls disabled.
I am stderr!


[dbrown@sugar stderr]$ cat stderr_test.sub
universe = standard
executable = /home/dbrown/projects/daswg/condor/stderr/stderr_test
error = stderr_test.err
output = stderr_test.out
log = stderr_test.log
queue


[dbrown@sugar stderr]$ condor_submit stderr_test.sub
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 40846.


[dbrown@sugar stderr]$ cat stderr_test.log
000 (40846.000.000) 05/15 17:08:11 Job submitted from host: <10.20.1.23:58095>
...
001 (40846.000.000) 05/15 17:08:13 Job executing on host: <10.20.2.33:42389>
...
005 (40846.000.000) 05/15 17:08:13 Job terminated.
         (1) Normal termination (return value 0)
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
         725  -  Run Bytes Sent By Job
         5683869  -  Run Bytes Received By Job
         725  -  Total Bytes Sent By Job
         5683869  -  Total Bytes Received By Job
...

[dbrown@sugar stderr]$ cat stderr_test.err
I am stderr!
[dbrown@sugar stderr]$ cat stderr_test.out
[dbrown@sugar stderr]$

Now the inspiral code:

[dbrown@sugar stderr]$ ./lalapps_inspiral --badgers
Condor: Notice: Will checkpoint to lalapps_inspiral.ckpt
Condor: Notice: Remote system calls disabled.
lalapps_inspiral: unrecognized option `--badgers'
[dbrown@sugar stderr]$

[dbrown@sugar stderr]$ ./lalapps_inspiral --badgers 2>/dev/null
[dbrown@sugar stderr]$

This prints nothing, so the error message really is going to stderr.
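
The check above can be made explicit by capturing each stream to its
own file and comparing sizes. A minimal sketch, assuming a POSIX shell;
the function name stream_sizes is hypothetical (it is not part of Condor
or LAL), and the final line uses a stand-in command rather than
lalapps_inspiral:

```shell
# stream_sizes CMD [ARGS...]
# Runs CMD with stdout and stderr redirected to separate temporary files,
# then prints the byte count of each stream and the exit status.
stream_sizes() {
    out=$(mktemp) || return 1
    err=$(mktemp) || { rm -f "$out"; return 1; }
    status=0
    "$@" >"$out" 2>"$err" || status=$?
    # Arithmetic expansion strips any padding that wc adds to the count.
    echo "stdout=$(($(wc -c <"$out"))) stderr=$(($(wc -c <"$err"))) exit=$status"
    rm -f "$out" "$err"
}

# Stand-in for a program that complains on stderr and exits with status 1.
stream_sizes sh -c 'echo "unrecognized option" >&2; exit 1'
```

A non-zero stderr count together with a zero stdout count would confirm
that the message is written to file descriptor 2 when run interactively.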

[dbrown@sugar stderr]$ cat inspiral_test.sub
universe = standard
executable = /home/dbrown/projects/daswg/condor/stderr/lalapps_inspiral
arguments = --badgers
error = inspiral_test.err
output = inspiral_test.out
log = inspiral_test.log
queue


[dbrown@sugar stderr]$ condor_submit inspiral_test.sub
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 40847.


[dbrown@sugar stderr]$ cat inspiral_test.log
000 (40847.000.000) 05/15 17:12:23 Job submitted from host: <10.20.1.23:58095>
...
001 (40847.000.000) 05/15 17:12:38 Job executing on host: <10.20.2.33:42389>
...
005 (40847.000.000) 05/15 17:12:38 Job terminated.
         (1) Normal termination (return value 1)
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
         941  -  Run Bytes Sent By Job
         14791776  -  Run Bytes Received By Job
         941  -  Total Bytes Sent By Job
         14791776  -  Total Bytes Received By Job
...

[dbrown@sugar stderr]$ cat inspiral_test.err
[dbrown@sugar stderr]$ cat inspiral_test.out

Both are empty, even though the job exited with status code 1.

Now the error message 'unrecognized option' is coming from getopt, so  
I modified stderr_test to do getopt parsing. The source is attached  
below.

[dbrown@sugar stderr]$ ./stderr_test
Condor: Notice: Will checkpoint to ./stderr_test.ckpt
Condor: Notice: Remote system calls disabled.
I am stderr!

[dbrown@sugar stderr]$ ./stderr_test --verbose
Condor: Notice: Will checkpoint to ./stderr_test.ckpt
Condor: Notice: Remote system calls disabled.
I am stderr!
I really, really am stderr!

[dbrown@sugar stderr]$ ./stderr_test --badgers
Condor: Notice: Will checkpoint to ./stderr_test.ckpt
Condor: Notice: Remote system calls disabled.
./stderr_test: unrecognized option `--badgers'

[dbrown@sugar stderr]$ cat stderr_test.sub
universe = standard
executable = /home/dbrown/projects/daswg/condor/stderr/stderr_test
arguments = --badgers
error = stderr_test.err
output = stderr_test.out
log = stderr_test.log
queue

[dbrown@sugar stderr]$ condor_submit stderr_test.sub
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 40849.

[dbrown@sugar stderr]$ cat stderr_test.log
000 (40849.000.000) 05/15 17:30:07 Job submitted from host: <10.20.1.23:58095>
...
001 (40849.000.000) 05/15 17:30:10 Job executing on host: <10.20.2.33:42389>
...
005 (40849.000.000) 05/15 17:30:10 Job terminated.
         (1) Normal termination (return value 1)
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
         725  -  Run Bytes Sent By Job
         5716309  -  Run Bytes Received By Job
         725  -  Total Bytes Sent By Job
         5716309  -  Total Bytes Received By Job
...

[dbrown@sugar stderr]$ cat stderr_test.err
condor_exec.40849.0: unrecognized option `--badgers'

So now I'm completely confused. If you want to look at the source  
code for lalapps_inspiral it is at

<http://www.lsc-group.phys.uwm.edu/cgi-bin/cvs/viewcvs.cgi/lalapps/src/inspiral/inspiral.c?rev=1.272&cvsroot=lal&content-type=text/vnd.viewcvs-markup>

I'll try doing some more experiments, but if you have suggestions of  
where to look, I'd appreciate it.

Cheers,
Duncan.


[dbrown@sugar stderr]$ cat stderr_test.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <getopt.h>

int main( int argc, char *argv[] )
{
   int vrbflg = 0;
   struct option long_options[] =
   {
     {"verbose", no_argument, &vrbflg, 1 },
     {0, 0, 0, 0}
   };
   int c;

   while ( 1 )
   {
     /* getopt_long_only stores the index of the matched long option here */
     int option_index = 0;

     c = getopt_long_only( argc, argv, "v", long_options, &option_index );


     /* detect the end of the options */
     if ( c == - 1 )
     {
       break;
     }

     switch ( c )
     {
       case 0:
         /* if this option set a flag, do nothing else
          * now */
         if ( long_options[option_index].flag != 0 )
         {
           break;
         }
         else
         {
           fprintf( stderr, "error parsing option %s with argument %s\n",
               long_options[option_index].name, optarg );
           exit( 1 );
         }
         break;

       /* exit if we get an unknown option */
       case '?':
         exit( 1 );
         break;

       default:
         fprintf( stderr, "unknown error while parsing options (%d)\n", c );
         exit( 1 );
     }
   }

   if ( optind < argc )
   {
     fprintf( stderr, "extraneous command line arguments:\n" );
     while ( optind < argc )
     {
       fprintf ( stderr, "%s\n", argv[optind++] );
     }
     exit( 1 );
   }

   fprintf( stderr, "I am stderr!\n" );

   if ( vrbflg )
   {
     fprintf( stderr, "I really, really am stderr!\n" );
   }

   return 0;
}







===========================================================================
Date mail was appended: Thu May 15 16:50:23 2008 (1210888224)
CC: anderson__AT__ligo.caltech.edu, skoranda__AT__gravity.phys.uwm.edu
From: Duncan Brown <dabrown__AT__physics.syr.edu>
Subject: Re: [condor-admin #17983] LIGO: stderr not captured in 7.0.1
 standard universe
Date: Thu, 15 May 2008 18:20:20 -0400
To: condor-admin__AT__cs.wisc.edu
X-Scanner: InterScan AntiVirus for Sendmail
X-Seen-BY: mailfromd 4.1 granite.cs.wisc.edu

Hi Jaime,

On May 9, 2008, at 10:44 AM, condor-admin response tracking system  
wrote:
> Can you try condor_compile'ing the following program with the same
> compiler as your application and submitting it to your Condor pool?


I forgot to mention that this works fine on the Caltech cluster  
running FC4:

[dbrown__AT__ldas-pcdev1.ligo inspiral]$ cat inspiral_test.sub
universe = standard
executable = /usr1/dbrown/src/lalapps/src/inspiral/lalapps_inspiral
arguments = --badgers
error = inspiral_test.err
output = inspiral_test.out
log = inspiral_test.log
queue


[dbrown__AT__ldas-pcdev1.ligo inspiral]$ condor_submit inspiral_test.sub
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 2620571.

[dbrown__AT__ldas-pcdev1.ligo inspiral]$ cat inspiral_test.log
000 (2620571.000.000) 05/15 15:18:01 Job submitted from host: <10.14.0.18:50983>
...
001 (2620571.000.000) 05/15 15:18:06 Job executing on host: <10.14.1.18:36563>
...
005 (2620571.000.000) 05/15 15:18:06 Job terminated.
         (1) Normal termination (return value 1)
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
         1225  -  Run Bytes Sent By Job
         10186707  -  Run Bytes Received By Job
         1225  -  Total Bytes Sent By Job
         10186707  -  Total Bytes Received By Job
...

[dbrown__AT__ldas-pcdev1.ligo inspiral]$ cat inspiral_test.err
condor_exec.2620571.0: unrecognized option `--badgers'

Cheers,
Duncan.




===========================================================================
Date mail was appended: Thu May 15 17:20:27 2008 (1210890030)
From: Jaime Frey <jfrey__AT__cs.wisc.edu>
To: condor-admin__AT__cs.wisc.edu
Subject: Re: [condor-admin #17983] LIGO: stderr not captured in 7.0.1
 standard universe
Date: Mon, 19 May 2008 11:03:51 -0500

> So now I'm completely confused. If you want to look at the source
> code for lalapps_inspiral it is at
>
> <http://www.lsc-group.phys.uwm.edu/cgi-bin/cvs/viewcvs.cgi/lalapps/src/inspiral/inspiral.c?rev=1.272&cvsroot=lal&content-type=text/vnd.viewcvs-markup>
>
> I'll try doing some more experiments, but if you have suggestions of
> where to look, I'd appreciate it.


Can you try building stderr_test and lalapps_inspiral on the same
machine, then submitting both executables to Condor?

Thanks and regards,
Jaime Frey
UW-Madison Condor Team



===========================================================================
Date mail was appended: Mon May 19 11:03:54 2008 (1211213034)
Subject: Actions

Status changed from open to pending by jfrey
===========================================================================
Date of actions: Mon May 19 11:03:54 2008 (1211213035)
CC: anderson__AT__ligo.caltech.edu, skoranda__AT__gravity.phys.uwm.edu
From: Duncan Brown <dabrown__AT__physics.syr.edu>
Subject: Re: [condor-admin #17983] LIGO: stderr not captured in 7.0.1
 standard universe
Date: Mon, 19 May 2008 14:07:16 -0400
To: condor-admin__AT__cs.wisc.edu
X-Scanner: InterScan AntiVirus for Sendmail
X-Seen-BY: mailfromd 4.1 granite.cs.wisc.edu

Hi Jaime,

On May 19, 2008, at 12:03 PM, condor-admin response tracking system  
wrote:

>> So now I'm completely confused. If you want to look at the source
>> code for lalapps_inspiral it is at
>>
>> <http://www.lsc-group.phys.uwm.edu/cgi-bin/cvs/viewcvs.cgi/lalapps/src/inspiral/inspiral.c?rev=1.272&cvsroot=lal&content-type=text/vnd.viewcvs-markup>
>>
>> I'll try doing some more experiments, but if you have suggestions of
>> where to look, I'd appreciate it.
>
>
> Can you try building stderr_test and lalapps_inspiral on the same
> machine, then submitting both executables to Condor?


[dbrown@sugar-dev1 stderr]$ condor_compile gcc -Wall -g -o stderr_test stderr_test.c
LINKING FOR CONDOR : /usr/bin/ld -L/usr/lib64 -Bstatic --eh-frame-hdr -m elf_x86_64 --hash-style=gnu -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o stderr_test /usr/lib64/condor_rt0.o /usr/lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/4.1.1/crtbeginT.o -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.1 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.1 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.1/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 /tmp/ccMlMLAE.o /usr/lib64/libcondorsyscall.a /usr/lib64/libcondor_z.a /usr/lib64/libcomp_libstdc++.a /usr/lib64/libcomp_libgcc.a /usr/lib64/libcomp_libgcc_eh.a --as-needed --no-as-needed -lcondor_c -lcondor_nss_files -lcondor_nss_dns -lcondor_resolv -lcondor_c -lcondor_nss_files -lcondor_nss_dns -lcondor_resolv -lcondor_c /usr/lib64/libcomp_libgcc.a /usr/lib64/libcomp_libgcc_eh.a --as-needed --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.1.1/crtend.o /usr/lib64/crtn.o
/usr/lib64/libcondorsyscall.a(condor_file_agent.o): In function `CondorFileAgent::open(char const*, int, int)':
/home/condor/execute/dir_15919/userdir/src/condor_ckpt/condor_file_agent.C:106: warning: the use of `tmpnam' is dangerous, better use `mkstemp'
/usr/lib64/libcondorsyscall.a(special_stubs.o): In function `condor_gethostbyaddr':
/home/condor/execute/dir_15919/userdir/src/condor_syscall_lib/special_stubs.C:200: warning: Using 'gethostbyaddr' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib64/libcondorsyscall.a(special_stubs.o): In function `condor_gethostbyname':
/home/condor/execute/dir_15919/userdir/src/condor_syscall_lib/special_stubs.C:193: warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/lib64/libcondorsyscall.a(sock.o): In function `Sock::getportbyserv(char*)':
/home/condor/execute/dir_15919/userdir/src/condor_io/sock.C:208: warning: Using 'getservbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

[dbrown@sugar-dev1 stderr]$ cat stderr_inspiral.sub
universe = standard
arguments = --badgers
error = stderr_inspiral.$(cluster).$(process).err
output = stderr_inspiral.$(cluster).$(process).out
log = stderr_inspiral.log
executable = /home/dbrown/projects/daswg/condor/stderr/stderr_test
queue
executable = /home/dbrown/projects/daswg/condor/stderr/lalapps_inspiral
queue

[dbrown@sugar stderr]$ condor_submit stderr_inspiral.sub
Submitting job(s)..
Logging submit event(s)..
1 job(s) submitted to cluster 43046.
1 job(s) submitted to cluster 43047.

[dbrown@sugar stderr]$ tail -f stderr_inspiral.log
000 (43046.000.000) 05/19 12:36:45 Job submitted from host: <10.20.1.23:58095>
...
000 (43047.000.000) 05/19 12:36:45 Job submitted from host: <10.20.1.23:58095>
...
001 (43046.000.000) 05/19 12:41:58 Job executing on host: <10.20.2.44:38245>
...
005 (43046.000.000) 05/19 12:41:58 Job terminated.
         (1) Normal termination (return value 1)
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
         725  -  Run Bytes Sent By Job
         5716333  -  Run Bytes Received By Job
         725  -  Total Bytes Sent By Job
         5716333  -  Total Bytes Received By Job
...
001 (43047.000.000) 05/19 12:41:58 Job executing on host: <10.20.2.44:38245>
...
005 (43047.000.000) 05/19 12:41:58 Job terminated.
         (1) Normal termination (return value 1)
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                 Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
         941  -  Run Bytes Sent By Job
         14791884  -  Run Bytes Received By Job
         941  -  Total Bytes Sent By Job
         14791884  -  Total Bytes Received By Job
...

[dbrown@sugar stderr]$ cat stderr_inspiral.43046.0.err
condor_exec.43046.0: unrecognized option `--badgers'
[dbrown@sugar stderr]$ cat stderr_inspiral.43047.0.err
[dbrown@sugar stderr]$

Nothing in the inspiral stderr.
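
When many jobs share one user log, the comparison above could be
scripted. The following is only a sketch, assuming a POSIX shell with
awk, the "005 ... Job terminated." / "(return value N)" log layout shown
in this thread, and stderr files named PREFIX.CLUSTER.PROCESS.err as in
the submit file above. The function name report_jobs and the demo file
names are hypothetical:

```shell
# report_jobs LOG PREFIX
# For each terminated job in LOG, prints its return value and whether
# PREFIX.<cluster>.<process>.err contains any data.
report_jobs() {
    awk '
        # Event 005 marks job termination; remember the cluster and process.
        /^005 \(/ {
            split(substr($2, 2), id, /[.]/)
            job = (id[1] + 0) "." (id[2] + 0)
            next
        }
        # The line after the event header carries the return value.
        job != "" && /return value/ {
            rv = $NF
            sub(/\)$/, "", rv)
            print job, rv
            job = ""
        }
    ' "$1" |
    while read -r job rv; do
        f="$2.$job.err"
        if [ -s "$f" ]; then state="non-empty"; else state="empty or missing"; fi
        echo "job $job: return value $rv, stderr $state"
    done
}

# Self-contained demonstration using scratch files, not real job output.
demo_dir=$(mktemp -d)
cat >"$demo_dir/demo.log" <<'EOF'
005 (43046.000.000) 05/19 12:41:58 Job terminated.
         (1) Normal termination (return value 1)
...
005 (43047.000.000) 05/19 12:41:58 Job terminated.
         (1) Normal termination (return value 1)
...
EOF
echo "unrecognized option" >"$demo_dir/demo.43046.0.err"
: >"$demo_dir/demo.43047.0.err"
report_jobs "$demo_dir/demo.log" "$demo_dir/demo"
rm -rf "$demo_dir"
```

Jobs reported with a non-zero return value and an empty stderr file are
the ones that show the behaviour described in this ticket.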

Cheers,
Duncan.







===========================================================================
Date mail was appended: Mon May 19 13:07:19 2008 (1211220439)
Subject: Comments added

This ticket is being closed because it appears to be tied to admin #17975,
which I've solved. If the fix for admin #17975 doesn't resolve this ticket,
it will be reopened.

Comments added by psilord

===========================================================================
Date comments were added: Mon Jun  2 16:11:01 2008 (1212441062)
Subject: Actions

Ticket resolved by psilord
===========================================================================
Date of actions: Mon Jun  2 16:11:57 2008 (1212441117)