S5 Thread Test Results




The following is my analysis of the flatfield, science, and lincal threads.

General Notes

The archive directory tree is too hard to navigate. IRAC (the IST, anyway) has never actually used the flight archive structure until now. Having each DCE at the bottom of its own directory tree, 10 levels deep, makes analysis of the calibration threads almost impossible. Furthermore, each processed DCE typically has 130-140 associated files! We are going to need tools to condense this information. The final product generator will have to do this for the GOs, but there will be serious problems during IOC if the instrument teams are expected to look at data in this fashion while troubleshooting the pipeline. They are used to putting their hands directly on the data and manipulating it. I do not think anyone will be satisfied if they are limited to tweaking our pipeline and rerunning it in order to do anything to the data.

As a simple start, I suggest something like a "gather" script that recursively descends the archive directory tree, finds all the files matching a given wildcard, and delivers them to some other location; i.e., we need a tool to flatten the directory structure for easy access. I also suspect that much of what the IT does during IOC will be not so much tweaking our modules and rerunning the pipeline as reprocessing the data themselves and then telling us how to change the pipeline. (Author's note - such a script is currently under development.)
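
As an illustration of the "gather" idea, here is a minimal sketch in Python. It is not the script under development; the function name `gather`, its arguments, and the choice to prefix each delivered file with its flattened directory path (so that same-named files from different DCEs do not overwrite each other) are all assumptions made for this example.

```python
import fnmatch
import os
import shutil


def gather(archive_root, pattern, dest_dir):
    """Copy every file under archive_root whose name matches the
    shell-style wildcard `pattern` into the single flat directory dest_dir.

    Hypothetical helper: names and naming scheme are illustrative only.
    dest_dir is assumed to lie outside archive_root.
    """
    os.makedirs(dest_dir, exist_ok=True)
    delivered = []
    for dirpath, _dirnames, filenames in os.walk(archive_root):
        for name in fnmatch.filter(filenames, pattern):
            src = os.path.join(dirpath, name)
            # Encode the original location in the new file name so that
            # e.g. many "flatfield.fits" files can coexist in one directory.
            rel = os.path.relpath(dirpath, archive_root)
            prefix = "" if rel == "." else rel.replace(os.sep, "_") + "_"
            dst = os.path.join(dest_dir, prefix + name)
            shutil.copy2(src, dst)
            delivered.append(dst)
    return sorted(delivered)
```

A call such as gather("/scr/S5.0D", "flatfield.fits", "/some/work/dir") would then collect every flatfield product into one place; the destination path here is only a placeholder.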

On a related note, there is no easy way to trace the pedigree of the input files. In many cases while looking at the thread outputs I was able to trace problems back to the input calibration files, but from there I was stuck, since there was no traceability to the original data that went into them. Presumably this will change when the calibration server is in place. I think this underscores that the number of files (namelists, calibration images, etc.) that must be juggled to make everything work has become so large that it is necessary to fully automate the process. Until now we have been able to get away with Russ adapting the scripts to run primarily by hand. That approach is now breaking down.


Science Thread

I went through the intermediate products, differencing successive stages to see what changes each module had made. In most cases everything was as expected, and I have noted here only the things that did not work or were not completely clear. They are listed below, broken down by module:


FLATFIELD Thread

There are 130+ files for 1 DCE!!!

What is "flattrack_prior_state.fits"? It doesn't look like anything I can recognize.

In the /scr/S5.0D/0001/0000/IRAC/0000257024/4/0001/0011/01/1/ subdirectory the file "flatfield.fits" looks terrible. It is erroneous, with step functions in the image. This can be traced back to FOWLINEARIZE. FOWLINEARIZE in turn failed because the lincal input was bad. I can't tell where the lincal data came from. It appears to ultimately be CPT data - it would have been better to have used CTA data. I also think this may be an older version of lincal as it does not have the model type written to the header, and is version 1.X and not 2.X. FOWLINEARIZE probably choked when it got data from an incompatible version of LINCAL. The other cal files look correct. This underscores the fact that we need the cal server and the executor so that we can keep track of these files more precisely.

After DARKSUB (left), and after FOWLINEARIZE (middle). The resulting flat (right) is ruined because FOWLINEARIZE got an incompatible input data set. Things look pretty good up until then.


LINCAL Thread

The LINCAL thread also did not work. This was a little surprising since it was a data set that I have run through lincal manually with good results.

Some sample input data for LINCAL, in order of increasing exposure time.

LINCAL output. Roughly half the pixels have no solution.

It should have looked like this (sscsci4:/local/s2/irac/sf2_linearity/chan4/ds/ch4_out.fits).

An examination of the sosexec/out.lin file shows:
 No. of pixels for which model could not be computed:  5
   2: insufficient dynamic range
   1: insufficient number of usable points
   0: nonlinear system was singular
   2: masked in PMask
 No. of pixels with Sat < max usable DNobs:   36486
 No. of pixels with rejected saturated data:  21305
My guess is that the saturation detection algorithm was prematurely triggered by noisy values at low DN. If we examine the namelists:

Pipeline namelist:

> !more lincal.nl
 &lincin
   minimg = 5,
   dynhi = -9,
   dynlo = 99999,
   bmax = 500000.0,
   maxdmp = 5, dmpneg = t,
   ntable = 2000,
   ngpmax = 20, nsearch = 2,
   sigmax = 10000.0,
   slopemin = 0.4,
   satstats = t,
   dnsat = 25000,
 &end
 
Jason's namelist:
 > !more ch4.nl
 &lincin
   devfrac = 0.5,
   minimg = 3,
   dynhi = -9,
   dynlo = 99999,
   bmax = 500000.0,
   maxdmp = 5, dmpneg = t,
   ntable = 2000,
   ngpmax = 20, nsearch = 2,
   sigmax = 10000.0,
   slopemin = 0.3,
   satstats = t,
   dnsat = 20000,
   poscon = f,
   mindn = 2000,
 &end
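
The two namelists above can also be compared mechanically rather than by eye. The following is a minimal sketch in Python, assuming the simple comma-separated `name = value` layout seen in these listings (it is not a general Fortran namelist parser). The helper names `parse_namelist` and `diff_namelists` are hypothetical and not part of the pipeline.

```python
import re


def parse_namelist(text):
    """Parse simple 'name = value,' pairs into a dict of strings.

    Group markers such as '&lincin' and '&end', and shell prompt lines
    starting with '>', are skipped. Values are kept as raw strings.
    """
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("&") or line.startswith(">"):
            continue
        # Several pairs may share one line, e.g. "maxdmp = 5, dmpneg = t,"
        for match in re.finditer(r"(\w+)\s*=\s*([^,\s]+)", line):
            params[match.group(1)] = match.group(2)
    return params


def diff_namelists(a, b):
    """Return {name: (value_in_a, value_in_b)} for every parameter that
    differs; a parameter missing from one namelist shows up as None."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}
```

Applied to lincal.nl and ch4.nl as listed, this reports differences in devfrac, dnsat, mindn, minimg, poscon, and slopemin.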
 
The key differences are: a lower minimum number of images (minimg = 3 instead of 5), a cutoff at low DN so that low DN pixels are rejected from the fit (mindn = 2000), and an effective disabling of the code that tags saturation based on deviation from the fit (by setting devfrac = 0.5). Saturation is instead based on where the linearity curve turns over, controlled by the value of slopemin (0.3 here, versus 0.4 in the pipeline namelist). Jason's namelist also lowers dnsat from 25000 to 20000 and sets poscon = f. The saturation detection algorithm is one of the more sophisticated parts of lincal, since it controls data rejection from the fit. It needs to be tuned fairly carefully per array. Here are the other namelists:
> !more ch1.nl
 &lincin
   devfrac = 0.5,
   minimg = 3,
   dynhi = -9,
   dynlo = 99999,
   bmax = 500000.0,
   maxdmp = 5, dmpneg = t,
   ntable = 2000,
   ngpmax = 20, nsearch = 2,
   sigmax = 10000.0,
   slopemin = 0.3,
   satstats = t,
   dnsat = 10000,
   poscon = f,
   mindn = 100,
   zsig = 3,
 &end

> !more ch2.nl
 &lincin
   devfrac = 0.5,
   minimg = 3,
   dynhi = -9,
   dynlo = 99999,
   bmax = 500000.0,
   maxdmp = 5, dmpneg = t,
   ntable = 2000,
   ngpmax = 20, nsearch = 2,
   sigmax = 10000.0,
   slopemin = 0.5,
   satstats = t,
   dnsat = 20000,
   poscon = f,
   mindn = 2000,
 &end
 
 > !more ch3.nl
 &lincin
   devfrac = 0.5,
   minimg = 3,
   dynhi = -9,
   dynlo = 99999,
   bmax = 500000.0,
   maxdmp = 5, dmpneg = t,
   ntable = 2000,
   ngpmax = 20, nsearch = 2,
   sigmax = 10000.0,
   slopemin = 0.2,
   satstats = t,
   dnsat = 20000,
   poscon = f,
   mindn = 1000,
 &end
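
To make the roles of mindn and slopemin concrete, here is a toy sketch in Python of the kind of per-pixel data selection described above. It is not lincal's actual algorithm. It assumes, purely for illustration, that mindn is a hard lower cutoff on DN and that a sample is tagged as saturated once the local slope of DN versus exposure time falls below slopemin times the slope between the first two usable samples. The function name `select_fit_points` is hypothetical, and exposure times are assumed to be distinct.

```python
def select_fit_points(exptimes, dn, mindn, slopemin):
    """Return the indices of the samples kept for the fit for one pixel.

    Illustrative assumptions (NOT taken from the lincal source):
      * samples with DN below mindn are dropped as too noisy;
      * the curve is considered to have turned over once the local slope
        drops below slopemin * (slope between the first two usable
        samples); that sample and all later ones are rejected.
    """
    # Work in order of increasing exposure time.
    order = sorted(range(len(dn)), key=lambda i: exptimes[i])
    usable = [i for i in order if dn[i] >= mindn]
    if len(usable) < 2:
        return usable

    first, second = usable[0], usable[1]
    ref_slope = (dn[second] - dn[first]) / (exptimes[second] - exptimes[first])

    kept = [first, second]
    for prev, cur in zip(usable[1:], usable[2:]):
        local_slope = (dn[cur] - dn[prev]) / (exptimes[cur] - exptimes[prev])
        if local_slope < slopemin * ref_slope:
            break  # turnover reached: reject this and all later samples
        kept.append(cur)
    return kept
```

In this toy model, the number of indices returned would then be compared against minimg to decide whether the pixel has enough usable points for a solution.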

I also checked the illumination images used. Mine was a smoothed version of the input data. The pipeline used the sum of the stored lamp images. They look essentially the same, so I don't think the illumination image is related to the problem. Note that I will be changing the namelist values again in the future until they are properly tuned. I suspect that the first time this is run in flight, I will run it by hand until I get the parameters right.

Pipeline input illumination pattern (left; used for saturation rejection) and Jason's input file (right; made from smoothed input data). The two are substantially the same.