Trace fix improvements #2386

segasai · 2024-10-09T21:42:01Z

Hi,

This is a WIP PR trying to address #2380 (and maybe also improve the systematic RV floor).

At the moment I don't think it fixes much, but it does a few things that I was hoping could help:

Use multiple PSFs at different wavelengths (rather than the PSF at the central wavelength) when cross-correlating with external spectra, to avoid possible biases vs wavelength.
In the same external-spectrum cross-correlation routine, use PSFs from multiple fibers as well, for the same reason.
When doing 'internal wavelength' calibration, subtract the continuum in the same way we do for external calibration, to avoid issues when we have bright stars and weak lines.

I did also try to restructure the code a little bit/reduce duplication to make it easier to change.

I would ideally like to merge this type of patch even if it cannot fully fix #2380 (when it is ready and assuming nothing gets worse)

coveralls · 2024-10-09T21:51:49Z

coverage: 30.083% (-0.1%) from 30.218%
when pulling b08cb4e on trace_fix_improvements
into 41da70a on main.

sbailey · 2024-10-14T19:14:04Z

py/desispec/trace_shifts.py

+                                                                   ivar=ivar[fiber,ok],
+                                                                   hw=3., calibrate=True)
+            if fiber %10==0 :
+                log.info("Wavelength offset %f +/- %f for fiber #%03d at wave %f "%(dwave, err, fiber, block_wave))


Minor style comment, but for the record so that it doesn't propagate to more code:

When formatting strings for logging, it is better to use the structure

log.info("Wavelength offset %f +/- %f for fiber #%03d at wave %f", dwave, err, fiber, block_wave)

so that the string formatting is only evaluated by the logger if the log level would actually print it. In practice we almost always have INFO-level on, so this particular case doesn't make a performance difference. Still, the readability is nearly the same as old-style %-formatting, and we have had cases in the past where string evaluation of unprinted debug-level logging took a significant amount of time, so this style is something to keep in mind when adding new log messages.

Personal opinion: I'm also fine with pre-evaluated new-style format strings for INFO-level and above if the author feels it helps with readability, e.g.

log.info(f"Wavelength offset {dwave} +/- {err} for fiber #{fiber:03d} at wave {block_wave}")

My basic motivation is that readability trumps performance given that we use INFO-level for production running anyway so there isn't a performance impact. This should not be used for DEBUG-level logging though, due to performance issues especially if it is deep in loops.
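
A minimal sketch of the difference described above, using the standard-library `logging` module as a stand-in (desispec obtains its logger differently; the `CountingValue` class and the logger name here are hypothetical, introduced only to make the formatting cost visible):

```python
import logging


class CountingValue:
    """Hypothetical helper: counts how often it is converted to a string."""

    def __init__(self, value):
        self.value = value
        self.format_calls = 0

    def __str__(self):
        self.format_calls += 1
        return str(self.value)


# Logger at INFO level, so DEBUG messages are dropped.
log = logging.getLogger("trace_shifts_demo")
log.setLevel(logging.INFO)
log.addHandler(logging.NullHandler())
log.propagate = False

lazy = CountingValue(0.0123)
eager = CountingValue(0.0123)

# Arguments passed separately: the logger never formats them,
# because DEBUG is disabled.
log.debug("Wavelength offset %s", lazy)

# Pre-evaluated string: the formatting cost is paid before the
# logger can decide to drop the message.
log.debug("Wavelength offset %s" % eager)
```

After running this, `lazy.format_calls` is 0 and `eager.format_calls` is 1, which is the cost that matters for DEBUG-level messages deep inside loops.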

[actual review of algorithmic changes takes more thought/time...]

segasai · 2024-10-27T16:20:38Z

To avoid feature creep and a very large patch, I have decided to test what we have here.
I ran the full processing of several tiles/nights (the same set I used for the previous version of the trace_shifts patch):
/global/homes/k/koposov/desi_koposov/wavelength_fix/bulk.sh

To my surprise this already led to significant improvements. I looked at the cross-match of velocities (from redrock- files) to APOGEE.

The MAD with respect to APOGEE changes from 1.3 km/s to 1.1 km/s.
(I checked that for all other surveys there are still improvements, but they are less visible due to their larger errors.)
I also checked that the scatter of MEANDY improves (MAD goes from 0.018 to 0.015).

So I think that's good evidence of things improving and I think it would be good to commit this.
Summarizing the changes here

Background subtraction when doing internal wavelength offsets (in the same way as it is done for external wavelength corrections).
Using the correct variance when doing internal wavelength offsets (see Internal wavelength calibration errors #2113).
Refactoring: the background subtraction is now in a separate function.
When doing external wavelength correction, use PSFs sampled at 20 points along the spectrum and 20 fibers across the petal, as opposed to a single wavelength point for one fiber.
Using a consistent and more reasonable weighting scheme for determining the 'mean' wavelength of the wavelength bin considered.
None of these changes fixes #2380 (Fixing 1 km/s radial velocity systematic).

One thing that I feel makes these changes harder to analyse is that these internal/external wavelength offsets for each fiber and wavelength bins are not saved anywhere (other than printed in the log and saved on average in MEANDY). I think for future improvements, it'd be beneficial to save these kinds of offsets (i.e. either in psf- file or some other kind of output file). It'd be ~ 2500 numbers per single frame, so it's not that much. But maybe that should be dealt with separately.
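
The continuum (background) subtraction listed in the summary above can be sketched roughly as follows. This is an illustration only, assuming a running-median estimate of the continuum; the function name `subtract_continuum`, the window width, and the edge handling are assumptions and not the actual code in `trace_shifts.py`:

```python
import numpy as np


def subtract_continuum(flux, width=51):
    """Subtract a running-median continuum from a 1D spectrum.

    `width` is the (odd) window size in pixels; it is a hypothetical
    parameter and would need to scale with any oversampling applied.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    # Repeat the edge values so every pixel has a full window.
    padded = np.pad(flux, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, width)
    continuum = np.median(windows, axis=1)
    return flux - continuum


# Bright, flat continuum with one weak absorption line (made-up values).
flux = np.full(101, 10.0)
flux[50] = 5.0
residual = subtract_continuum(flux, width=11)
```

With the continuum removed, the bright star's flat flux no longer dominates the cross-correlation and only the line remains in `residual`.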

segasai · 2024-11-02T13:56:29Z

I've verified on the test subset of processed data in /global/cfs/cdirs/desi/spectro/redux/koposov/wavelength_test_bulk_new/
that repeated observations also show reduced scatter in velocities. See figure for z-band arm measurements.

The MAD improves from 0.7 km/s to 0.6 km/s.
I've done a separate analysis using only the blue arm; there the MAD improves from 1 km/s to 0.9 km/s.
Also looking at the large offsets, the number of those is significantly reduced with new code (this is B arm)
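
For reference, the MAD statistic quoted in these comparisons can be computed as below. This is a sketch assuming the plain (unscaled) median absolute deviation; whether the numbers above include the 1.4826 Gaussian scaling factor is not stated, and the sample values are made up:

```python
import numpy as np


def mad(values):
    """Unscaled median absolute deviation about the median."""
    values = np.asarray(values, dtype=float)
    return float(np.median(np.abs(values - np.median(values))))


# Hypothetical velocity differences (km/s) with one large outlier.
dv = [1.0, 2.0, 3.0, 4.0, 100.0]
scatter = mad(dv)
```

The single outlier barely affects the result, which is why the MAD is a convenient scatter measure when a few large offsets are present.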

julienguy

I agree weighting with ivar x flux^2 x (flux>0) is better than ivar x flux x (flux>0) to get the effective wavelength in a bin.
Subtracting the continuum is a good idea to improve the precision of the cross-match (similarly to what was done for the match of the sky).
I see the effort to sample the PSF across wavelengths and fibers to convolve the external reference spectrum (the original code was using only one wavelength and fiber for the PSF).
I ran the code on a random dark time exposure and 3 cameras and found tiny shifts <0.01A as expected
I have only one change request: you wrote a comment "The reason why we oversample is unclear to me".
It is there because the wavelength arrays with the boxcar extraction are not aligned and for this reason I thought oversampling before stacking would help. Given the careful study you have done, maybe you have seen that this does not help, in which case please change the code, otherwise, you may remove the comment.

I think we can merge after that. Thank you for all the work.
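
The weighting discussed in this review, ivar x flux^2 x (flux>0), can be sketched as follows. The function name `effective_wavelength` and the fallback to the unweighted mean when all weights vanish are assumptions made for this illustration, not the code in the patch:

```python
import numpy as np


def effective_wavelength(wave, flux, ivar):
    """Weighted mean wavelength of a bin, weights = ivar * flux^2 * (flux > 0)."""
    wave = np.asarray(wave, dtype=float)
    flux = np.asarray(flux, dtype=float)
    ivar = np.asarray(ivar, dtype=float)
    weights = ivar * flux**2 * (flux > 0)
    total = weights.sum()
    if total <= 0:
        # Assumed fallback: no usable pixels, return the plain mean.
        return float(wave.mean())
    return float((weights * wave).sum() / total)


# Made-up bin: the second pixel carries twice the flux of the first.
center = effective_wavelength([10.0, 20.0], [1.0, 2.0], [1.0, 1.0])
```

Here the weights are 1 and 4, so `center` is (10 + 80) / 5 = 18, pulled more strongly toward the brighter pixel than a flux-linear weighting would give.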

segasai · 2024-11-05T00:34:19Z

Thanks for looking over the patch @julienguy. On the resampling, my comment was just a thought (it wasn't obvious to me that there would be a benefit from oversampling), but I didn't verify it, so I've removed that comment.
I think the only other comment in the patch that's worth dealing with, concerns this line of code

desispec/py/desispec/trace_shifts.py

Line 725 in fe88bbd

x=np.tile(x[hw]+np.arange(-hw,hw+1)*(y[-1]-y[0])/(2*hw+1),(y.size,1))

(it's now line 688 in my patch). I did not fully follow the reason behind the y[-1]-y[0] in this original code, so my code is just a more straightforward 2D grid in x,y, but maybe I missed the reason for that.
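
To make the question concrete, here is a small sketch of what the quoted expression produces compared with a unit-spaced grid. The input arrays are hypothetical, and the "plain" variant is only an assumption about what a more straightforward grid might look like, not the code in the patch:

```python
import numpy as np

hw = 2
x = np.arange(100.0, 105.0)  # hypothetical: 2*hw+1 column positions
y = np.arange(10.0, 20.0)    # hypothetical: 10 row positions

# Original expression: x offsets are scaled by the y span divided by
# the number of columns, then repeated for every row.
x_scaled = np.tile(
    x[hw] + np.arange(-hw, hw + 1) * (y[-1] - y[0]) / (2 * hw + 1),
    (y.size, 1),
)

# Assumed alternative: unit-spaced offsets around the central column.
x_plain = np.tile(x[hw] + np.arange(-hw, hw + 1), (y.size, 1))
```

With these inputs the y span is 9, so the original expression spaces the columns by 9/5 = 1.8 instead of 1; both arrays have shape `(y.size, 2*hw+1)` and agree only in the central column.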

segasai added 10 commits October 9, 2024 03:02

move reference spectra transforms into dedicated function.

31710ad

Also use different psfs to convolve the spectrum

additional updates

b35478e

move code into separate function to be reused

61c241d

log the internal offsets

5a1b754

subtract continuum when doing internal calirbation

7f500c5

make the continuum subtraction in self-calibration optional

d5c29cf

use actual variance not variance times flux

389f1e1

update window of medianing

9587395

deal with the fact that different functions oversample differently

764d413

therefore the filtering size needs to depend on that

make sure arcs are still processed with continuum as before

ee20370

segasai added the WIP Work in Progress label Oct 9, 2024

fix variable name

be4d316

sbailey reviewed Oct 14, 2024

switch to f-formatting in a couple of instances

a4050eb

segasai mentioned this pull request Oct 17, 2024

Fixing 1 km/s radial velocity systematic #2380

Open

use correct weights for central wavelength

2f6e80a

segasai removed the WIP Work in Progress label Oct 27, 2024

segasai added 2 commits November 2, 2024 07:50

update variable names

0f03f5c

get rid of useless warning

add functions docstrings

aa1cf83

julienguy self-requested a review November 4, 2024 23:08

julienguy requested changes Nov 4, 2024

get rid of comment on oversampling

b3c5ba3

fix a few typos in comments

few more typos

b08cb4e

segasai requested a review from julienguy November 5, 2024 00:44
