Legendre nnls merge with main #307

craigwarner-ufastro · 2024-05-21T21:25:16Z

Created this branch to sort out the differences between main and legendre_nnls. Many changed had been made to main since legendre_nnls was branched off and some conflicts had to be manually resolved, particularly with bands in archetypes.py.

vector (in tdata) so that NNLS can be used instead of BVLS. Fixed bug when ValueError excepted in rebinning

negative legendre terms. Refactored a little to clean up. For archetypes with legendre terms, NNLS is now actually used with additional terms of -1*each legendre term so that we can use the faster NNLS to get the same result as BVLS.

because on CPU, the additional computation time for the larger array sizes dominates any time savings of NNLS vs BVLS

negative legendre terms on the GPU in lieu of BVLS into per_camera_coeff_with_least_square_batch. Moved prior_on_coeffs into zscan.py, called from within per_camera_coeff_with_least_square_batch. Pass prior_sigma, a scalar, insted of the array from fitz to archetypes. Bug fix on case where only some object types, not all, have an archetype.

terms with 0 coefficients and make the coeffs negative for negative terms. Refactored to remove the last bit of the BVLS-NNLS trick from archetypes.py and zfind.py by passing binned instead of tdata to per_camera_coeff_with_least_square_batch.

involve using a list of bands instead of ncam to be more general.

because results were different and timing as well.

…re_nnls_merge_with_main Resolved conflicts

coveralls · 2024-05-21T21:39:59Z

coverage: 38.513% (-0.5%) from 39.022%
when pulling 425d3fc on legendre_nnls_merge_with_main
into ca78381 on main.

craigwarner-ufastro · 2024-05-23T19:53:16Z

@sbailey
Filling in more context, this branch was created to rectify conflicts between legendre_nnls and main that had popped up due to changes to main after branching off legendre_nnls. The changes to the legendre_nnls branch included refactoring code such that everything related to the trick of using negative legendre terms and NNLS instead of BVLS was in per_camera_coeff_with_least_square_batch in zscan.py. Additionally, the output was changed to remove the 0 coefficients in the solution - when we create negative copies of the legendre terms and fit 2*nleg + narch terms, half of the coefficients in the fit are 0 - either the positive or negative version of each legendre term. If the negative legendre term has a non-zero coefficent, we multiply that coefficient by -1. Thus our output is exactly the same as if BVLS had been used.

When this was initially pushed out, there were conflicts, which I created a new branch to resolve.

The addition by Abhijeet of
bands (list): list of wavelength bands

instead of ncam as a parameter to per_camera_coeff_with_least_square_batch.

Moving the prior calculation out of fitz.py and into zscan.py which is where we want it. The best way to proceed moving the legendre-specific manipulations out of archetypes is what I sent you last Monday, that instead of passing tdata to per_camera_coeff_with_least_square_batch, we pass binned to per_camera_coeff_with_least_square_batch and assemble the tdata array there. This consolidates everything related to the BVLS/NNLS trick within that single method per_camera_coeff_with_least_square_batch.
The output COEFF should be collapsed to be the same size as if we had not added the negative legendre terms by removing the zero terms and if a negative term was nonzero, multiply that by -1.E.g. if we have a row [r_i, l1_c1_i, l2_c1_i, -l1_c1_i, -l2_c1_i, l1_c2_i, l2_c2_i, -l1_c2_i, -l2_c2_i, l1_c3_1, l2_c3_i, -l1_c3_i, -l2_c3_i]
Where r_i is the rebinned data for that wavelength and ln_cm is the nth legendre term for the mth camera for wavelength index i and let's say it equals
[1, 2, 0, 0, 3, 0, 0, 4, 5, 6, 7, 0, 0]
then we want that row in the output to be collapsed to
[1, 2, -3, -4, -5, 6, 7]

After attempting to resolve conflicts, and realizing I needed to update to a newer desi environment, everything ran happily but the results did not match those from legendre_nnls previously so I created this new branch to handle the merge.

After a lot of playing around with various branches in various desi environments, I have finally satisfied myself that everything is ok with my branch to merge into main. There were major output differences between running in environment 23.1 vs 24.4 AND major differences between my legendre_nnls and legendre_nnls_merge_with_main branches running under 24.4. I have finally concluded all of these are due to template changes and code changes unrelated to anything in my updates.If I run from the main branch without archetypes, I now get an identical output to legendre_nnls_merge_with_main. If I run with archetypes, I get an np.allclose agreement between them:

cdwarner@nid002624:/global/cfs/cdirs/desi/users/cdwarner/code> python compare_redrock_output.py /pscratch/sd/c/cdwarner/abhijeet-nnls-merge-244.fits /pscratch/sd/c/cdwarner/abhijeet-main-244.fits
col='Z'  ndiff=0  std=0.0
col='ZERR'  ndiff=0  std=0.0
col='CHI2'  ndiff=416  std=1.7036060042842667e-10

FYI here are the differences in my legendre_nnls branch with exact same code and environment 23.1 vs 24.4:

cdwarner@nid002624:/global/cfs/cdirs/desi/users/cdwarner/code> python compare_redrock_output.py /pscratch/sd/c/cdwarner/redrock-nnls-231.fits /pscratch/sd/c/cdwarner/redrock-nnls-244.fits 
col='Z'  ndiff=421  std=2.0492627134013167e-05
col='ZERR'  ndiff=446  std=4.637423724708316e-07
col='CHI2'  ndiff=412  std=42.88757255960495

And even bigger the difference between legendre_nnls and legendre_nnls_merge_with_main

cdwarner@nid002624:/global/cfs/cdirs/desi/users/cdwarner/code> python compare_redrock_output.py /pscratch/sd/c/cdwarner/redrock-nnls-244.fits /pscratch/sd/c/cdwarner/redrock-nnls-merge-244.fits 
col='Z'  ndiff=355  std=0.2950015931780382
col='ZERR'  ndiff=411  std=1.406718431201316e-05
col='CHI2'  ndiff=351  std=5.980265673243861e+83

Keep in mind that this is without archetypes! - so not touching anything I recently did. Clearly one file had a 9e99 flag set in one code version vs the other.With --archetypes,

cdwarner@nid002624:/global/cfs/cdirs/desi/users/cdwarner/code> python compare_redrock_output.py /pscratch/sd/c/cdwarner/abhijeet-nnls-244.fits /pscratch/sd/c/cdwarner/abhijeet-nnls-merge-244.fits 
col='Z'  ndiff=311  std=0.6753668041725044
col='ZERR'  ndiff=362  std=0.00024064833100371678
col='CHI2'  ndiff=447  std=3206.6975053943816

But again all of these differences seem to come from the templates and code independent of anything I did and the main branch seems to agree with my legendre_nnls_merge_with_main branch.

I did notice that there were changes in the templates between the various DESI environments and between the main/merged code versus legendre_nnls in the print statements on reading the templates.

Finally there is still a significant slowdown in the new version in the fine redshift scan that I am investigating that is independent of archetypes and any code I had touched in the legendre_nnls branch that I attribute to a change in templates as a null hypothesis.

craigwarner-ufastro and others added 9 commits February 27, 2024 13:48

Modified archetypes using BVLS to add -1*all legendre terms to the basis

c4132a5

vector (in tdata) so that NNLS can be used instead of BVLS. Fixed bug when ValueError excepted in rebinning

Only use additional negative legendre terms and NNLS when in GPU mode

7911c5b

because on CPU, the additional computation time for the larger array sizes dominates any time savings of NNLS vs BVLS

Merge branch 'main' into legendre_nnls

bbf187b

Refactor to bring more in-line with main where the conflicts mostly

fd76617

involve using a list of bands instead of ncam to be more general.

Created a new branch to handle the merging of legendre_nnls and main

a4a9531

because results were different and timing as well.

Merge branch 'main' of https://github.com/desihub/redrock into legend…

425d3fc

…re_nnls_merge_with_main Resolved conflicts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Legendre nnls merge with main #307

Legendre nnls merge with main #307

craigwarner-ufastro commented May 21, 2024

coveralls commented May 21, 2024

craigwarner-ufastro commented May 23, 2024

Legendre nnls merge with main #307

Are you sure you want to change the base?

Legendre nnls merge with main #307

Conversation

craigwarner-ufastro commented May 21, 2024

coveralls commented May 21, 2024

craigwarner-ufastro commented May 23, 2024