Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce package distribution size #1769

Merged
merged 91 commits into from
Jun 12, 2024
Merged

Reduce package distribution size #1769

merged 91 commits into from
Jun 12, 2024

Conversation

kbwestfall
Copy link
Collaborator

The primary goal of this PR is to reduce the size of the package distribution installed using pip. To do so, I

  1. refactored @tbowers7 cache system, largely to make the code more general,
  2. added more data files to the list being controlled by the cache
  3. removed the doc/ directory from the distribution

For item 1, I added the pypeitdata.py and cache.py files, which are refactored pieces of what was data/utils.py, and then I moved some of the load_* functions from data/utils.py into io.py. With this refactor, I've removed the pypeit.data module and replaced it with the pypeit.dataPaths instance; see the changes to the main __init__.py file. If this is accepted, I'd like to make the data/ directory exclusively for data files; i.e., it should not have any code in it.

After some testing, this reduces the size of the tar file from ~145MB for version 1.15.0 to about 12MB. The big ticket items were the test data files, some of the standards, and doc figures directory.

There's no hurry on this one, but I wanted to issue this so that everyone can start to comment.

@kbwestfall
Copy link
Collaborator Author

Test "pass". All of the failures are in NIRSPEC, presumably because of recent changes to the raw data available. I'll re-run after the NIRSPEC branch is merged.

Test Summary
--------------------------------------------------------
--- PYTEST PYPEIT UNIT TESTS PASSED  257 passed, 56 warnings in 399.69s (0:06:39) ---
--- PYTEST UNIT TESTS FAILED  3 failed, 130 passed, 171 warnings in 937.68s (0:15:37) ---
--- PYTEST VET TESTS PASSED  61 passed, 73 warnings in 5528.28s (1:32:08) ---
--- PYPEIT DEVELOPMENT SUITE FAILED 1/237 TESTS  ---
Failed tests:
    keck_nirspec/LOW_NIRSPEC-1 pypeit
Skipped tests:
Testing Started at 2024-03-07T21:33:16.697132
Testing Completed at 2024-03-08T10:08:43.978332
Total Time: 12:35:27.281200

Copy link
Collaborator

@rcooke-ast rcooke-ast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is impressive @kbwestfall - job very well done! I only have a few minor points, but the new approach is a lot better than before. I also have one general question... If a user updates their version of PypeIt, will this require them to re-download all data files when needed (or, only if those data files have been updated)? It just wasn't clear to me how any updated (i.e. changed from one release to the next release) data files are being managed. Thanks again!

I have no major issues with this PR, so I'm approving.

doc/scripts/build_cache_data_tbl.py Show resolved Hide resolved
doc/scripts/make_example_files.py Show resolved Hide resolved
pypeit/core/wavecal/waveio.py Show resolved Hide resolved
pypeit/io.py Show resolved Hide resolved
pypeit/scripts/cache_github_data.py Show resolved Hide resolved
pypeit/scripts/install_ql_calibs.py Show resolved Hide resolved
pypeit/spectrographs/vlt_xshooter.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@tbowers7 tbowers7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks for addressing the Windows path questions.

doc/installing.rst Outdated Show resolved Hide resolved
doc/installing.rst Show resolved Hide resolved
doc/scripts/build_cache_data_tbl.py Show resolved Hide resolved
doc/scripts/make_example_files.py Show resolved Hide resolved
doc/scripts/make_example_files.py Show resolved Hide resolved
pypeit/cache.py Show resolved Hide resolved
pypeit/pypeitdata.py Show resolved Hide resolved
pypeit/pypeitdata.py Outdated Show resolved Hide resolved
@kbwestfall
Copy link
Collaborator Author

There is a test failure, but it's because of a recent change to the files in the relevant Keck/HIRES dataset. The vet-test failure is related. So I think this one is ready to go.

Test Summary
--------------------------------------------------------
--- PYTEST PYPEIT UNIT TESTS PASSED  256 passed, 56 warnings in 431.72s (0:07:11) ---
--- PYTEST UNIT TESTS PASSED  148 passed, 167 warnings in 1038.48s (0:17:18) ---
--- PYTEST VET TESTS FAILED  1 failed, 60 passed, 96 warnings in 5677.00s (1:34:37) ---
--- PYPEIT DEVELOPMENT SUITE FAILED 1/239 TESTS  ---
Failed tests:
    keck_hires/J1723+2243_W241_RED_C5_ECH_-0.15_XD_0.90_2x2 pypeit
Skipped tests:
Testing Started at 2024-06-12T01:28:58.711750
Testing Completed at 2024-06-12T14:16:18.163584
Total Time: 12:47:19.451834

@kbwestfall kbwestfall merged commit 76daddb into develop Jun 12, 2024
23 checks passed
@kbwestfall kbwestfall deleted the dirs branch June 12, 2024 16:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants