Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement spectrum copy(), *, *= methods, --scale command-line option #285

Merged
merged 11 commits into from
Sep 1, 2023

Conversation

ajjackson
Copy link
Collaborator

@ajjackson ajjackson commented Sep 1, 2023

Closes #219
In order to make the --scale option beautiful, tweak the Python API so that any Spectrum can by multiplied by a scalar Real using * or *=. (These are implemented with "dunder" methods __mul__ and __imul__ respectively.)

In practice we will usually use __imul__, so makes sense to define __mul__ using it, creating a fresh Spectrum to write into. In principle it would be more efficient to create one with an "empty" data array, but as there is lots of code making modified array copies, life is a lot simpler with a consistent copy() method used as a base for this stuff. If copying the data becomes a noticeable bottleneck we could add an empty=False argument that allows us to construct with e.g. y_data=np.empty_like(self.y_data)*ureg(self.y_data_unit), but for now assume YAGNI!

With the new Python/numpy/Pint versions we can use np.copy() directly on an array Quantity, which reduces unit-shuffling boilerplate. 😄

To-do

  • Refactor spectra.py to make full use of Spectrum.copy()
  • Unit-test Spectrum.copy()
  • Unit-test *, *=
  • Implement --scale argument
  • Script-test --scale argument
  • Update CHANGELOG

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Test Results

       22 files  ±  0         22 suites  ±0   55m 23s ⏱️ + 1m 56s
  1 049 tests +  9    1 043 ✔️ +  9    6 💤 ±0  0 ±0 
10 430 runs  +90  10 367 ✔️ +90  63 💤 ±0  0 ±0 

Results for commit 23c0ae3. ± Comparison against base commit cfb9611.

♻️ This comment has been updated with latest results.

@codecov-commenter
Copy link

codecov-commenter commented Sep 1, 2023

Codecov Report

Merging #285 (23c0ae3) into master (cfb9611) will increase coverage by 0.01%.
The diff coverage is 97.72%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the GitHub App Integration for your organization. Read more.

@@            Coverage Diff             @@
##           master     #285      +/-   ##
==========================================
+ Coverage   95.78%   95.79%   +0.01%     
==========================================
  Files          28       28              
  Lines        3958     3992      +34     
  Branches      798      803       +5     
==========================================
+ Hits         3791     3824      +33     
- Misses         98       99       +1     
  Partials       69       69              
Files Changed Coverage Δ
euphonic/spectra.py 95.16% <97.22%> (+0.04%) ⬆️
euphonic/cli/dos.py 89.33% <100.00%> (+0.29%) ⬆️
euphonic/cli/intensity_map.py 96.72% <100.00%> (+0.11%) ⬆️
euphonic/cli/powder_map.py 92.05% <100.00%> (+0.10%) ⬆️
euphonic/cli/utils.py 91.15% <100.00%> (+0.05%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ajjackson
Copy link
Collaborator Author

API wrangling paid off, 11ab6cc was a very satisfying little commit to write 😅

return Spectrum1D(self.x_data, summed_y_data,
x_tick_labels=self.x_tick_labels,
metadata=metadata)
return Spectrum1D(np.copy(self.x_data),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also want to be copyed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A copy of Spectrum1DCollection would be a Spectrum1DCollection, but this method returns a Spectrum1D.

In principle we could make a copy from e.g. the first element of the collection, but then the constructor is getting called once to make the element from the y_data and again to copy it.

It feels "different enough" to justify breaking the pattern, but I don't have a strong opinion on whether it is better to do it one way or the other.

with pytest.raises(AssertionError):
check_spectrum1d(spec, spec * 2.)

def test_copy(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that wants to be fixtured?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you have a particular kind of fixture in mind?

I don't see an elegant way to parametrize all of these different operations, but pytest has a lot of features...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, the awkward, ugly way to do it would be a list of tuples of ("attr", operation) and a loop through them, but that gets pretty ugly pretty quickly and cleanliness here is key.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could maybe define the check as a kind of teardown operation and make lots of little tests, but that's pesky in its own way.

I could tidy this a bit by defining an assert_different function that hides the pytest.raises?

Copy link
Collaborator Author

@ajjackson ajjackson Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g.

    def test_copy(self):
        def assert_different(spectrum, other) -> None:
            with pytest.raises(AssertionError):
                check_spectrum1dcollection(spectrum, other)
            
        spec = get_spectrum1dcollection('gan_bands.json')
        spec.metadata = {'Test': 'item', 'int': 1}

        spec_copy = spec.copy()
        # Copy should be same
        check_spectrum1dcollection(spec, spec_copy)

        # Until data is edited
        spec_copy._y_data *= 2
        assert_different(spec, spec_copy)

        spec_copy = spec.copy()
        spec_copy._x_data *= 2
        assert_different(spec, spec_copy)

        spec_copy = spec.copy()
        spec_copy.x_tick_labels = [(1, 'different')]
        assert_different(spec, spec_copy)

        spec_copy = spec.copy()
        spec_copy.metadata['Test'] = spec_copy.metadata['Test'].upper()
        assert_different(spec, spec_copy)

@ajjackson ajjackson marked this pull request as ready for review September 1, 2023 16:26
In practice we will usually want __imul__, so makes sense to define
__mul__ using it. This requires a fresh Spectrum to write into. In
principle it would be more efficient to create one with an
"empty" data array...

... but as there is lots of code making modified array copies, life is
a lot simpler with a consistent copy() method used as a base for this
stuff. If copying the data becomes a noticeable bottleneck we could
add an `empty=False` argument that allows us to construct with
e.g. `y_data=np.empty_like(self.y_data)*ureg(self.y_data_unit)`, but
for now assume YAGNI!

With the new Python/numpy/Pint versions we can use np.copy() directly
on an array Quantity, which reduces unit-shuffling boilerplate.
Shallow copies may lead to surprises when elements are mutated;
_usually_ it will contain literals (and the type-hinting says it is
_supposed_ to only take literals) but a user might put something
mutable in there.
@ajjackson
Copy link
Collaborator Author

Rebasing onto the updated tests, not expecting drama but would be better to find out here.

@ajjackson ajjackson merged commit 7042114 into master Sep 1, 2023
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Arbitrary scale factors
3 participants