-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed version parsing in AnacondaChannel #944
Fixed version parsing in AnacondaChannel #944
Conversation
self._packages = { | ||
_standarize_name(package_name): versions | ||
for package_name, versions in packages.items() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are performing for
loop here and from_snowflake
I think it would be much better to encapsulate this logic in a single place (probably in from_snowflake
). Using for
loop instead of comprehensions may be better in such case. It will also make tests much cleaner (although you need to adjust them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
return False | ||
|
||
def package_latest_version(self, package: Requirement) -> str | None: | ||
package_name = _standarize_name(package.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using this every time we work with package. Should we consider creating our own custom Requirement
that would implement this as property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I've added pre-commit hook checking whether requirements
library is not used instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you pushed the changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done :)
versions = {parse(v) for v in self._packages[package_name]} | ||
except InvalidVersion: | ||
versions = self._packages[package_name] | ||
return str(max(versions)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure the max work properly here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For parsed versions - yes
For non-pep508 - might work not so good. The Anaconda example was jpeg
package, which was available in versions 9c
, 9d
and 9e
.
As it is only used in in snowpark package lookup
, I think good alternative might be simply list all possible versions for every package, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sfc-gh-turbaszek changed in 45bd493
for line in requirements_file.read_text( | ||
file_size_limit_mb=DEFAULT_SIZE_LIMIT_MB | ||
).splitlines(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there readlines
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in pathlib - I'd have to convert that back to "with file.open(...)"
file_size_limit_mb=DEFAULT_SIZE_LIMIT_MB | ||
).splitlines(): | ||
# remove comments | ||
line = line.split("#")[0].strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't re.sub
be a bit cleaner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
return [ | ||
Requirement.parse(req).to_name_and_version() | ||
for line in f | ||
if (req := line.split("#")[0].strip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated code fragment, consider introducing a method for cleaning comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After refactor it is used only in snowflake.cli.plugins.snowpark.package_utils.parse_requirements
tests/snowpark/test_anaconda.py
Outdated
|
||
|
||
@patch("snowflake.cli.plugins.snowpark.package.anaconda.requests") | ||
def test_anaconda_packages_streamlit(mock_requests): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
¯\(ツ)/¯
I got it moved from test_common
, it looks like a duplicate of test_anaconda_packages
. Removed
My guess is that it was checking out some bugfix
7985773
to
429b2cb
Compare
5a63679
to
98233fb
Compare
98233fb
to
989e81d
Compare
|
||
download_result = package_utils.download_unavailable_packages( | ||
requirements=requirements, | ||
target_dir=packages_dir, | ||
ignore_anaconda=ignore_anaconda, | ||
anaconda=anaconda, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this method requires anaconda or packages available in anaconda? If the latter is true the we can do something like
anaconda_packages = [] if ignore_anaconda else AnacondaChannel.from_snowflake().packages
download_result = package_utils.download_unavailable_packages(anaconda_packages=anaconda_packages, ...)
In this way we have clear interface. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored - I've added AnacondaChannel.empty()
, which ignores all packages. This way download_unavailable_packages
can forget about ignore_anacoda=true
case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice and clean 🚀
download_result.packages_available_in_anaconda, | ||
_write_snowflake_requirements_file( | ||
file_path=snowpark_paths.snowflake_requirements_file, | ||
anaconda=anaconda, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, do we really need anaconda?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored - I moved this to AnacondaChannel
imports: List[str], | ||
stage_artifact_file: str, | ||
) -> bool: | ||
import logging | ||
|
||
log = logging.getLogger(__name__) | ||
resource_json = _convert_resource_details_to_dict(current_state) | ||
anaconda_packages = resource_json["packages"] | ||
deployed_packages = resource_json["packages"] | ||
log.info( | ||
"Found %s defined Anaconda packages in deployed %s...", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Found %s defined Anaconda packages in deployed %s...", | |
"Found %d defined Anaconda packages in deployed %s...", |
Nit which we can fix using this occasion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
if updated_package_list: | ||
diff = len(updated_package_list) - len(anaconda_packages) | ||
if _snowflake_requirements_differ(deployed_packages, packages): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the packages
? Are those required_packages
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is contents of requirements.snowflake.txt
, which is packages
argument for create or replace
. It is named packages
through the whole deploy
logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we name this more appropriately? I think ambiguous names were one of the main problems of this part of code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to snowflake_dependencies
snowflake: List[Requirement] | ||
other: List[Requirement] | ||
in_snowflake: List[Requirement] | ||
unavailable: List[Requirement] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the most important changes, love it! 🚀
* fix package availability parsing in Anaconda * fix non-pep508 version formats * update release notes * fix unit tests * fix dependency format in sql commands * fix calculating diff between anaconda and requirements * fix unit tests * fix unit test * review fixes part 1 * run pre-commit * Add pre-commit hook to use our implementation of Requirements * standarize name while parsing a requirement * fix unit tests IN PROGRESS * save anaconda package names in requirements.snowflake.txt * fix tests * fix unit tests * Lookup: return all possible versions if available versions cannot be parsed * smallfix * refactor * refactor anaconda * refactor * sort available versions in decreasing order * refactor ignore_anaconda logic
Pre-review checklist
Changes description
packaging.requirements
to check whether package version fulfills specstest_anaconda.py
instead oftest_utils
snowpark.common. _get_snowflake_packages_delta