Fixed version parsing in AnacondaChannel #944

sfc-gh-pczajka · 2024-03-27T16:34:45Z

Pre-review checklist

I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
I've added or updated automated unit tests to verify correctness of my new code.
I've added or updated integration tests to verify correctness of my new code.
I've confirmed that my changes are working by executing CLI's commands manually.
I've confirmed that my changes are up-to-date with the target branch.
I've described my changes in the release notes.
I've described my changes in the section below.

Changes description

Change AnacondaChannel to support multiple versions of a package
Use packaging.requirements to check whether a package version fulfills the specs
Move AnacondaChannel tests from test_utils to test_anaconda.py
Fix package comparison in snowpark.common._get_snowflake_packages_delta
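
For reference, a minimal sketch of how a specifier check with the packaging library can look. The function name `is_requirement_satisfied` and the sample data are illustrative assumptions, not the code from this PR.

```python
from packaging.requirements import Requirement
from packaging.version import InvalidVersion, Version


def is_requirement_satisfied(requirement_line: str, available_versions: list[str]) -> bool:
    """Return True if any available version satisfies the requirement's specifiers."""
    requirement = Requirement(requirement_line)
    for raw_version in available_versions:
        try:
            version = Version(raw_version)
        except InvalidVersion:
            # Skip versions such as "9e" that packaging cannot parse.
            continue
        # prereleases=True so that pre-release builds are not silently excluded.
        if requirement.specifier.contains(version, prereleases=True):
            return True
    return False


print(is_requirement_satisfied("numpy>=1.20,<2", ["1.19.5", "1.26.4"]))
print(is_requirement_satisfied("numpy>=2", ["1.19.5", "1.26.4"]))
```

A requirement without any specifier (for example just `numpy`) has an empty specifier set, which matches every parseable version.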

sfc-gh-turbaszek · 2024-03-29T09:30:14Z

src/snowflake/cli/plugins/snowpark/package/anaconda.py

+        self._packages = {
+            _standarize_name(package_name): versions
+            for package_name, versions in packages.items()
+        }


We run a for loop both here and in from_snowflake. I think it would be much better to encapsulate this logic in a single place (probably in from_snowflake). A plain for loop may be better than a comprehension in that case. It would also make the tests much cleaner (although you will need to adjust them).
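
A rough sketch of what "a single place" could look like. Everything here is hypothetical: `Channel` stands in for AnacondaChannel, `from_rows` for from_snowflake, and the normalization rule in `standardize_name` is an assumption, since the body of `_standarize_name` is not shown in this thread.

```python
import re
from collections import defaultdict


def standardize_name(name: str) -> str:
    # Assumed rule: lowercase, and runs of "-", "_" or "." collapse to "-".
    return re.sub(r"[-_.]+", "-", name).lower()


class Channel:
    """Hypothetical stand-in for AnacondaChannel."""

    def __init__(self, packages: dict[str, set[str]]):
        # The constructor trusts that the keys are already normalized.
        self._packages = packages

    @classmethod
    def from_rows(cls, rows: list[tuple[str, str]]) -> "Channel":
        # One loop groups versions per package and normalizes names as it goes.
        packages: dict[str, set[str]] = defaultdict(set)
        for name, version in rows:
            packages[standardize_name(name)].add(version)
        return cls(dict(packages))

    def versions(self, name: str) -> set[str]:
        return self._packages.get(standardize_name(name), set())


channel = Channel.from_rows(
    [("Snowflake_Connector", "1.0"), ("snowflake-connector", "2.0")]
)
print(sorted(channel.versions("snowflake.connector")))
```

With the normalization done only in the factory, tests can build a `Channel` directly from an already-clean dictionary.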

sfc-gh-turbaszek · 2024-03-29T09:30:57Z

src/snowflake/cli/plugins/snowpark/package/anaconda.py

+            return False
+
+    def package_latest_version(self, package: Requirement) -> str | None:
+        package_name = _standarize_name(package.name)


We use this every time we work with a package. Should we consider creating our own custom Requirement that implements this as a property?

Done. I've added a pre-commit hook that checks that the requirements library is not used instead.

Have you pushed the changes?

sfc-gh-turbaszek · 2024-03-29T09:31:31Z

src/snowflake/cli/plugins/snowpark/package/anaconda.py

+            versions = {parse(v) for v in self._packages[package_name]}
+        except InvalidVersion:
+            versions = self._packages[package_name]
+        return str(max(versions))


Are we sure max works properly here?

For parsed versions, yes.
For non-pep508 versions it might not work so well. The Anaconda example was the jpeg package, which was available in versions 9c, 9d and 9e.

As it is only used in the snowpark package lookup, I think a good alternative might be to simply list all possible versions for every package. WDYT?

@sfc-gh-turbaszek changed in 45bd493
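
To illustrate the concern: `max` over raw strings compares character by character, so it only gives the newest version when every version was parsed. Below is a minimal sketch of the "list all versions" alternative, assuming the packaging library; `sorted_versions` is a hypothetical helper, not the code from commit 45bd493.

```python
from packaging.version import InvalidVersion, Version


def sorted_versions(raw_versions: list[str]) -> list[str]:
    """Newest first when every version parses; otherwise descending string order."""
    try:
        parsed = {Version(raw): raw for raw in raw_versions}
    except InvalidVersion:
        # At least one version is unparseable: make no claim about "latest",
        # just return every version in a stable order.
        return sorted(set(raw_versions), reverse=True)
    return [parsed[version] for version in sorted(parsed, reverse=True)]


# Plain string comparison picks the wrong "latest" version.
naive_latest = max(["1.9.0", "1.10.0"])
print(naive_latest)

print(sorted_versions(["1.9.0", "1.10.0", "1.2.0"]))
print(sorted_versions(["9c", "9d", "9e"]))
```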

sfc-gh-turbaszek · 2024-03-29T09:32:00Z

src/snowflake/cli/plugins/snowpark/package_utils.py

+        for line in requirements_file.read_text(
+            file_size_limit_mb=DEFAULT_SIZE_LIMIT_MB
+        ).splitlines():


Isn't there a readlines method?

Not in pathlib - I'd have to convert that back to "with file.open(...)"
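
A small sketch of the two options with plain pathlib. The `file_size_limit_mb` argument in the diff appears to belong to the project's own path wrapper, so it is left out here; the helper names and the sample file are illustrative.

```python
import tempfile
from pathlib import Path


def read_lines(path: Path) -> list[str]:
    # Path has no readlines(); read_text() plus splitlines() drops line endings.
    return path.read_text(encoding="utf-8").splitlines()


def read_lines_via_open(path: Path) -> list[str]:
    # File objects do offer readlines(), but each line keeps its trailing "\n".
    with path.open(encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle.readlines()]


with tempfile.TemporaryDirectory() as tmp:
    requirements = Path(tmp) / "requirements.txt"
    requirements.write_text("numpy\npandas==2.0\n", encoding="utf-8")
    lines_a = read_lines(requirements)
    lines_b = read_lines_via_open(requirements)

print(lines_a == lines_b)
```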

sfc-gh-turbaszek · 2024-03-29T09:37:31Z

src/snowflake/cli/plugins/snowpark/package_utils.py

+            file_size_limit_mb=DEFAULT_SIZE_LIMIT_MB
+        ).splitlines():
+            # remove comments
+            line = line.split("#")[0].strip()


Wouldn't re.sub be a bit cleaner?
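
For comparison, a sketch of both approaches. The regex variant below only treats `#` as a comment marker at the start of the line or after whitespace, which is an assumption about the desired behaviour rather than what the PR implements; both function names are hypothetical.

```python
import re

# "#" starts a comment only at line start or after whitespace (assumed rule).
_COMMENT = re.compile(r"(^|\s+)#.*$")


def strip_comment_split(line: str) -> str:
    # The approach from the diff: cut at the first "#", wherever it is.
    return line.split("#")[0].strip()


def strip_comment_regex(line: str) -> str:
    return _COMMENT.sub("", line).strip()


plain = "numpy==1.26  # pinned"
with_fragment = "pkg @ https://example.com/pkg.zip#sha256=abc"

print(strip_comment_split(plain), "|", strip_comment_regex(plain))
print(strip_comment_split(with_fragment), "|", strip_comment_regex(with_fragment))
```

The two agree on ordinary lines; they differ only when `#` appears inside a token, such as a URL fragment, where the split version truncates the requirement.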

sfc-gh-turbaszek · 2024-03-29T09:38:08Z

src/snowflake/cli/plugins/snowpark/package_utils.py

+            return [
+                Requirement.parse(req).to_name_and_version()
+                for line in f
+                if (req := line.split("#")[0].strip())


Duplicated code fragment, consider introducing a method for cleaning comments.

After the refactor it is used only in snowflake.cli.plugins.snowpark.package_utils.parse_requirements

sfc-gh-turbaszek · 2024-03-29T09:39:08Z

tests/snowpark/test_anaconda.py

+
+
+@patch("snowflake.cli.plugins.snowpark.package.anaconda.requests")
+def test_anaconda_packages_streamlit(mock_requests):


What's the purpose of this test?

¯\_(ツ)_/¯
It was moved over from test_common and looks like a duplicate of test_anaconda_packages. Removed.
My guess is that it was there to check some bugfix.

sfc-gh-turbaszek · 2024-04-04T14:13:11Z

src/snowflake/cli/plugins/snowpark/commands.py


            download_result = package_utils.download_unavailable_packages(
                requirements=requirements,
                target_dir=packages_dir,
                ignore_anaconda=ignore_anaconda,
+                anaconda=anaconda,


Does this method require anaconda itself, or just the packages available in Anaconda? If the latter is true, then we can do something like

    anaconda_packages = [] if ignore_anaconda else AnacondaChannel.from_snowflake().packages
    download_result = package_utils.download_unavailable_packages(anaconda_packages=anaconda_packages, ...)

This way we have a clear interface. WDYT?

Refactored - I've added AnacondaChannel.empty(), which ignores all packages. This way download_unavailable_packages can forget about the ignore_anaconda=true case

Nice and clean 🚀
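
A minimal sketch of the idea behind an empty channel: the caller picks the channel once, and downstream code no longer needs an `ignore_anaconda` flag. `Channel`, `is_available` and `split_requirements` are hypothetical stand-ins, not the PR's actual classes or signatures.

```python
class Channel:
    """Hypothetical stand-in for AnacondaChannel."""

    def __init__(self, packages: dict[str, set[str]]):
        self._packages = packages

    @classmethod
    def empty(cls) -> "Channel":
        # A channel that knows no packages, so nothing is ever "available".
        return cls({})

    def is_available(self, name: str) -> bool:
        return name in self._packages


def split_requirements(names: list[str], channel: Channel) -> tuple[list[str], list[str]]:
    # No ignore_anaconda branch here: an empty channel yields the same effect.
    available = [name for name in names if channel.is_available(name)]
    to_download = [name for name in names if not channel.is_available(name)]
    return available, to_download


ignore_anaconda = True  # illustrative flag value
channel = Channel.empty() if ignore_anaconda else Channel({"numpy": {"1.26.4"}})
print(split_requirements(["numpy", "my-private-lib"], channel))
```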

sfc-gh-turbaszek · 2024-04-04T14:13:29Z

src/snowflake/cli/plugins/snowpark/commands.py

-                    download_result.packages_available_in_anaconda,
+                _write_snowflake_requirements_file(
+                    file_path=snowpark_paths.snowflake_requirements_file,
+                    anaconda=anaconda,


Same here, do we really need anaconda?

Refactored - I moved this to AnacondaChannel

sfc-gh-turbaszek · 2024-04-04T14:14:23Z

src/snowflake/cli/plugins/snowpark/common.py

    imports: List[str],
    stage_artifact_file: str,
 ) -> bool:
    import logging

    log = logging.getLogger(__name__)
    resource_json = _convert_resource_details_to_dict(current_state)
-    anaconda_packages = resource_json["packages"]
+    deployed_packages = resource_json["packages"]
    log.info(
        "Found %s defined Anaconda packages in deployed %s...",


Suggested change

-        "Found %s defined Anaconda packages in deployed %s...",
+        "Found %d defined Anaconda packages in deployed %s...",

A nit that we can fix while we are here.
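
One thing to keep in mind with this suggestion: `%d` only works when the argument is a number, so the call has to pass a count rather than the list itself. The arguments of the real call are not shown in the diff, so the sketch below uses illustrative data and a hypothetical logger name.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # getMessage() applies the %-style arguments to the format string.
        self.messages.append(record.getMessage())


log = logging.getLogger("snowpark.example")  # hypothetical logger name
log.setLevel(logging.INFO)
handler = ListHandler()
log.addHandler(handler)

deployed_packages = ["numpy", "pandas"]  # illustrative data
log.info(
    "Found %d defined Anaconda packages in deployed %s...",
    len(deployed_packages),  # %d needs a number, so pass the count
    "function",
)
print(handler.messages[0])
```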

sfc-gh-turbaszek · 2024-04-04T14:15:11Z

src/snowflake/cli/plugins/snowpark/common.py


-    if updated_package_list:
-        diff = len(updated_package_list) - len(anaconda_packages)
+    if _snowflake_requirements_differ(deployed_packages, packages):


What are the packages? Are those required_packages?

It is the contents of requirements.snowflake.txt, which is the packages argument for create or replace. It is named packages throughout the deploy logic.

Should we name this more appropriately? I think ambiguous names were one of the main problems in this part of the code.

Changed to snowflake_dependencies.
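
As an illustration of why comparing the package lists themselves is more robust than looking at their lengths, here is a hedged sketch. `requirements_differ` is a hypothetical helper and not the PR's `_snowflake_requirements_differ`; the normalization (trimming and lowercasing) is an assumption.

```python
def requirements_differ(deployed: list[str], requested: list[str]) -> bool:
    """Compare two dependency lists, ignoring order, case and surrounding whitespace."""

    def normalize(items: list[str]) -> set[str]:
        return {item.strip().lower() for item in items if item.strip()}

    return normalize(deployed) != normalize(requested)


# Same packages in a different order and spelling: no redeploy needed.
print(requirements_differ(["numpy", "pandas"], ["pandas", "NumPy "]))
# Same length but a different package: a count alone would miss this.
print(requirements_differ(["numpy"], ["pandas"]))
```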

sfc-gh-turbaszek · 2024-04-04T14:16:37Z

src/snowflake/cli/plugins/snowpark/models.py

-    snowflake: List[Requirement]
-    other: List[Requirement]
+    in_snowflake: List[Requirement]
+    unavailable: List[Requirement]


One of the most important changes, love it! 🚀

* fix package availability parsing in Anaconda
* fix non-pep508 version formats
* update release notes
* fix unit tests
* fix dependency format in sql commands
* fix calculating diff between anaconda and requirements
* fix unit tests
* fix unit test
* review fixes part 1
* run pre-commit
* Add pre-commit hook to use our implementation of Requirements
* standarize name while parsing a requirement
* fix unit tests IN PROGRESS
* save anaconda package names in requirements.snowflake.txt
* fix tests
* fix unit tests
* Lookup: return all possible versions if available versions cannot be parsed
* smallfix
* refactor
* refactor anaconda
* refactor
* sort available versions in decreasing order
* refactor ignore_anaconda logic

sfc-gh-pczajka requested a review from a team as a code owner March 27, 2024 16:34

sfc-gh-pczajka changed the title ~~fix package availability parsing in Anaconda~~ Fixed version parsing in AnacondaChannel Mar 27, 2024

sfc-gh-pczajka added 2.2.0 labels Mar 28, 2024

sfc-gh-turbaszek reviewed Mar 29, 2024

sfc-gh-pczajka added 10 commits April 2, 2024 16:06

fix package availability parsing in Anaconda

6299b18

fix non-pep508 version formats

ac0894f

update release notes

cb9457f

fix unit tests

eba4926

fix dependency format in sql commands

0ae2fd1

fix calculating diff between anaconda and requirements

69eae89

fix unit tests

23ebdbe

fix unit test

7f893db

review fixes part 1

e479ab2

run pre-commit

429b2cb

sfc-gh-pczajka force-pushed the SNOW-1250044-fix-anaconda-availability-check branch from 7985773 to 429b2cb Compare April 2, 2024 14:11

sfc-gh-pczajka added 4 commits April 2, 2024 16:33

Add pre-commit hook to use our implementation of Requirements

4e8f374

standarize name while parsing a requirement

9ec3d3a

fix unit tests IN PROGRESS

3df8cd0

Merge branch 'main' into SNOW-1250044-fix-anaconda-availability-check

14cd3ef

sfc-gh-pczajka force-pushed the SNOW-1250044-fix-anaconda-availability-check branch from 5a63679 to 98233fb Compare April 4, 2024 08:20

sfc-gh-pczajka added 2 commits April 4, 2024 14:01

save anaconda package names in requirements.snowflake.txt

0c43754

fix tests

989e81d

sfc-gh-pczajka force-pushed the SNOW-1250044-fix-anaconda-availability-check branch from 98233fb to 989e81d Compare April 4, 2024 12:02

Merge branch 'main' into SNOW-1250044-fix-anaconda-availability-check

0dbae45

sfc-gh-pczajka removed the 2.1.3 label Apr 4, 2024

fix unit tests

1decec9

sfc-gh-pczajka requested a review from sfc-gh-turbaszek April 4, 2024 12:16

sfc-gh-turbaszek reviewed Apr 4, 2024

sfc-gh-pczajka added 7 commits April 4, 2024 16:29

Lookup: return all possible versions if available versions cannot be parsed

45bd493

smallfix

a53b156

Merge branch 'main' into SNOW-1250044-fix-anaconda-availability-check

071c8ec

refactor

0dfe66f

refactor anaconda

83a73f7

refactor

0cd9c13

sort available versions in decreasing order

e3b5a09

sfc-gh-pczajka requested review from sfc-gh-turbaszek April 5, 2024 13:01

sfc-gh-pczajka added 4 commits April 8, 2024 10:28

Merge branch 'main' into SNOW-1250044-fix-anaconda-availability-check

472ea75

refactor ignore_anaconda logic

a1c3ee2

Merge branch 'main' into SNOW-1250044-fix-anaconda-availability-check

0549559

Merge branch 'main' into SNOW-1250044-fix-anaconda-availability-check

b351a3a

sfc-gh-turbaszek approved these changes Apr 8, 2024

sfc-gh-pczajka merged commit 5066f3c into main Apr 8, 2024
11 checks passed

sfc-gh-pczajka deleted the SNOW-1250044-fix-anaconda-availability-check branch April 8, 2024 13:36
