Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snow 1260461 refactor snowpark build #937

Merged
merged 68 commits into from
Apr 2, 2024

Conversation

sfc-gh-pczajka
Copy link
Contributor

@sfc-gh-pczajka sfc-gh-pczajka commented Mar 26, 2024

Pre-review checklist

  • I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
  • I've added or updated automated unit tests to verify correctness of my new code.
  • I've added or updated integration tests to verify correctness of my new code.
  • I've confirmed that my changes are working by executing CLI's commands manually.
  • I've confirmed that my changes are up-to-date with the target branch.
  • I've described my changes in the release notes.
  • I've described my changes in the section below.

Changes description

Cient-facing changes:

  • --pypi-download flag have no effect and will cause a warning. Create command always check against PyPi.
  • --check-anaconda-for-pypi-depts is replaced with --ignore-anaconda
  • --package-native-libraries is replaced with boolean flag --allow-shared-libraries
  • new --skip-version-check skips comparing versions of dependencies between requirements and Anaconda.
  • new --index-url flag sets up Base URL of the Python Package Index to use for package lookup.
  • more clear error messages

Major code changes:

  • refactor snow snowpark build command implementation, getting rid of snowpark_package function, adjusting flags according to https://docs.google.com/document/d/1yMaD1A5UUQCdeH6p1-IRaYsFTXZFArFAQEkD81Kd57o/edit#heading=h.4fts1bfc5rlo
  • refactor package_utils.download_packages:
    • more readable output format
    • structured logic
    • getting rid of hardcoding ".package" directory
    • reword error messages
    • changing input to List[Requirement] only -> requirements.other.txt is no longer needed
    • moving pre-check with Anaconda from commands to shared helper
  • adding more tests for snow snowpark build
  • remove dedup_and_sort_reqs, as it started conflicting with other formats of requirements (for example github url)

Base automatically changed from SNOW-1232599-refactor-package-create to main March 27, 2024 13:57
ignore_anaconda=ignore_anaconda,
skip_version_check=skip_version_check,
pip_index_url=index_url,
allow_shared_libraries=allow_shared_libraries_yesnoask,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that native libs checks happens later on, why download requires the argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Propably a leftover after merge - nice catch!

Comment on lines 355 to 360
allow_shared_libraries_yesnoask = {
True: YesNoAsk.YES,
False: YesNoAsk.NO,
}[allow_shared_libraries]
if deprecated_package_native_libraries != YesNoAsk.NO:
allow_shared_libraries_yesnoask = deprecated_package_native_libraries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to remove this piece of code by using or in if

if package_utils.detect_and_log_shared_libraries(
                download_result.downloaded_packages_details
            )
     if not (allow_shared_libraries or resolve_allow_shared_libraries_yes_no_ask(...)):
         raise ClickException(
                    "Some packages contain shared (.so/.dll) libraries. "
                    "Try again with --allow-shared-libraries."
                )

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - I decided to merge them as they were passed as a single argument to the download_unavailable_packages function, but that's not a case anymore

Comment on lines 202 to 215
# The package is not in anaconda, so we have to pack it
log.info("Checking to see if packages have shared (.so/.dll) libraries...")
if detect_and_log_shared_libraries(download_result.downloaded_packages_details):
if not resolve_allow_shared_libraries_yes_no_ask(
allow_shared_libraries_yesnoask
):
raise ClickException(
"Some packages contain shared (.so/.dll) libraries. "
"Try again with --allow-shared-libraries."
)

zip_file = f"{get_package_name(name)}.zip"
zip_dir(dest_zip=Path(zip_file), source=packages_dir.path)
message = dedent(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in the other thread we should split download from checking for native libraires. In this way code is easier to compose and test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already a separate part of command code (or I don't see something :D)

Copy link

gitguardian bot commented Mar 29, 2024

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
- Generic Password 41d3f90 tests/test_connection.py View secret
🛠 Guidelines to remediate hardcoded secrets

The above secret(s) have been detected in your PR. Please take an appropriate action for each secret:

  • If it’s a true positive, remove the secret from source code, revoke it and migrate to a secure way of storing and accessing secrets (see http://go/secrets-and-code). Once that’s done, go to the incidents page linked in the “GitGuardian id” column (log in using SnowBiz Okta) and resolve the incident.
  • If it’s a false positive, go to the incidents page linked in the “GitGuardian id” column (log in using SnowBiz Okta) and ignore the incident.
  • If you didn’t add this secret (and only then), you can skip this check manually.

Note:

  • A secret is considered leaked from the moment it touches GitHub. Rewriting git history by force pushing or other means is not necessary and doesn’t change the fact that the secret has to be revoked.
  • This check has a “Skip: false positive” button. Don’t use it. It will mark all detected secrets as false positives but only in the context of this specific run - it won’t remember this action in subsequent check runs.

If you encounter any problems you can reach out to us on Slack: #gitguardian-secret-scanning-help


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Our GitHub checks need improvements? Share your feedbacks!

sfc-gh-pczajka and others added 2 commits March 29, 2024 15:13
Co-authored-by: Tomasz Urbaszek <tomasz.urbaszek@snowflake.com>
@sfc-gh-pczajka sfc-gh-pczajka merged commit 9f1e792 into main Apr 2, 2024
11 checks passed
@sfc-gh-pczajka sfc-gh-pczajka deleted the SNOW-1260461-refactor-snowpark-build branch April 2, 2024 13:27
sfc-gh-sichen pushed a commit that referenced this pull request Oct 17, 2024
* Add ignore-anaconda flag

* rewrite download_packages to wheels

* refactor

* parse requirements from wheel files

* Read name from wheel metadata

* use wheel-convention of package name to lookup metadata

* use pip wheel instead of pip download

* fix tests

* fix user messages

* deprecate --pypi-download

* fix snowpark flow

* add --index-url flag

* add --skip-version-check flag

* add --allow-shared-libraries flag

* refactor package create command

* refactor

* fix integration tests

* fix; fix unit tests

* update release notes

* fix test_utils.py

* fix snowpark build

* (probably) fix windows integration tests

* remove get_anaconda_from_snowflake

* convert --allow-sharted-libraries flag to boolean

* add backwards compatibility support

* deprecate --pypi-download flag

* rename --check-anaconda-for-pypi-depts to --ignore-anaconda

* refactor snow snowpark build

* remove package argument from download_packages

* Update src/snowflake/cli/plugins/snowpark/package_utils.py

Co-authored-by: Tomasz Urbaszek <tomasz.urbaszek@snowflake.com>

* review fixes

* replace --package-native-libraries with --allow-shared-libraries

* add --skip-version-check and --index-url flags

* remove snowpark_package

* debug IN PROGRESS

* Add tests for snowpark build

* turn off anaconda check + add donwload mock to quicken unit tests

* remove dedup_and_sort_requirements

* get rid of requirements.other.txt

* refactor download_packages

* get rid of .packages directory

* fix tests

* update release notes

* update release notes

* fix tests

* review fixes

* update release notes

* (hopefully) fix integration test for Windows

* review fixes

* check whether Windows shared libraries detection works

* Adjust error messages; use package numpy with --ignore-anaconda flag in tests

* Update src/snowflake/cli/plugins/snowpark/commands.py

Co-authored-by: Tomasz Urbaszek <tomasz.urbaszek@snowflake.com>

* review fixes

* fix unit tests

* style fix

---------

Co-authored-by: Tomasz Urbaszek <tomasz.urbaszek@snowflake.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants