Support default values in typing.List[dataclass] and typing.Dict[dataclass] (flyteorg#2603)

* fix: set dataclass member as optional if default value is provided

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* lint

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* feat: handle nested dataclass conversion in JsonParamType

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* fix: handle errors caused by NoneType default value

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* test: add nested dataclass unit tests

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Sagemaker dict determinism (flyteorg#2597)

* truncate sagemaker agent outputs

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix tests and update agent output

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* lint

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix test

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add idempotence token to workflow

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix type

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix mixin

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* modify output handler

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* make the dictionary deterministic

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* nit

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* refactor(core): Enhance return type extraction logic (flyteorg#2598)

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Feat: Make exception raised by external command authenticator more actionable (flyteorg#2594)

Signed-off-by: Fabio Grätz <fabiogratz@googlemail.com>
Co-authored-by: Fabio Grätz <fabiogratz@googlemail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Fix: Properly re-raise non-grpc exceptions during refreshing of proxy-auth credentials in auth interceptor (flyteorg#2591)

Signed-off-by: Fabio Grätz <fabiogratz@googlemail.com>
Co-authored-by: Fabio Grätz <fabiogratz@googlemail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* validate idempotence token length in subsequent tasks (flyteorg#2604)

* validate idempotence token length in subsequent tasks

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove redundant param

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add tests

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Add nvidia-l4 gpu accelerator (flyteorg#2608)

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* eliminate redundant literal conversion for `Iterator[JSON]` type (flyteorg#2602)

* eliminate redundant literal conversion for `Iterator[JSON]` type

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add test

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* lint

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add isclass check

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* [FlyteSchema] Fix numpy problems (flyteorg#2619)

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* add nim plugin (flyteorg#2475)

* add nim plugin

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* move nim to inference

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* import fix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix port

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add pod_template method

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add containers

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* clean up

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove cloud import

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix extra config

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove decorator

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add tests, update readme

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add env

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add support for lora adapter

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* minor fixes

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add startup probe

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* increase failure threshold

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove ngc secret group

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* move plugin to flytekit core

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix docs

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove hf group

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* modify podtemplate import

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix import

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix ngc api key

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix tests

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix formatting

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* lint

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* docs fix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* docs fix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update secrets interface

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add secret prefix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix tests

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add urls

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add urls

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove urls

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* minor modifications

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove secrets prefix; add failure threshold

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add hard-coded prefix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add comment

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* make secrets prefix a required param

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* move nim to flytekit plugin

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update readme

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update readme

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update readme

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* [Elastic/Artifacts] Pass through model card (flyteorg#2575)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Remove pyarrow as a direct dependency (flyteorg#2228)

Signed-off-by: Thomas J. Fan <thomasjpfan@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Boolean flag to show local container logs to the terminal (flyteorg#2521)

Signed-off-by: aditya7302 <aditya7302@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Enable Ray Fast Register (flyteorg#2606)

Signed-off-by: Jan Fiedler <jan@union.ai>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* [Artifacts/Elastic] Skip partitions (flyteorg#2620)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Install flyteidl from master in plugins tests (flyteorg#2621)

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Using ParamSpec to show underlying typehinting (flyteorg#2617)

Signed-off-by: JackUrb <jack@datologyai.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Support ArrayNode mapping over Launch Plans (flyteorg#2480)

* set up array node

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* wip array node task wrapper

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* support function like callability

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* temp check in some progress on python func wrapper

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* only support launch plans in new array node class for now

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* add map task array node implementation wrapper

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* ArrayNode only supports LPs for now

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* support local execute for new array node implementation

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* add local execute unit tests for array node

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* set execution version in array node spec

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* check input types for local execute

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* remove code that is un-needed for now

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* clean up array node class

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* improve naming

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* clean up

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* utilize enum execution mode to set array node execution path

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* default execution mode to FULL_STATE for new array node class

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* support min_successes for new array node

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* add map task wrapper unit test

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* set min successes for array node map task wrapper

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* update docstrings

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* Install flyteidl from master in plugins tests

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* lint

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* clean up min success/ratio setting

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* lint

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* make array node class callable

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

---------

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Richer printing for some artifact objects (flyteorg#2624)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* ci: Add Python 3.9 to build matrix (flyteorg#2622)

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* bump (flyteorg#2627)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Added alt prefix head to FlyteFile.new_remote (flyteorg#2601)

* Added alt prefix head to FlyteFile.new_remote

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

* Added get_new_path method to FileAccessProvider, fixed new_remote method of FlyteFile

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

* Updated tests and added new path creator to FlyteFile/Dir new_remote methods

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

* Improved docstrings, fixed minor path sep bug, more descriptive naming, better test

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

* Formatting

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

---------

Signed-off-by: pryce-turner <pryce.turner@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Feature gate for FlyteMissingReturnValueException (flyteorg#2623)

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Remove use of multiprocessing from the OAuth client (flyteorg#2626)

* Remove use of multiprocessing from the OAuth client

Signed-off-by: Robert Deaton <robert.deaton@freenome.com>

* Lint

Signed-off-by: Robert Deaton <robert.deaton@freenome.com>

---------

Signed-off-by: Robert Deaton <robert.deaton@freenome.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Update codespell in precommit to version 2.3.0 (flyteorg#2630)

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Fix Snowflake Agent Bug (flyteorg#2605)

* fix snowflake agent bug

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* a work version

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Snowflake work version

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* fix secret encode

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* all works, I am so happy

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* improve additional protocol

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* fix tests

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Fix Tests

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* update agent

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add snowflake test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* sd

Signed-off-by: Kevin Su <pingsutw@apache.org>

* snowflake loglinks

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* add metadata

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* secret

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* remove table

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* add comment for get private key

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* update comments:

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Fix Tests

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* update comments

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* update comments

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Better Secrets

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* use union secret

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Update Changes

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* use if not get_plugin().secret_requires_group()

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Use Union SDK

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Update

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Fix Secrets

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Fix Secrets

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* remove package.json

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* lint

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* add snowflake-connector-python

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* fix test_snowflake

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Try to fix tests

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* fix tests

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* Try Fix snowflake Import

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* snowflake test passed

Signed-off-by: Future-Outlier <eric901201@gmail.com>

---------

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* run test_missing_return_value on python 3.10+ (flyteorg#2637)

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* [Elastic] Fix context usage and apply fix to fork method (flyteorg#2628)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Add flytekit-omegaconf plugin (flyteorg#2299)

* add flytekit-hydra

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* fix small typo readme

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* ruff ruff

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* lint more

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* rename plugin into flytekit-omegaconf

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* lint sort imports

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* use flytekit logger

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* use flytekit logger #2

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* fix typing info in is_flatable

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* use default_factory instead of mutable default value

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* add python3.11 and python3.12 to setup.py

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* make fmt

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* define error message only once

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* add docstring

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* remove GenericEnumTransformer and tests

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* fallback to TypeEngine.get_transformer(node_type) to find suitable transformer

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* explicit valueerrors instead of asserts

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* minor style improvements

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* remove obsolete warnings

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* import flytekit logger instead of instantiating our own

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* docstrings in reST format

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* refactor transformer mode

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* improve docs

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* refactor dictconfig class into smaller methods

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* add unit tests for dictconfig transformer

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* refactor of parse_type_description()

Signed-off-by: mg515 <miha.garafolj@gmail.com>

* add omegaconf plugin to pythonbuild.yaml

---------

Signed-off-by: mg515 <miha.garafolj@gmail.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Adds extra-index-url to default image builder (flyteorg#2636)

Signed-off-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* reference_task should inherit from PythonTask (flyteorg#2643)

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Fix Get Agent Secret Using Key (flyteorg#2644)

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>

* fix: prevent converting Flyte types as custom dataclasses

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* fix: add None to output type

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* test: add unit test for nested dataclass inputs

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* test: add unit tests for nested dataclass, dataclass default value as None, and flyte type exceptions

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* fix: handle NoneType as default value of list type dataclass members

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* fix: add comments for `has_nested_dataclass` function

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* fix: make lint

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* fix: update tests regarding input through file and pipe

Signed-off-by: mao3267 <chenvincent610@gmail.com>

* Make JsonParamType convert faster

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* make has_nested_dataclass func more clean and add tests for dataclass_with_optional_fields

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* make logic more backward compatible

Signed-off-by: Future-Outlier <eric901201@gmail.com>

* fix: handle indexing errors in dict/list while checking nested dataclass, add comments

Signed-off-by: mao3267 <chenvincent610@gmail.com>

---------

Signed-off-by: mao3267 <chenvincent610@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Future-Outlier <eric901201@gmail.com>
3 people authored Aug 26, 2024
1 parent 54f0a46 commit 83b90fa
Showing 7 changed files with 315 additions and 4 deletions.
9 changes: 8 additions & 1 deletion flytekit/core/type_engine.py
@@ -360,6 +360,7 @@ def assert_type(self, expected_type: Type[DataClassJsonMixin], v: T):

         expected_type = get_underlying_type(expected_type)
         expected_fields_dict = {}
+
         for f in dataclasses.fields(expected_type):
             expected_fields_dict[f.name] = f.type

@@ -539,11 +540,13 @@ def _get_origin_type_in_annotation(self, python_type: Type[T]) -> Type[T]:
                 field.type = self._get_origin_type_in_annotation(field.type)
         return python_type

-    def _fix_structured_dataset_type(self, python_type: Type[T], python_val: typing.Any) -> T:
+    def _fix_structured_dataset_type(self, python_type: Type[T], python_val: typing.Any) -> T | None:
         # In python 3.7, 3.8, DataclassJson will deserialize Annotated[StructuredDataset, kwtypes(..)] to a dict,
         # so here we convert it back to the Structured Dataset.
         from flytekit.types.structured import StructuredDataset

+        if python_val is None:
+            return python_val
         if python_type == StructuredDataset and type(python_val) == dict:
             return StructuredDataset(**python_val)
         elif get_origin(python_type) is list:
@@ -575,9 +578,13 @@ def _make_dataclass_serializable(self, python_val: T, python_type: Type[T]) -> t
             return self._make_dataclass_serializable(python_val, get_args(python_type)[0])

         if hasattr(python_type, "__origin__") and get_origin(python_type) is list:
+            if python_val is None:
+                return None
             return [self._make_dataclass_serializable(v, get_args(python_type)[0]) for v in cast(list, python_val)]

         if hasattr(python_type, "__origin__") and get_origin(python_type) is dict:
+            if python_val is None:
+                return None
             return {
                 k: self._make_dataclass_serializable(v, get_args(python_type)[1])
                 for k, v in cast(dict, python_val).items()
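The None guards in the hunks above can be illustrated with a small standalone sketch. It does not import flytekit: `make_serializable` and `Config` are hypothetical names, and the traversal is a simplified stand-in for `_make_dataclass_serializable`, assuming a dataclass whose list-typed and dict-typed members default to None.

```python
import dataclasses
import typing
from typing import get_args, get_origin


@dataclasses.dataclass
class Config:
    # Hypothetical dataclass: container-typed members that default to None.
    tags: typing.List[str] = None
    limits: typing.Dict[str, int] = None


def make_serializable(value, python_type):
    """Simplified stand-in (not flytekit's code) for the guarded recursion."""
    origin = get_origin(python_type)
    if origin is list:
        if value is None:  # guard: without it, iterating over None raises TypeError
            return None
        return [make_serializable(v, get_args(python_type)[0]) for v in value]
    if origin is dict:
        if value is None:  # guard: without it, None.items() raises AttributeError
            return None
        return {k: make_serializable(v, get_args(python_type)[1]) for k, v in value.items()}
    if dataclasses.is_dataclass(value):
        # Walk each member using its declared type.
        return {
            f.name: make_serializable(getattr(value, f.name), f.type)
            for f in dataclasses.fields(value)
        }
    return value


serialized_default = make_serializable(Config(), Config)
serialized_filled = make_serializable(Config(tags=["a", "b"], limits={"cpu": 2}), Config)
```

Without the two guards, the default-constructed `Config()` would fail inside the list branch, which appears to be the class of error the "handle NoneType as default value" commits address.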
43 changes: 42 additions & 1 deletion flytekit/interaction/click_types.py
@@ -1,11 +1,12 @@
+import dataclasses
 import datetime
 import enum
 import json
 import logging
 import os
 import pathlib
 import typing
-from typing import cast
+from typing import cast, get_args

 import rich_click as click
 import yaml
@@ -22,6 +23,7 @@
 from flytekit.types.file import FlyteFile
 from flytekit.types.iterator.json_iterator import JSONIteratorTransformer
 from flytekit.types.pickle.pickle import FlytePickleTransformer
+from flytekit.types.schema.types import FlyteSchema


 def is_pydantic_basemodel(python_type: typing.Type) -> bool:
@@ -305,11 +307,50 @@ def convert(
         if value is None:
             raise click.BadParameter("None value cannot be converted to a Json type.")

+        FLYTE_TYPES = [FlyteFile, FlyteDirectory, StructuredDataset, FlyteSchema]
+
+        def has_nested_dataclass(t: typing.Type) -> bool:
+            """
+            Recursively checks whether the given type or its nested types contain any dataclass.
+            This function is typically called with a dictionary or list type and will return True if
+            any of the nested types within the dictionary or list is a dataclass.
+            Note:
+            - A single dataclass will return True.
+            - The function specifically excludes certain Flyte types like FlyteFile, FlyteDirectory,
+              StructuredDataset, and FlyteSchema from being considered as dataclasses. This is because
+              these types are handled separately by Flyte and do not need to be converted to dataclasses.
+            Args:
+                t (typing.Type): The type to check for nested dataclasses.
+            Returns:
+                bool: True if the type or its nested types contain a dataclass, False otherwise.
+            """
+
+            if dataclasses.is_dataclass(t):
+                # FlyteTypes is not supported now, we can support it in the future.
+                return t not in FLYTE_TYPES
+
+            return any(has_nested_dataclass(arg) for arg in get_args(t))
+
         parsed_value = self._parse(value, param)

         # We compare the origin type because the json parsed value for list or dict is always a list or dict without
         # the covariant type information.
         if type(parsed_value) == typing.get_origin(self._python_type) or type(parsed_value) == self._python_type:
+            # Indexing the return value of get_args will raise an error for native dict and list types.
+            # We don't support native list/dict types with nested dataclasses.
+            if get_args(self._python_type) == ():
+                return parsed_value
+            elif isinstance(parsed_value, list) and has_nested_dataclass(get_args(self._python_type)[0]):
+                j = JsonParamType(get_args(self._python_type)[0])
+                return [j.convert(v, param, ctx) for v in parsed_value]
+            elif isinstance(parsed_value, dict) and has_nested_dataclass(get_args(self._python_type)[1]):
+                j = JsonParamType(get_args(self._python_type)[1])
+                return {k: j.convert(v, param, ctx) for k, v in parsed_value.items()}
+
             return parsed_value

         if is_pydantic_basemodel(self._python_type):
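The recursive check in `has_nested_dataclass` can be tried outside flytekit with a minimal sketch. `Point` and `ExcludedType` are hypothetical names; `ExcludedType` stands in for the Flyte types (FlyteFile, FlyteDirectory, StructuredDataset, FlyteSchema) that the commit excludes, on the assumption that those are themselves dataclasses.

```python
import dataclasses
import typing
from typing import get_args


@dataclasses.dataclass
class Point:
    x: int
    y: int


@dataclasses.dataclass
class ExcludedType:
    # Hypothetical stand-in for a Flyte type such as FlyteFile.
    path: str


EXCLUDED_TYPES = [ExcludedType]


def has_nested_dataclass(t) -> bool:
    """Same shape as the function in the diff, with a stand-in exclusion list."""
    if dataclasses.is_dataclass(t):
        return t not in EXCLUDED_TYPES
    # get_args returns () for non-generic types, so the recursion bottoms out there.
    return any(has_nested_dataclass(arg) for arg in get_args(t))


checks = {
    "List[Point]": has_nested_dataclass(typing.List[Point]),
    "Dict[str, List[Point]]": has_nested_dataclass(typing.Dict[str, typing.List[Point]]),
    "Point": has_nested_dataclass(Point),
    "List[int]": has_nested_dataclass(typing.List[int]),
    "List[ExcludedType]": has_nested_dataclass(typing.List[ExcludedType]),
    "bare list": has_nested_dataclass(list),
}
```

The "bare list" case returns False because `get_args(list)` is an empty tuple. That is also why `convert` checks `get_args(self._python_type) == ()` before indexing into the type arguments.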
3 changes: 3 additions & 0 deletions tests/flytekit/unit/cli/pyflyte/my_wf_input.json
@@ -42,6 +42,9 @@
   },
   "p": "None",
   "q": "tests/flytekit/unit/cli/pyflyte/testdata",
+  "r": [{"i": 1, "a": ["h", "e"]}],
+  "s": {"x": {"i": 1, "a": ["h", "e"]}},
+  "t": {"i": [{"i":1,"a":["h","e"]}]},
   "remote": "tests/flytekit/unit/cli/pyflyte/testdata",
   "image": "tests/flytekit/unit/cli/pyflyte/testdata"
}
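As a rough illustration of what the new `r`, `s` and `t` inputs are expected to become after conversion, the sketch below rebuilds dataclass instances from `json.loads` output. `from_parsed` is a hypothetical helper written for this example only; it is not how `JsonParamType` is implemented, and the dataclasses here omit `DataClassJsonMixin` so the block runs with the standard library alone.

```python
import dataclasses
import json
import typing
from typing import get_args, get_origin


@dataclasses.dataclass
class MyDataclass:
    i: int
    a: typing.List[str]


@dataclasses.dataclass
class NestedDataclass:
    i: typing.List[MyDataclass]


def from_parsed(value, python_type):
    """Hypothetical helper: rebuild typed objects from plain JSON-parsed values."""
    if dataclasses.is_dataclass(python_type):
        kwargs = {
            f.name: from_parsed(value[f.name], f.type)
            for f in dataclasses.fields(python_type)
        }
        return python_type(**kwargs)
    origin, args = get_origin(python_type), get_args(python_type)
    if origin is list and args:
        return [from_parsed(v, args[0]) for v in value]
    if origin is dict and args:
        return {k: from_parsed(v, args[1]) for k, v in value.items()}
    return value  # primitives pass through unchanged


r = from_parsed(json.loads('[{"i": 1, "a": ["h", "e"]}]'), typing.List[MyDataclass])
s = from_parsed(json.loads('{"x": {"i": 1, "a": ["h", "e"]}}'), typing.Dict[str, MyDataclass])
t = from_parsed(json.loads('{"i": [{"i": 1, "a": ["h", "e"]}]}'), NestedDataclass)
```

The three JSON strings are the same values the commit adds to my_wf_input.json, and the target types match the `r`, `s` and `t` parameters added to the test workflow.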
17 changes: 17 additions & 0 deletions tests/flytekit/unit/cli/pyflyte/my_wf_input.yaml
@@ -30,5 +30,22 @@ o:
     - tests/flytekit/unit/cli/pyflyte/testdata/df.parquet
 p: 'None'
 q: tests/flytekit/unit/cli/pyflyte/testdata
+r:
+  - i: 1
+    a:
+      - h
+      - e
+s:
+  x:
+    i: 1
+    a:
+      - h
+      - e
+t:
+  i:
+    - i: 1
+      a:
+        - h
+        - e
 remote: tests/flytekit/unit/cli/pyflyte/testdata
 image: tests/flytekit/unit/cli/pyflyte/testdata
6 changes: 6 additions & 0 deletions tests/flytekit/unit/cli/pyflyte/test_run.py
@@ -201,6 +201,12 @@ def test_pyflyte_run_cli(workflow_file):
             "Any",
             "--q",
             DIR_NAME,
+            "--r",
+            json.dumps([{"i": 1, "a": ["h", "e"]}]),
+            "--s",
+            json.dumps({"x": {"i": 1, "a": ["h", "e"]}}),
+            "--t",
+            json.dumps({"i": [{"i":1,"a":["h","e"]}]}),
         ],
         catch_exceptions=False,
     )
13 changes: 11 additions & 2 deletions tests/flytekit/unit/cli/pyflyte/workflow.py
@@ -35,6 +35,9 @@ class MyDataclass(DataClassJsonMixin):
     i: int
     a: typing.List[str]

+@dataclass
+class NestedDataclass(DataClassJsonMixin):
+    i: typing.List[MyDataclass]

 class Color(enum.Enum):
     RED = "RED"
@@ -61,8 +64,11 @@ def print_all(
     o: typing.Dict[str, typing.List[FlyteFile]],
     p: typing.Any,
     q: FlyteDirectory,
+    r: typing.List[MyDataclass],
+    s: typing.Dict[str, MyDataclass],
+    t: NestedDataclass,
 ):
-    print(f"{a}, {b}, {c}, {d}, {e}, {f}, {g}, {h}, {i}, {j}, {k}, {l}, {m}, {n}, {o}, {p}, {q}")
+    print(f"{a}, {b}, {c}, {d}, {e}, {f}, {g}, {h}, {i}, {j}, {k}, {l}, {m}, {n}, {o}, {p}, {q}, {r}, {s}, {t}")


 @task
@@ -93,14 +99,17 @@ def my_wf(
     o: typing.Dict[str, typing.List[FlyteFile]],
     p: typing.Any,
     q: FlyteDirectory,
+    r: typing.List[MyDataclass],
+    s: typing.Dict[str, MyDataclass],
+    t: NestedDataclass,
     remote: pd.DataFrame,
     image: StructuredDataset,
     m: dict = {"hello": "world"},
 ) -> Annotated[StructuredDataset, subset_cols]:
     x = get_subset_df(df=remote)  # noqa: shown for demonstration; users should use the same types between tasks
     show_sd(in_sd=x)
     show_sd(in_sd=image)
-    print_all(a=a, b=b, c=c, d=d, e=e, f=f, g=g, h=h, i=i, j=j, k=k, l=l, m=m, n=n, o=o, p=p, q=q)
+    print_all(a=a, b=b, c=c, d=d, e=e, f=f, g=g, h=h, i=i, j=j, k=k, l=l, m=m, n=n, o=o, p=p, q=q, r=r, s=s, t=t)
     return x
