Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/inherit config support #39

Merged
merged 18 commits into from
Feb 29, 2024
Merged

Conversation

maronuu
Copy link
Collaborator

@maronuu maronuu commented Nov 21, 2023

Support a config file generated dynamically.

In a basic usage, we have all config files to be registered to luigi initially when starting docker container from an image, since we usually copy these files to container local.
However, in some advanced cases, we generate additional config files dynamically after starting docker container.
This kind of file is not specified in Dockerfile, so when starting child jobs, they do not have this file and failed when utilizing the dynamically generated config files.
So, this PR adds a feature where the master job passes the config info to children via gokart cache.

Currently, only one file is supported, but in the future we'll support multiple dynamic config files / other format of files.

@maronuu maronuu marked this pull request as ready for review November 21, 2023 10:47
@maronuu maronuu requested a review from Hi-king November 21, 2023 10:48
kannon/master.py Outdated Show resolved Hide resolved
kannon/master.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@yokomotod yokomotod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better if kannon doesn't depend TASK_WORKSPACE_DIRECTORY which is not required env var on gokart.

On the other hand, kannon already has some code based on TASK_WORKSPACE_DIRECTORY so I think it's ok to rely on it for now.

But I would like to consider about workspace_directory design consistency with gokart.

@maronuu
Copy link
Collaborator Author

maronuu commented Feb 26, 2024

We replace TASK_WORKSPACE_DIRECTORY with root_task.workspace_directory constant.

Requiring TASK_WORKSPACE_DIRECTORY is not so good since the kannon library itself limit the name and usage of the env var. This env var has been used to get the path of remote cache like GCS and S3. It can be replaced with root_task.workspace_directory for the same purpose.

Note that users are still required to specify remote workspace directory info via ini config files. When you use some env vars in them, you have to pass them to env_to_inherit.

Copy link
Member

@kitagry kitagry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -56,6 +55,26 @@ def __init__(
self.task_id_to_job_name: dict[str, str] = dict()

def build(self, root_task: gokart.TaskOnKart) -> None:
# TODO: support multiple dynamic config files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you like to support this, I think it is better to change the parameter name plural form ( dynamic_config_paths ) for compatibility. And then you assert like following

assert len(dynamic_config_paths) == 1, 'Currently kannon doesn't support multiple dynamic config files'

@maronuu maronuu merged commit 053f80b into main Feb 29, 2024
5 checks passed
@maronuu maronuu deleted the feature/inherit_config_support branch February 29, 2024 04:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants