adjust COPY behavior for files and directories to match desired struc… #3047

Closed
wants to merge 3 commits into from

samhita-alla
Contributor

@samhita-alla samhita-alla commented Jan 10, 2025

Tracking issue

Why are the changes needed?

This update modifies the COPY logic to ensure files and directories are copied directly to /root/ without preserving the entire source path structure.

What changes were proposed in this pull request?

  • Ensures files are copied directly to /root/ without preserving the full source path structure.
  • If a directory is copied, its contents are placed in /root/<directory name>/, keeping only the last path component instead of the full source path.

Generated COPY commands:

COPY --chown=flytekit services/AgentService/main.py /root/
COPY --chown=flytekit common/utils.py /root/
COPY --chown=flytekit configs/ /root/configs/
COPY --chown=flytekit services/configs /root/configs/
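
The mapping described above can be sketched as follows. This is a minimal illustration only, assuming the caller already knows whether each source is a directory; `build_copy_commands` is a hypothetical helper and not the code in flytekit/image_spec/default_builder.py.

```python
from pathlib import PurePosixPath


def build_copy_commands(sources):
    """Build COPY lines from (relative_path, is_dir) pairs.

    Hypothetical helper for illustration; not flytekit code.
    """
    commands = []
    for src, is_dir in sources:
        if is_dir:
            # Directories land in /root/<last path component>/; parent
            # components of the source path are dropped.
            dest = f"/root/{PurePosixPath(src).name}/"
        else:
            # Files land directly in /root/.
            dest = "/root/"
        commands.append(f"COPY --chown=flytekit {src} {dest}")
    return commands


commands = build_copy_commands(
    [
        ("services/AgentService/main.py", False),
        ("common/utils.py", False),
        ("configs/", True),
        ("services/configs", True),
    ]
)
print("\n".join(commands))
```

Under this rule the last two entries resolve to the same destination, /root/configs/, so their contents are merged into one directory.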

How was this patch tested?

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Summary by Bito

This PR optimizes Docker image building by simplifying COPY command behavior in the image specification builder. The changes streamline how files are copied directly to the /root/ directory, eliminating unnecessary path nesting and ensuring a cleaner directory structure. The modification prevents full source path preservation, making Docker images more maintainable. A security concern was identified regarding potential path traversal issues in the implementation.

Unit tests added: False

Estimated effort to review (1-5, lower is better): 1

…ture

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
@flyte-bot
Contributor

flyte-bot commented Jan 10, 2025

Code Review Agent Run #83a536

Actionable Suggestions - 1
  • flytekit/image_spec/default_builder.py - 1
    • Inconsistent COPY destination paths in Dockerfile · Line 340-343
Review Details
  • Files reviewed - 1 · Commit Range: 05447c0..8805268
    • flytekit/image_spec/default_builder.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

@flyte-bot
Contributor

flyte-bot commented Jan 10, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change: Feature Improvement - Docker Image Build Optimization
Files Impacted: default_builder.py - Modified COPY commands to simplify file and directory path handling in Docker builds

Comment on lines +340 to +343
 +                copy_commands.append(f"COPY --chown=flytekit {src_path.as_posix()} /root/{src_path.name}/")
              else:
                  shutil.copy(src_path, dst_path)
 -                copy_commands.append(f"COPY --chown=flytekit {src_path.as_posix()} /root/{src_path.parent.as_posix()}/")
 +                copy_commands.append(f"COPY --chown=flytekit {src_path.as_posix()} /root/")
Contributor

Inconsistent COPY destination paths in Dockerfile

Consider consolidating the COPY commands to use a consistent destination path /root/ for both files and directories. Currently, directories are copied to /root/{name}/ while files are copied to /root/, which could lead to confusion.

Code suggestion
Check the AI-generated fix before applying
 -                copy_commands.append(f"COPY --chown=flytekit {src_path.as_posix()} /root/{src_path.name}/")
 -                copy_commands.append(f"COPY --chown=flytekit {src_path.as_posix()} /root/")
 +                copy_commands.append(f"COPY --chown=flytekit {src_path.as_posix()} /root/")
 +                copy_commands.append(f"COPY --chown=flytekit {src_path.as_posix()} /root/")

Code Review Run #83a536


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

codecov bot commented Jan 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.79%. Comparing base (dfa8f04) to head (8805268).

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #3047       +/-   ##
===========================================
+ Coverage   47.23%   82.79%   +35.55%     
===========================================
  Files         202        3      -199     
  Lines       21355      186    -21169     
  Branches     2744        0     -2744     
===========================================
- Hits        10088      154     -9934     
+ Misses      10776       32    -10744     
+ Partials      491        0      -491     


Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
@flyte-bot
Contributor

flyte-bot commented Jan 10, 2025

Code Review Agent Run #27f2b7

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: 8805268..3d9f905
    • flytekit/image_spec/default_builder.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Member

@thomasjpfan thomasjpfan left a comment

I'm unsure about doing this. For example, this would work locally but not remotely if the structure is not preserved:

from flytekit import ImageSpec, task

MY_LOCAL_FILE = "my_dir/data/my_file.txt"

image = ImageSpec(
    copy=[MY_LOCAL_FILE]
)

@task(container_image=image)
def look_at_file() -> str:
    with open(MY_LOCAL_FILE, "r") as f:
        return f.read()
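
The local-versus-remote mismatch can be sketched with plain path arithmetic. This is a rough model under stated assumptions: the image's working directory is /root, the task opens the same relative string it passed to `copy`, and the helper names `image_destination` and `task_can_open` are hypothetical, not flytekit APIs.

```python
from pathlib import PurePosixPath

# Assumed working directory inside the built image.
WORKDIR = PurePosixPath("/root")


def image_destination(src: str, preserve_structure: bool) -> PurePosixPath:
    """Where a copied file ends up inside the image (hypothetical model)."""
    rel = PurePosixPath(src)
    if preserve_structure:
        return WORKDIR / rel
    # Flattened: only the file name survives.
    return WORKDIR / rel.name


def task_can_open(src: str, preserve_structure: bool) -> bool:
    """True if opening the relative path `src` from WORKDIR finds the file."""
    return WORKDIR / PurePosixPath(src) == image_destination(src, preserve_structure)


nested = "my_dir/data/my_file.txt"
print(task_can_open(nested, preserve_structure=True))
print(task_can_open(nested, preserve_structure=False))
```

In this model a nested relative path only resolves when the structure is preserved, while a file with no parent directories resolves either way.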

The other issue is with files with the same name, but in different directories:

image = ImageSpec(
    copy=["dir1/data/my_file.txt", "dir2/my_file.txt"]
)
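
The name clash can be checked mechanically. The sketch below groups sources by the file name they would receive after flattening; `flattened_collisions` is a hypothetical helper written for this illustration, not part of flytekit.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def flattened_collisions(paths):
    """Return {file name: [sources]} for names claimed by more than one source."""
    by_name = defaultdict(list)
    for path in paths:
        # After flattening, only the final path component identifies the file.
        by_name[PurePosixPath(path).name].append(path)
    return {name: srcs for name, srcs in by_name.items() if len(srcs) > 1}


collisions = flattened_collisions(["dir1/data/my_file.txt", "dir2/my_file.txt"])
print(collisions)
```

Any non-empty result means a later COPY would overwrite an earlier one at the same destination.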

@samhita-alla
Contributor Author

samhita-alla commented Jan 10, 2025

@thomasjpfan we don't really need to preserve the path for files as the user usually expects the files to be available in the working directory. but i understand your concerns. how about we let the user specify the destination path? i can modify the code accordingly.

the file will be copied to /root, so the path must be: with open(f"/root/{MY_LOCAL_FILE}", "r") as f:

i also think we shouldn't set /root as the default because, if the base image has a working directory like /app, we would need to copy the files or directories there instead, right?

@thomasjpfan
Member

how about we let the user specify the destination path? i can modify the code accordingly.

I'm okay with this, but I want to keep support for list[str]. Setting the destination can end up being very verbose when setting many files.
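
One way to accept explicit destinations while keeping list[str] working is to normalize both forms into (source, destination) pairs. This is purely a sketch of the idea discussed here: the `CopyEntry` type, the tuple form, and `normalize_copy_entries` are assumptions and do not exist in flytekit. Plain strings default to keeping their parent directories under the workdir, mirroring the path-preserving COPY line that this PR removes.

```python
from pathlib import PurePosixPath
from typing import List, Tuple, Union

# Hypothetical entry type: either "src" or ("src", "dest"). Not a flytekit type.
CopyEntry = Union[str, Tuple[str, str]]


def normalize_copy_entries(
    entries: List[CopyEntry], workdir: str = "/root"
) -> List[Tuple[str, str]]:
    """Turn mixed entries into explicit (source, destination directory) pairs."""
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            # Default: preserve the source's parent directories under workdir.
            parent = PurePosixPath(workdir) / PurePosixPath(entry).parent
            dest = parent.as_posix().rstrip("/") + "/"
            normalized.append((entry, dest))
        else:
            # Explicit destination supplied by the user.
            src, dest = entry
            normalized.append((src, dest))
    return normalized


pairs = normalize_copy_entries(
    ["hello.txt", "my_dir/data/my_file.txt", ("dir2/my_file.txt", "/root/other/")]
)
for src, dest in pairs:
    print(f"COPY --chown=flytekit {src} {dest}")
```

With this shape, callers who only pass strings see no change, and the verbose form is needed only for the files that require a custom destination.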

the file will be copied to /root, so the path must be: with open(f"/root/{MY_LOCAL_FILE}", "r") as f:

I just tried this:

from flytekit import ImageSpec, task

MY_FILE = "hello.txt"

image = ImageSpec(copy=[MY_FILE], registry="localhost:30000")


@task(container_image=image)
def read() -> str:
    with open(MY_FILE) as f:
        return f.read()

where hello.txt is next to my workflow and it works.

i also think we shouldn't set /root as the default because, if the base image has a working directory like /app, we would need to copy the files or directories there instead, right?

The image builder hard-codes the workdir as /root, so even if the base image's working directory is /app, the image builder changes it to /root. The question becomes: should we do this? My concern is when the base image working directory is /; in that case, I'd rather change it to /root.

@samhita-alla
Contributor Author

where hello.txt is next to my workflow and it works.

hmm.. could be because the working directory is set to /root by default.

My concern is when the base image working directory is /; in that case, I'd rather change it to /root.

since image spec doesn't support setting a custom workdir, i agree that we need to default to /root.

i’ll close this issue for now. i think specifying destination paths is something we can plan for down the road.

3 participants