Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move the dependency libraries listed in pyproject.toml to extra #590

Open
syou6162 opened this issue Jan 16, 2025 · 5 comments · May be fixed by #594
Open

Move the dependency libraries listed in pyproject.toml to extra #590

syou6162 opened this issue Jan 16, 2025 · 5 comments · May be fixed by #594

Comments

@syou6162
Copy link
Contributor

Currently, running pip install datacontract-cli takes a long time. For example, I mainly use datacontract-cli with dbt, but it also installs things that could be omitted from the installation, such as duckdb / opentelemetry-exporter. Also, when using renovate or dependabot, pull requests for updates appear for libraries that are not actually used.

To solve these issues, I'm thinking of moving some libraries to extra.

duckdb = [
  "duckdb>=1.0.0",
  "soda-core-duckdb>=3.4.3,<3.5.0",
]

serve = [
  "fastapi==0.115.6",
  "uvicorn==0.34.0",
]

opentelemetry = [
  "opentelemetry-exporter-otlp-proto-grpc~=1.16",
  "opentelemetry-exporter-otlp-proto-http~=1.16",
]

I think the case of duckdb is fine, but I would like to hear your opinions on how to divide the following cases as group names.

  • Subcommands like serve
  • Tool names that are unrelated to the server type (e.g. opentelemetry)
@jochenchrist
Copy link
Contributor

Agree. My thoughts:

  • Keep duckdb, which is very much used
  • An extra "web" (instead of serve) for web server and API
  • Not sure if opentelemetry exporter is used much. We could remove it completely.

@jochenchrist
Copy link
Contributor

Removal of opentelemetry integration: #593

@syou6162 syou6162 linked a pull request Jan 17, 2025 that will close this issue
4 tasks
@simonharrer
Copy link
Contributor

I would keep duckdb in the core for now. Let's focus on creating a web extra.

@syou6162
Copy link
Contributor Author

I would keep duckdb in the core for now

@simonharrer Really? I was just about to start preparing a pull request to move duckdb to extra. Can you tell me why you want to keep duckdb in core?

I also just checked that the following commit has been added to the main branch.

Because datacontract-cli supports a large number of tools for import and export, the number of required dependent libraries tends to be huge. For this reason, there are cases where it is necessary to relax dependencies to resolve conflicts, as in this commit. Adjusting these dependencies is not only a problem for the development of datacontract-cli itself, but also for people like me who use datacontract-cli as a library (in fact, I was very troubled by this dependency library adjustment today). In order to avoid this as much as possible, I think it is better to keep the core as simple as possible and move duckdb to extra.

@simonharrer
Copy link
Contributor

The existing extras are very clear to the user when to install them. You are using bigquery? You need the bigquery extra.

duckdb, however, is used internally for executing the test command for various server options. It's not like there is a DuckDB server, and you execute against it, but rather it is an internal library for the core. It's not always obvious to the user that it uses duckdb, which makes it hard to know upfront to install it. That's why I am somewhat reluctant to put it into an extra. Does this help?

Regarding the web extra, that is obviously necessary for the datacontract serve command. Very clear relationship.

Regarding the commit: we need to provide an [all] installation for the CLI and ensure that it works. That's very hard to do without setting some libraries to specific versions. And this conflicts with you requiring a more relaxed way as you want to use it as a library. Not sure how to handle that in a good way. Do you have any idea? Open for ideas here. :-)

And thanks for helping out with these dependency nightmare! Really appreciated. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants