Online verification of artifacts #1115

Open
wagoodman opened this issue Jul 22, 2022 · 3 comments
Labels
enhancement (New feature or request), help-wanted (Extra attention is needed), online (Requires access to online data), question (Further information is requested)

Comments

@wagoodman
Contributor

This issue is meant to be a spot to host discussion on a couple of related topics:

  • should syft gather information from external sources (e.g. maven.org, pypi.org, rubygems.org, etc.) in order to enrich the information provided in the SBOM with select point-in-time external package data? For example, for all jars found, use the SHA1 of the Jar to search for an authoritative ArtifactID and GroupID for the Jar (since sometimes the packaged data is inaccurate or missing).

  • should syft verify information found within scanned artifacts against external sources? And if so, list verification claims directly in the SBOM?

This issue has intentionally been left open-ended to gather feedback and specific use cases from the community.
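
To make the Jar example in the first bullet concrete, here is a minimal sketch of what such a lookup might involve: hash the Jar with SHA1 and build a search query from the digest. This is not how syft works today. The endpoint and the `1:"<sha1>"` query form are assumptions about Maven Central's public search API and should be checked against its documentation; `sha1_of_stream` and `build_sha1_query` are hypothetical helper names. The sketch only builds the URL and does not make a network request.

```python
import hashlib
import io
from urllib.parse import urlencode

# Assumption: Maven Central's search API lives here and accepts a
# checksum query of the form 1:"<sha1>". Verify before relying on it.
MAVEN_SEARCH_URL = "https://search.maven.org/solrsearch/select"


def sha1_of_stream(stream, chunk_size=65536):
    """Return the hex SHA1 digest of a binary stream, read in chunks."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def build_sha1_query(sha1_hex):
    """Build the (assumed) search URL for an artifact with this SHA1."""
    params = {"q": f'1:"{sha1_hex}"', "rows": "20", "wt": "json"}
    return f"{MAVEN_SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Stand-in for open("some.jar", "rb"); any binary stream works.
    fake_jar = io.BytesIO(b"abc")
    print(build_sha1_query(sha1_of_stream(fake_jar)))
```

Any GroupID or ArtifactID returned by such a query would be point-in-time data from an external source, which is exactly the trust question the second bullet raises.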

@wagoodman added the enhancement, help-wanted, and question labels on Jul 22, 2022
@joshbressers
Contributor

These are great questions @wagoodman

My knee-jerk reaction to this was "why wouldn't it!"

But upon further thought, I'm less certain.

Today Syft is functionally a tool for taking a snapshot of some collection of artifacts (I know it does some other things like convert between SBOM formats, but that's a different discussion, let's just pretend it only collects details right now).

I think what you describe in both of these cases is a second-pass operation. Pass one is to collect details, pass two is to enrich the data. If we make Syft do both of these, we will be adding A LOT of new functionality to Syft and increasing its complexity.

Maybe the real question is should Syft do one thing, or should Syft do many things?

There would be massive value in using this information to enrich an SBOM, and I suspect we all agree we want a tool to do this. Should Syft do it or should a new tool do it?

@westonsteimel
Contributor

I had always envisioned some other tool doing the enrichment after syft. Then you have a trail from the original SBOM and the option to use the enrichment or not, without complicating syft with more options to do everything.
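
A minimal sketch of that separate second-pass idea, assuming a heavily simplified SBOM shape (a dict with an `artifacts` list) rather than syft's full JSON schema. `enrich_sbom` and the `enrichment` key are hypothetical names used for illustration only. The point is that the original SBOM stays untouched, so the trail back to the unenriched scan is preserved and the enrichment stays optional.

```python
import copy


def enrich_sbom(sbom, lookup):
    """Return an enriched copy of `sbom`; the input is never modified.

    `lookup` is any callable that takes one package dict and returns a
    dict of extra data, or None when nothing was found.
    """
    enriched = copy.deepcopy(sbom)
    for pkg in enriched.get("artifacts", []):
        extra = lookup(pkg)
        if extra:
            # Keep external data under its own key so it is easy to
            # tell apart from what the original scan reported.
            pkg["enrichment"] = extra
    return enriched


if __name__ == "__main__":
    original = {"artifacts": [{"name": "example-lib", "version": "1.0"}]}
    result = enrich_sbom(original, lambda pkg: {"groupId": "org.example"})
    print(original)  # unchanged first-pass output
    print(result)    # copy carrying the extra data
```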

@cjnosal

cjnosal commented Oct 28, 2022

While it looks like a decision has already been made via #1158, I wanted to link to #1129 as a sample for this discussion. As some ecosystems don't commit full dependency tree info (e.g. spring boot pom.xml) into the git repo, an un-enriched scan isn't able to produce a complete and accurate SBOM from the source repo.

While there are a few directions enrichment can go in, they all have tradeoffs:

  • wait for the build process (e.g. the jar manifests are more complete), but this is a slower feedback loop
  • resolve external sources similar to how the build would, adding a larger footprint to Syft and requiring user configuration
  • require the project directory to be "initialized" or have the build toolchain present, adding overhead to CI integration


4 participants