Add support for fetching meta data from deps.dev #1457
Signed-off-by: san-zrl <san@zurich.ibm.com>
feature/DepsDev analyzer
Thanks @n1ckl0sk0rtge! I've not forgotten about this PR, I'll try to get it reviewed this weekend! Apologies for the delay.
/**
 * {@inheritDoc}
 */
public RepositoryType supportedRepositoryType() {
    return null; // Supported values for type are cargo, golang, maven, npm, nuget and pypi.
}
The analyzer would still need to be registered with RepositoryAnalyzerFactory, based on the PURL types it supports:
Lines 33 to 44 in 941dd0f
private static final Map<String, Supplier<IMetaAnalyzer>> ANALYZER_SUPPLIERS = Map.of(
        PackageURL.StandardTypes.COMPOSER, ComposerMetaAnalyzer::new,
        PackageURL.StandardTypes.GEM, GemMetaAnalyzer::new,
        PackageURL.StandardTypes.GOLANG, GoModulesMetaAnalyzer::new,
        PackageURL.StandardTypes.HEX, HexMetaAnalyzer::new,
        PackageURL.StandardTypes.MAVEN, MavenMetaAnalyzer::new,
        PackageURL.StandardTypes.NPM, NpmMetaAnalyzer::new,
        PackageURL.StandardTypes.NUGET, NugetMetaAnalyzer::new,
        PackageURL.StandardTypes.PYPI, PypiMetaAnalyzer::new,
        PackageURL.StandardTypes.CARGO, CargoMetaAnalyzer::new,
        "cpan", CpanMetaAnalyzer::new
);
The current model assumes at most one analyzer per PURL type though, so we'd need to adjust this to support multiple. Question then is, should one take priority over the other? Do we execute all applicable analyzers, and if so, how do we merge all results back into one?
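To make the "should one take priority over the other?" option concrete, here is a minimal sketch of a registry that holds several analyzers per PURL type and returns the first non-empty result in priority order. `SketchAnalyzer`, its `priority()` method, and `PrioritizedAnalyzerRegistry` are hypothetical stand-ins invented for illustration; they are not the project's `IMetaAnalyzer` or `RepositoryAnalyzerFactory`, and "lower value runs first" is an assumed convention.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for the project's IMetaAnalyzer.
interface SketchAnalyzer {
    // Assumed convention: lower value means the analyzer is tried earlier.
    int priority();

    // Returns a result for the given PURL, or empty if nothing was found.
    Optional<String> analyze(String purl);
}

// Hypothetical registry that allows more than one analyzer per PURL type.
final class PrioritizedAnalyzerRegistry {
    private final Map<String, List<Supplier<SketchAnalyzer>>> suppliersByType;

    PrioritizedAnalyzerRegistry(Map<String, List<Supplier<SketchAnalyzer>>> suppliersByType) {
        this.suppliersByType = suppliersByType;
    }

    // Tries the analyzers registered for the type in priority order
    // and returns the first non-empty result ("first match wins").
    Optional<String> analyze(String purlType, String purl) {
        List<SketchAnalyzer> analyzers = new ArrayList<>();
        for (Supplier<SketchAnalyzer> supplier : suppliersByType.getOrDefault(purlType, List.of())) {
            analyzers.add(supplier.get());
        }
        analyzers.sort(Comparator.comparingInt(SketchAnalyzer::priority));
        for (SketchAnalyzer analyzer : analyzers) {
            Optional<String> result = analyzer.analyze(purl);
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }
}
```

This picks "first match wins" over merging, which sidesteps the question of how to reconcile conflicting results; a merge strategy would need a per-field rule instead.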
I'm wondering if perhaps we should switch to deps.dev entirely for all public components. The internal status is already provided to the repository meta analyzer as per Protobuf definition:
hyades/proto/src/main/proto/org/dependencytrack/repometaanalysis/v1/repo_meta_analysis.proto
Lines 45 to 47 in 941dd0f
// Whether the component is internal to the organization.
// Internal components will only be looked up in internal repositories.
optional bool internal = 2;
In that case, we would not only source the repository from deps.dev, but also:
- Latest version
- Publish timestamp of latest version
- Publish timestamp of current version
- Hashes of current version (not sure if deps.dev provides that?)
I think that would be simpler than trying to support multiple analyzers per PURL.
Does that sound reasonable? Granted that change would be a bit larger.
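A minimal sketch of how that routing could look, assuming the `internal` flag and the PURL type are the only inputs. `MetaSource` and `MetaSourceRouter` are hypothetical names invented here. The set of deps.dev types is taken from the code comment in this PR; falling back to the existing analyzers for types deps.dev does not cover is an assumption, not something decided in this thread.

```java
import java.util.Set;

// Hypothetical: where the metadata for a component would be looked up.
enum MetaSource {
    INTERNAL_REPOSITORIES, // internal components stay on internal repositories
    DEPS_DEV,              // public components of a type deps.dev supports
    EXISTING_ANALYZER      // assumed fallback for all other public components
}

// Hypothetical router; not part of the project's code base.
final class MetaSourceRouter {
    // Types listed in the PR's code comment as supported by deps.dev.
    private static final Set<String> DEPS_DEV_TYPES =
            Set.of("cargo", "golang", "maven", "npm", "nuget", "pypi");

    private MetaSourceRouter() {
    }

    static MetaSource select(String purlType, boolean internal) {
        if (internal) {
            return MetaSource.INTERNAL_REPOSITORIES;
        }
        return DEPS_DEV_TYPES.contains(purlType)
                ? MetaSource.DEPS_DEV
                : MetaSource.EXISTING_ANALYZER;
    }
}
```

With this split, the registry keeps at most one code path per component, so no result merging is needed.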
@nscuro it took me a while to review. Yes, that's a good point. So do you think the Map can hold List<Supplier<IMetaAnalyzer>> and that we add some priority mechanism to IMetaAnalyzer? What we could also do is use DepsDevMetaAnalyzer as a base class for all the other meta analyzers. This would give us a basic implementation for all PURL types in DepsDevMetaAnalyzer, which can be specialised and overridden. Then, using the same ANALYZER_SUPPLIERS map, we can select and migrate step by step for each PURL type.
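The base-class idea could be sketched roughly as follows. All class names here (`DepsDevBackedAnalyzer`, `NpmSketchAnalyzer`, `MavenSketchAnalyzer`, `SketchAnalyzerFactory`) are hypothetical, and the returned strings are placeholders standing in for real lookups. Which types count as "migrated" is chosen arbitrarily for the example.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical base class: default behaviour shared by all PURL types.
class DepsDevBackedAnalyzer {
    // Placeholder for a deps.dev lookup; a real implementation would call the API.
    String resolveSourceRepository(String purl) {
        return "deps.dev:" + purl;
    }
}

// Already migrated in this example: inherits the deps.dev behaviour unchanged.
final class NpmSketchAnalyzer extends DepsDevBackedAnalyzer {
}

// Not yet migrated in this example: overrides with registry-specific logic.
final class MavenSketchAnalyzer extends DepsDevBackedAnalyzer {
    @Override
    String resolveSourceRepository(String purl) {
        return "maven-registry:" + purl; // placeholder for the existing lookup
    }
}

// Hypothetical factory that keeps the one-analyzer-per-type map.
final class SketchAnalyzerFactory {
    private static final Map<String, Supplier<DepsDevBackedAnalyzer>> ANALYZER_SUPPLIERS = Map.of(
            "npm", NpmSketchAnalyzer::new,
            "maven", MavenSketchAnalyzer::new
    );

    private SketchAnalyzerFactory() {
    }

    // Types without a specialised analyzer fall back to the base class.
    static DepsDevBackedAnalyzer forType(String purlType) {
        return ANALYZER_SUPPLIERS.getOrDefault(purlType, DepsDevBackedAnalyzer::new).get();
    }
}
```

Migrating a type then means deleting its override, while the map keeps its current shape.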
What we could also do is use DepsDevMetaAnalyzer as a base class for all the other meta analyzers. This would give us a basic implementation for all PURL types in DepsDevMetaAnalyzer, which can be specialised and overridden. Then, using the same ANALYZER_SUPPLIERS map, we can select and migrate step by step for each PURL type.
Yes, I really like this idea!
In that case, we would not only source the repository from deps.dev, but also:
- Latest version
- Publish timestamp of latest version
- Publish timestamp of current version
- Hashes of current version (not sure if deps.dev provides that?)
deps.dev accepts PURLs with and without version tags but returns different content:
If the version is not part of the PURL, we get a JSON object containing information about all available versions. If the version is part of the PURL, we get the information about the given version, enriched with additional data such as SOURCE_REPO. From this we could extract
- Given version
- Publish timestamp of this version
- Source repo of this version (if provided)
In DT we have PURLs with version tags. Finding out the latest version of a package would be complicated given the way the deps.dev purlLookup works. We would have to query deps.dev twice, once without version tag and a second time with version tag. Probably not a good idea...
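To illustrate the cost of the two-query approach, here is a small sketch that derives the versionless PURL and issues both lookups through a pluggable function. `TwoStepLookupSketch` and the `lookup` parameter are hypothetical; no real deps.dev client or endpoint is used, and the version stripping is a simplified string operation rather than a full PURL parser (it also drops qualifiers and subpath).

```java
import java.util.function.Function;

// Hypothetical helper; the lookup function stands in for a real HTTP client.
final class TwoStepLookupSketch {
    private TwoStepLookupSketch() {
    }

    // Returns the PURL without version, qualifiers, and subpath.
    // An '@' only counts as the version separator if it follows the last '/'.
    static String withoutVersion(String purl) {
        String base = purl;
        int hash = base.indexOf('#');
        if (hash >= 0) {
            base = base.substring(0, hash);
        }
        int query = base.indexOf('?');
        if (query >= 0) {
            base = base.substring(0, query);
        }
        int at = base.lastIndexOf('@');
        int slash = base.lastIndexOf('/');
        return at > slash ? base.substring(0, at) : base;
    }

    // Performs both lookups: index 0 holds the response for the versioned
    // PURL, index 1 the response for the versionless PURL.
    static String[] fetchBoth(String versionedPurl, Function<String, String> lookup) {
        String versionDetails = lookup.apply(versionedPurl);
        String allVersions = lookup.apply(withoutVersion(versionedPurl));
        return new String[] {versionDetails, allVersions};
    }
}
```

Every component therefore costs two requests, which is the overhead the comment above is wary of; caching the versionless response per package would be one way to reduce it.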
Description
This PR adds basic capabilities to fetch metadata from deps.dev for a given PURL. For now, only the related source code repositories are fetched and stored.