Add support for fetching meta data from deps.dev #1457

Open
wants to merge 9 commits into
base: main
@@ -28,6 +28,7 @@ public class MetaModel implements Serializable {
private Component component;
private String latestVersion;
private Date publishedTimestamp;
private String sourceRepository;

public MetaModel(){
}
@@ -54,4 +55,11 @@ public Date getPublishedTimestamp() {
public void setPublishedTimestamp(final Date publishedTimestamp) {
this.publishedTimestamp = publishedTimestamp;
}

public String getSourceRepository() {
return sourceRepository;
}
public void setSourceRepository(String sourceRepository) {
this.sourceRepository = sourceRepository;
}
}
@@ -0,0 +1,115 @@
/*
* This file is part of Dependency-Track.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-License-Identifier: Apache-2.0
* Copyright (c) OWASP Foundation. All Rights Reserved.
*/
package org.dependencytrack.repometaanalyzer.repositories;

import java.io.IOException;
import java.io.InputStream;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Optional;

import org.apache.http.HttpEntity;
import org.apache.http.HttpStatus;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.dependencytrack.persistence.model.Component;
import org.dependencytrack.persistence.model.RepositoryType;
import org.dependencytrack.repometaanalyzer.model.MetaModel;
import org.dependencytrack.repometaanalyzer.util.PurlUtil;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;
import org.json.JSONTokener;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.github.packageurl.PackageURL;

public class DepsDevMetaAnalyzer extends AbstractMetaAnalyzer {
private static final Logger LOGGER = LoggerFactory.getLogger(DepsDevMetaAnalyzer.class);
private static final String DEFAULT_BASE_URL = "https://api.deps.dev/v3alpha";
private static final String API_URL = "/purl/%s";
private static final String SOURCE_REPO = "SOURCE_REPO";

DepsDevMetaAnalyzer() {
this.baseUrl = DEFAULT_BASE_URL;
}

@Override
public boolean isApplicable(Component component) {
return component.getPurl() != null;
}

/**
* {@inheritDoc}
*/
public RepositoryType supportedRepositoryType() {
return null; // Supported values for type are cargo, golang, maven, npm, nuget and pypi.
}
Comment on lines +59 to +64
Member

The analyzer would still need to be registered with RepositoryAnalyzerFactory, based on the PURL types it supports:

private static final Map<String, Supplier<IMetaAnalyzer>> ANALYZER_SUPPLIERS = Map.of(
PackageURL.StandardTypes.COMPOSER, ComposerMetaAnalyzer::new,
PackageURL.StandardTypes.GEM, GemMetaAnalyzer::new,
PackageURL.StandardTypes.GOLANG, GoModulesMetaAnalyzer::new,
PackageURL.StandardTypes.HEX, HexMetaAnalyzer::new,
PackageURL.StandardTypes.MAVEN, MavenMetaAnalyzer::new,
PackageURL.StandardTypes.NPM, NpmMetaAnalyzer::new,
PackageURL.StandardTypes.NUGET, NugetMetaAnalyzer::new,
PackageURL.StandardTypes.PYPI, PypiMetaAnalyzer::new,
PackageURL.StandardTypes.CARGO, CargoMetaAnalyzer::new,
"cpan", CpanMetaAnalyzer::new
);

The current model assumes at most one analyzer per PURL type though, so we'd need to adjust this to support multiple. Question then is, should one take priority over the other? Do we execute all applicable analyzers, and if so, how do we merge all results back into one?
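
To make the priority question concrete, here is a minimal sketch of one possible answer: keep a list of suppliers per PURL type, ordered by priority, and stop at the first analyzer that returns a result. `AnalyzerRegistrySketch`, `RepoLookup` and `firstResult` are hypothetical names, and `RepoLookup` is a reduced stand-in for `IMetaAnalyzer`, not the real interface.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

public class AnalyzerRegistrySketch {

    /** Hypothetical, reduced stand-in for IMetaAnalyzer: only looks up a source repository. */
    public interface RepoLookup {
        Optional<String> sourceRepository(String purl);
    }

    /**
     * Instantiates the analyzers registered for a PURL type in list order
     * (highest priority first) and returns the first non-empty result.
     */
    public static Optional<String> firstResult(
            Map<String, List<Supplier<RepoLookup>>> suppliers, String purlType, String purl) {
        for (Supplier<RepoLookup> supplier : suppliers.getOrDefault(purlType, List.of())) {
            Optional<String> result = supplier.get().sourceRepository(purl);
            if (result.isPresent()) {
                return result; // later, lower-priority analyzers are not executed
            }
        }
        return Optional.empty();
    }
}
```

This "first hit wins" strategy avoids the merge problem entirely, at the cost of never combining fields from two analyzers. Merging per field would need a rule for each field of `MetaModel`.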

Member

I'm wondering if perhaps we should switch to deps.dev entirely for all public components. The internal status is already provided to the repository meta analyzer as per Protobuf definition:

// Whether the component is internal to the organization.
// Internal components will only be looked up in internal repositories.
optional bool internal = 2;

In that case, we would not only source the repository from deps.dev, but also:

  • Latest version
  • Publish timestamp of latest version
  • Publish timestamp of current version
  • Hashes of current version (not sure if deps.dev provides that?)

I think that would be simpler than trying to support multiple analyzers per PURL.

Does that sound reasonable? Granted, that change would be a bit larger.

Author

@nscuro took me a while to review. Yes, that's a good point. So do you think the Map can hold List<Supplier<IMetaAnalyzer>> and that we add some priority mechanism to IMetaAnalyzer? What we could also do is make DepsDevMetaAnalyzer a base class for all the other meta analyzers. This would give us a basic implementation for all PURL types in DepsDevMetaAnalyzer, which can be specialised and overridden. Then, using the same ANALYZER_SUPPLIERS map, we can select and migrate step by step for each PURL type.

Member

> What we could also do is make DepsDevMetaAnalyzer a base class for all the other meta analyzers. This would give us a basic implementation for all PURL types in DepsDevMetaAnalyzer, which can be specialised and overridden. Then, using the same ANALYZER_SUPPLIERS map, we can select and migrate step by step for each PURL type.

Yes, I really like this idea!
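
A rough sketch of how the base-class idea could look, under heavy assumptions: `DepsDevBase`, `MavenAnalyzer` and `forType` are hypothetical names, the returned strings are placeholders that only mark where a value would come from, and no HTTP calls are made.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

public class BaseAnalyzerSketch {

    /** Hypothetical deps.dev-backed base class providing defaults for every PURL type. */
    public static class DepsDevBase {
        public Optional<String> latestVersion(String purl) {
            return Optional.of("latest-from-deps.dev"); // placeholder, no real lookup
        }

        public Optional<String> sourceRepository(String purl) {
            return Optional.of("repo-from-deps.dev"); // placeholder, no real lookup
        }
    }

    /** Hypothetical specialisation: overrides one lookup, inherits the other. */
    public static class MavenAnalyzer extends DepsDevBase {
        @Override
        public Optional<String> latestVersion(String purl) {
            return Optional.of("latest-from-maven-repository"); // placeholder
        }
    }

    // Same shape as ANALYZER_SUPPLIERS: still exactly one analyzer per PURL type.
    private static final Map<String, Supplier<DepsDevBase>> ANALYZER_SUPPLIERS = Map.of(
            "maven", MavenAnalyzer::new, // not yet migrated, keeps its own version lookup
            "npm", DepsDevBase::new      // fully served by the base implementation
    );

    /** Returns the analyzer for a PURL type, falling back to the plain base class. */
    public static DepsDevBase forType(String purlType) {
        return ANALYZER_SUPPLIERS.getOrDefault(purlType, DepsDevBase::new).get();
    }
}
```

With this layout, migrating a PURL type means deleting overrides from its subclass until the map entry can point at the base class directly.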


> In that case, we would not only source the repository from deps.dev, but also:
>
>   • Latest version
>   • Publish timestamp of latest version
>   • Publish timestamp of current version
>   • Hashes of current version (not sure if deps.dev provides that?)

deps.dev accepts PURLs with and without a version, but returns different content.
If the version is not part of the PURL, we get a JSON object containing information about all available versions. If the version is part of the PURL, we get the information about the given version, enriched with additional data such as SOURCE_REPO. From this we could extract:

  • Given version
  • Publish timestamp of this version
  • Source repository of this version (if provided)

In DT we have PURLs with versions. Finding out the latest version of a package would be complicated given the way the deps.dev PurlLookup works: we would have to query deps.dev twice, once without the version and a second time with it. Probably not a good idea...
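
For illustration, the two requests described above could be built as follows. This is a sketch, not the PR's code: `DepsDevUrlSketch`, `withoutVersion` and `lookupUrl` are hypothetical helpers, the base URL and `/purl/%s` path are taken from the diff, and the input is assumed to be a coordinates-only PURL (no qualifiers or subpath), as produced by `silentPurlCoordinatesOnly`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DepsDevUrlSketch {
    // Base URL as used by DEFAULT_BASE_URL in the diff.
    private static final String BASE_URL = "https://api.deps.dev/v3alpha";

    /** Builds the lookup URL for a PURL; the PURL is percent-encoded as a single path segment. */
    public static String lookupUrl(String purl) {
        return BASE_URL + "/purl/" + URLEncoder.encode(purl, StandardCharsets.UTF_8);
    }

    /**
     * Strips a trailing "@version" from a coordinates-only PURL.
     * Only an '@' after the last '/' counts as the version separator,
     * so an '@' inside the namespace is left alone.
     */
    public static String withoutVersion(String purl) {
        int at = purl.lastIndexOf('@');
        int slash = purl.lastIndexOf('/');
        return at > slash ? purl.substring(0, at) : purl;
    }
}
```

Calling `lookupUrl(purl)` gives the versioned request and `lookupUrl(withoutVersion(purl))` the unversioned one, which makes the cost visible: two HTTP round trips per component.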


/**
* {@inheritDoc}
*/
public MetaModel analyze(final Component component) {
final MetaModel meta = new MetaModel(component);
final PackageURL purl = component.getPurl();
if (purl != null) {
PackageURL coords = PurlUtil.silentPurlCoordinatesOnly(purl);
if (coords != null) {
String encodedCoords = URLEncoder.encode(coords.canonicalize(), StandardCharsets.UTF_8);
final String url = String.format(baseUrl + API_URL, encodedCoords);
try (final CloseableHttpResponse response = processHttpRequest(url)) {
if (response.getStatusLine().getStatusCode() == HttpStatus.SC_OK &&
response.getEntity() != null) {
Optional<String> sourceRepo = extractSourceRepo(response.getEntity());
sourceRepo.ifPresent(meta::setSourceRepository);
}
} catch (IOException | JSONException e) {
handleRequestException(LOGGER, e);
}
}
}
return meta;
}

private Optional<String> extractSourceRepo(HttpEntity entity) throws IOException, JSONException {
try (InputStream in = entity.getContent()) {
// The opt* accessors return null for missing keys instead of throwing JSONException,
// so a response without "version" or "links" simply yields an empty result.
JSONObject version = new JSONObject(new JSONTokener(in)).optJSONObject("version");
JSONArray links = version != null ? version.optJSONArray("links") : null;

// Try to read the repo url from the links section
if (links != null) {
Iterator<Object> it = links.iterator();
while (it.hasNext()) {
Object item = it.next();
if (item instanceof JSONObject) {
JSONObject link = (JSONObject) item;
if (SOURCE_REPO.equals(link.optString("label"))) {
return Optional.ofNullable(link.optString("url", null));
}
}
}
}
}
return Optional.empty();
}

@Override
public String getName() {
return this.getClass().getSimpleName();
}

}
@@ -21,6 +21,8 @@
import com.github.packageurl.MalformedPackageURLException;
import com.github.packageurl.PackageURL;

import static com.github.packageurl.PackageURLBuilder.aPackageURL;

public final class PurlUtil {

private PurlUtil() {
@@ -57,4 +59,24 @@ public static PackageURL parsePurlCoordinatesWithoutVersion(final String purl) {
""", e);
}
}

/**
* @param original the purl
* @return the purl coordinates or null
*/
public static PackageURL silentPurlCoordinatesOnly(final PackageURL original) {
if (original == null) {
return null;
}
try {
return aPackageURL()
.withType(original.getType())
.withNamespace(original.getNamespace())
.withName(original.getName())
.withVersion(original.getVersion())
.build();
} catch (MalformedPackageURLException e) {
return null;
}
}
}
@@ -0,0 +1,58 @@
/*
* This file is part of Dependency-Track.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* SPDX-License-Identifier: Apache-2.0
* Copyright (c) OWASP Foundation. All Rights Reserved.
*/
package org.dependencytrack.repometaanalyzer.repositories;

import org.apache.http.impl.client.HttpClients;
import org.dependencytrack.persistence.model.Component;
import org.dependencytrack.repometaanalyzer.model.MetaModel;
import org.junit.Assert;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.BeforeEach;

class DepsDevMetaAnalyzerTest {
private static IMetaAnalyzer analyzer;

@BeforeEach
void beforeEach() {
analyzer = new DepsDevMetaAnalyzer();
analyzer.setHttpClient(HttpClients.createDefault());
}

@Test
void testRepoFound() {
Component component = new Component();
component.setPurl("pkg:maven/com.googlecode.owasp-java-html-sanitizer/java10-shim@20240325.1");

Assert.assertTrue(analyzer.isApplicable(component));
Assert.assertNull(analyzer.supportedRepositoryType());
MetaModel metaModel = analyzer.analyze(component);
Assert.assertEquals("https://github.com/OWASP/java-html-sanitizer", metaModel.getSourceRepository());
}

@Test
void testRepoNotFound() {
Component component = new Component();
component.setPurl("pkg:maven/org.apache.httpcomponents/httpclient@4.5.14");

Assert.assertTrue(analyzer.isApplicable(component));
Assert.assertNull(analyzer.supportedRepositoryType());
MetaModel metaModel = analyzer.analyze(component);
Assert.assertNull(metaModel.getSourceRepository());
}
}