Reachable slices + Bug fixes #656

Merged
merged 8 commits into from
Oct 21, 2023
2 changes: 2 additions & 0 deletions .github/workflows/repotests.yml
@@ -155,6 +155,8 @@ jobs:
bin/cdxgen.js -p -t java repotests/java-sec-code -o bomresults/bom-java-sec-code.json --required-only
bin/cdxgen.js -p -t java repotests/java-sec-code -o bomresults/bom-java-sec-code.json --filter postgres --filter json
bin/cdxgen.js -p -t java repotests/java-sec-code -o bomresults/bom-java-sec-code.json --only spring
bin/cdxgen.js -p -t java repotests/java-sec-code -o repotests/java-sec-code/bom.json --deep
node bin/evinse.js -i repotests/java-sec-code/bom.json -o bomresults/java-sec-code.evinse.json -l java --with-reachables -p repotests/java-sec-code
bin/cdxgen.js -p -r -t java repotests/shiftleft-java-example -o bomresults/bom-java.json --generate-key-and-sign
node bin/evinse.js -i bomresults/bom-java.json -o bomresults/bom-java.evinse.json -l java --with-data-flow -p repotests/shiftleft-java-example
SBOM_SIGN_ALGORITHM=RS512 SBOM_SIGN_PRIVATE_KEY=bomresults/private.key SBOM_SIGN_PUBLIC_KEY=bomresults/public.key bin/cdxgen.js -p -r -t github repotests/shiftleft-java-example -o bomresults/bom-github.json
2 changes: 2 additions & 0 deletions README.md
@@ -367,6 +367,7 @@ cdxgen can retain the dependency tree under the `dependencies` attribute for a s
| USE_GOSUM | Set to `true` or `1` to generate BOMs for golang projects using go.sum as the dependency source of truth, instead of go.mod |
| CDXGEN_TIMEOUT_MS | Default timeout for known execution involving maven, gradle or sbt |
| CDXGEN_SERVER_TIMEOUT_MS | Default timeout in server mode |
| CDXGEN_MAX_BUFFER | Max buffer for stdout and stderr. Defaults to 100MB |
| CLJ_CMD | Set to override the clojure cli command |
| LEIN_CMD | Set to override the leiningen command |
| SBOM_SIGN_ALGORITHM | Signature algorithm. Some valid values are RS256, RS384, RS512, PS256, PS384, PS512, ES256 etc |
@@ -377,6 +378,7 @@ cdxgen can retain the dependency tree under the `dependencies` attribute for a s
| CDX_MAVEN_INCLUDE_TEST_SCOPE | Whether test scoped dependencies should be included from Maven projects, Default: true |
| ASTGEN_IGNORE_DIRS | Comma separated list of directories to ignore while analyzing using babel. The environment variable is also used by atom and astgen. |
| ASTGEN_IGNORE_FILE_PATTERN | Ignore regex to use |
| PYPI_URL | Override pypi url. Default: https://pypi.org/pypi/ |

## Plugins

10 changes: 10 additions & 0 deletions bin/evinse.js
@@ -98,6 +98,12 @@ const args = yargs(hideBin(process.argv))
default: false,
type: "boolean"
})
.option("with-reachables", {
description:
"Enable auto-tagged reachable slicing. Requires SBOM generated with --deep mode.",
default: false,
type: "boolean"
})
.option("usages-slices-file", {
description: "Use an existing usages slices file.",
default: "usages.slices.json"
@@ -106,6 +112,10 @@ const args = yargs(hideBin(process.argv))
description: "Use an existing data-flow slices file.",
default: "data-flow.slices.json"
})
.option("reachables-slices-file", {
description: "Use an existing reachables slices file.",
default: "reachables.slices.json"
})
.option("print", {
alias: "p",
type: "boolean",
22 changes: 20 additions & 2 deletions binary.js
@@ -1,5 +1,11 @@
import { platform as _platform, arch as _arch, tmpdir } from "node:os";
import { existsSync, mkdtempSync, readFileSync, rmSync } from "node:fs";
import { platform as _platform, arch as _arch, tmpdir, homedir } from "node:os";
import {
existsSync,
mkdirSync,
mkdtempSync,
readFileSync,
rmSync
} from "node:fs";
import { join, dirname, basename } from "node:path";
import { spawnSync } from "node:child_process";
import { PackageURL } from "packageurl-js";
@@ -284,6 +290,13 @@ export const getOSPackages = (src) => {
const allTypes = new Set();
if (TRIVY_BIN) {
let imageType = "image";
const trivyCacheDir = join(homedir(), ".cache", "trivy");
try {
mkdirSync(join(trivyCacheDir, "db"), { recursive: true });
mkdirSync(join(trivyCacheDir, "java-db"), { recursive: true });
} catch (err) {
// ignore errors
}
if (existsSync(src)) {
imageType = "rootfs";
}
@@ -292,12 +305,17 @@ export const getOSPackages = (src) => {
const args = [
imageType,
"--skip-db-update",
"--skip-java-db-update",
"--offline-scan",
"--skip-files",
"**/*.jar",
"--no-progress",
"--exit-code",
"0",
"--format",
"cyclonedx",
"--cache-dir",
trivyCacheDir,
"--output",
bomJsonFile
];
69 changes: 64 additions & 5 deletions docker.js
@@ -323,6 +323,9 @@ export const parseImageName = (fullImageName) => {
fullImageName = fullImageName.replace(":" + nameObj.tag, "");
}
}
if (fullImageName && fullImageName.startsWith("library/")) {
fullImageName = fullImageName.replace("library/", "");
}
// The left over string is the repo name
nameObj.repo = fullImageName;
return nameObj;
@@ -333,7 +336,9 @@ export const parseImageName = (fullImageName) => {
*/
export const getImage = async (fullImageName) => {
let localData = undefined;
let pullData = undefined;
const { repo, tag, digest } = parseImageName(fullImageName);
let repoWithTag = `${repo}:${tag !== "" ? tag : "latest"}`;
// Fetch only the latest tag if none is specified
if (tag === "" && digest === "") {
fullImageName = fullImageName + ":latest";
@@ -379,6 +384,14 @@ export const getImage = async (fullImageName) => {
}
}
}
try {
localData = await makeRequest(`images/${repoWithTag}/json`);
if (localData) {
return localData;
}
} catch (err) {
// ignore
}
try {
localData = await makeRequest(`images/${repo}/json`);
} catch (err) {
@@ -397,7 +410,7 @@ export const getImage = async (fullImageName) => {
}
// If the data is not available locally
try {
const pullData = await makeRequest(
pullData = await makeRequest(
`images/create?fromImage=${fullImageName}`,
"POST"
);
@@ -415,15 +428,42 @@ export const getImage = async (fullImageName) => {
return undefined;
}
} catch (err) {
// continue regardless of error
try {
if (DEBUG_MODE) {
console.log(`Re-trying the pull with the name ${repoWithTag}.`);
}
pullData = await makeRequest(
`images/create?fromImage=${repoWithTag}`,
"POST"
);
} catch (err) {
// continue regardless of error
}
}
try {
if (DEBUG_MODE) {
console.log(`Trying with ${repo}`);
console.log(`Trying with ${repoWithTag}`);
}
localData = await makeRequest(`images/${repoWithTag}/json`);
if (localData) {
return localData;
}
localData = await makeRequest(`images/${repo}/json`);
} catch (err) {
try {
if (DEBUG_MODE) {
console.log(`Trying with ${repo}`);
}
localData = await makeRequest(`images/${repo}/json`);
if (localData) {
return localData;
}
} catch (err) {
// continue regardless of error
}
try {
if (DEBUG_MODE) {
console.log(`Trying with ${fullImageName}`);
}
localData = await makeRequest(`images/${fullImageName}/json`);
} catch (err) {
// continue regardless of error
@@ -701,7 +741,26 @@ export const exportImage = async (fullImageName) => {
})
);
} catch (err) {
console.error(err);
if (localData && localData.Id) {
console.log(`Retrying with ${localData.Id}`);
try {
await stream.pipeline(
client.stream(`images/${localData.Id}/get`),
x({
sync: true,
preserveOwner: false,
noMtime: true,
noChmod: true,
strict: true,
C: tempDir,
portable: true,
onwarn: () => {}
})
);
} catch (err) {
console.log(err);
}
}
}
}
// Continue with extracting the layers
29 changes: 27 additions & 2 deletions docs/ADVANCED.md
@@ -133,13 +133,18 @@ Options:
directory. Useful to improve the recall for cal
lstack evidence. [boolean] [default: false]
--annotate Include contents of atom slices as annotations
[boolean] [default: true]
[boolean] [default: false]
--with-data-flow Enable inter-procedural data-flow slicing.
[boolean] [default: false]
--with-reachables Enable auto-tagged reachable slicing. Requires
SBOM generated with --deep mode.
[boolean] [default: false]
--usages-slices-file Use an existing usages slices file.
[default: "usages.slices.json"]
--data-flow-slices-file Use an existing data-flow slices file.
[default: "data-flow.slices.json"]
--reachables-slices-file Use an existing reachables slices file.
[default: "reachables.slices.json"]
-p, --print Print the evidences as table [boolean]
--version Show version number [boolean]
-h Show help [boolean]
@@ -151,18 +156,38 @@ To generate an SBOM with evidence for a java project.
evinse -i bom.json -o bom.evinse.json <path to the application>
```

By default, only occurrence evidences are determined by creating usages slices. To generate callstack evidence, pass `--with-data-flow`
By default, only occurrence evidences are determined by creating usages slices. To generate callstack evidence, pass either `--with-data-flow` or `--with-reachables`.

#### Reachability-based callstack evidence

atom supports reachability-based slicing for Java applications. This slicing mode has two prerequisites: the input SBOM must be generated in deep mode (with the `--deep` argument) and must be placed within the application directory.

```shell
cd <path to the application>
cdxgen -t java --deep -o bom.json .
evinse -i bom.json -o bom.evinse.json --with-reachables .
```

This is because

#### Data Flow based slicing

Often reachability cannot be computed reliably due to the presence of wrapper libraries or mitigating layers. In such cases, data-flow based slicing can be used to compute the callstack using a reverse reachability algorithm. This is, however, a time- and resource-intensive operation and might even require atom to be run externally in [java mode](https://cyclonedx.github.io/cdxgen/#/ADVANCED?id=use-atom-in-java-mode).

```shell
evinse -i bom.json -o bom.evinse.json --with-data-flow <path to the application>
```
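
The choice between the two callstack modes described above can be sketched as a small helper that assembles the evinse argument list. This is only an illustration: the function name `buildEvinseArgs` and its option names are hypothetical and not part of cdxgen; only the flags `-i`, `-o`, `-l`, `--with-reachables` and `--with-data-flow` are taken from the help text and examples in this document.

```javascript
// Hypothetical helper (not part of cdxgen): assembles an evinse argument
// list for one of the evidence modes described in this section.
function buildEvinseArgs({
  bomFile,
  outFile,
  appDir,
  language = "java",
  mode = "usages"
}) {
  const args = ["-i", bomFile, "-o", outFile, "-l", language];
  if (mode === "reachables") {
    // Requires an SBOM generated with --deep and placed inside appDir.
    args.push("--with-reachables");
  } else if (mode === "data-flow") {
    // Slower; intended for cases where reachability is unreliable.
    args.push("--with-data-flow");
  } else if (mode !== "usages") {
    throw new Error(`Unknown mode: ${mode}`);
  }
  // The application path is passed as the trailing positional argument.
  args.push(appDir);
  return args;
}
```

The resulting array could be handed to a process launcher such as `spawnSync("evinse", args)`. Leaving `mode` at its default produces occurrence evidence only.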

#### Performance tuning

To improve performance, you can cache the generated usages and data-flow slices files along with the bom file.

```shell
evinse -i bom.json -o bom.evinse.json --usages-slices-file usages.json --data-flow-slices-file data-flow.json --with-data-flow <path to the application>
```
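
As a rough sketch of the caching idea, the helper below turns a map of previously saved slices files into the matching evinse flags. The names `SLICE_FLAGS` and `sliceFileArgs` are hypothetical; the three flag names come from the options listed in the help text, and the assumption that the caller already knows which files exist is made for brevity.

```javascript
// Flag names taken from the evinse help text; the mapping itself is illustrative.
const SLICE_FLAGS = {
  usages: "--usages-slices-file",
  "data-flow": "--data-flow-slices-file",
  reachables: "--reachables-slices-file"
};

// Hypothetical helper: returns the extra arguments for every slices file
// the caller has cached, skipping kinds with no cached file.
function sliceFileArgs(cachedFiles) {
  const args = [];
  for (const [kind, flag] of Object.entries(SLICE_FLAGS)) {
    if (cachedFiles[kind]) {
      args.push(flag, cachedFiles[kind]);
    }
  }
  return args;
}
```

For example, `sliceFileArgs({ usages: "usages.json", "data-flow": "data-flow.json" })` yields the two slices-file flag pairs used in the command above, which can then be appended to the rest of the evinse arguments.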

#### Other languages

For JavaScript or TypeScript projects, pass `-l javascript`.

```shell
2 changes: 2 additions & 0 deletions docs/ENV.md
@@ -25,6 +25,7 @@ The following environment variables are available to configure the bom generatio
| FETCH_LICENSE | Set this variable to `true` or `1` to fetch license information from the registry. npm and golang |
| USE_GOSUM | Set to `true` or `1` to generate BOMs for golang projects using go.sum as the dependency source of truth, instead of go.mod |
| CDXGEN_TIMEOUT_MS | Default timeout for known execution involving maven, gradle or sbt |
| CDXGEN_MAX_BUFFER | Max buffer for stdout and stderr. Defaults to 100MB |
| CDXGEN_SERVER_TIMEOUT_MS | Default timeout in server mode |
| CLJ_CMD | Set to override the clojure cli command |
| LEIN_CMD | Set to override the leiningen command |
@@ -36,3 +37,4 @@ The following environment variables are available to configure the bom generatio
| CDX_MAVEN_INCLUDE_TEST_SCOPE | Whether test scoped dependencies should be included from Maven projects, Default: true |
| ASTGEN_IGNORE_DIRS | Comma separated list of directories to ignore while analyzing using babel. The environment variable is also used by atom and astgen. |
| ASTGEN_IGNORE_FILE_PATTERN | Ignore regex to use |
| PYPI_URL | Override pypi url. Default: https://pypi.org/pypi/ |