JS reachables #660

Merged
merged 4 commits into from
Oct 23, 2023
6 changes: 5 additions & 1 deletion .github/workflows/repotests.yml
Expand Up @@ -147,11 +147,14 @@ jobs:
- uses: actions/checkout@v4
with:
repository: 'hoolicorp/java-sec-code'
path: 'repotests/java-sec-code'
path: 'repotests/java-sec-code'
- uses: dtolnay/rust-toolchain@stable
- name: repotests
run: |
bin/cdxgen.js -p -t js --no-recurse -o bom.json .
bin/evinse.js -l js -i bom.json -o bom.evinse.json --with-reachables .
bin/cdxgen.js -p -t java repotests/java-sec-code -o bomresults/bom-java-sec-code.json
bin/cdxgen.js -p -t java --author foo --author bar repotests/java-sec-code -o bomresults/bom-java-sec-code.json
bin/cdxgen.js -p -t java repotests/java-sec-code -o bomresults/bom-java-sec-code.json --required-only
bin/cdxgen.js -p -t java repotests/java-sec-code -o bomresults/bom-java-sec-code.json --filter postgres --filter json
bin/cdxgen.js -p -t java repotests/java-sec-code -o bomresults/bom-java-sec-code.json --only spring
Expand All @@ -166,6 +169,7 @@ jobs:
node bin/evinse.js -i bomresults/bom-vue.json -o bomresults/bom-vue.evinse.json -l javascript --with-data-flow -p repotests/meetingsdk-vuejs-sample
CDXGEN_DEBUG_MODE=debug ASTGEN_IGNORE_DIRS="" FETCH_LICENSE=false bin/cdxgen.js -p -t js repotests/sveltejs-examples -o bomresults/bom-svelte.json
CDXGEN_DEBUG_MODE=debug ASTGEN_IGNORE_DIRS="" node bin/evinse.js -i bomresults/bom-svelte.json -o bomresults/bom-svelte.evinse.json -l javascript --with-data-flow -p repotests/sveltejs-examples
CDXGEN_DEBUG_MODE=debug ASTGEN_IGNORE_DIRS="" node bin/evinse.js -i bomresults/bom-svelte.json -o bomresults/bom-svelte.evinse.json -l javascript --with-reachables -p repotests/sveltejs-examples
FETCH_LICENSE=1 bin/cdxgen.js -p -t js repotests/shiftleft-ts-example --required-only -o bomresults/bom-ts.json --validate
FETCH_LICENSE=false bin/cdxgen.js -p -r -t go repotests/shiftleft-go-example -o bomresults/bom-go.json --validate
FETCH_LICENSE=true bin/cdxgen.js -p -r -t csharp repotests/vulnerable_net_core -o bomresults/bom-csharp2.json --validate
71 changes: 37 additions & 34 deletions README.md

Large diffs are not rendered by default.

8 changes: 8 additions & 0 deletions bin/cdxgen.js
Expand Up @@ -143,6 +143,7 @@ const args = yargs(hideBin(process.argv))
"Validate the generated SBOM using json schema. Defaults to true. Pass --no-validate to disable."
})
.option("evidence", {
hidden: true,
type: "boolean",
default: false,
description: "Generate SBOM with evidence for supported languages. WIP"
Expand All @@ -165,8 +166,15 @@ const args = yargs(hideBin(process.argv))
description:
"Include components only containing this word in purl. Useful to generate BOM with first party components alone. Multiple values allowed."
})
.option("author", {
description:
"The person(s) who created the BOM. Set this value if you intend to modify the BOM and claim authorship.",
default: "OWASP Foundation"
})
.completion("completion", "Generate bash/zsh completion")
.array("filter")
.array("only")
.array("author")
.option("auto-compositions", {
type: "boolean",
default: true,
37 changes: 33 additions & 4 deletions bin/evinse.js
Expand Up @@ -9,7 +9,12 @@ import { homedir, platform as _platform } from "node:os";
import process from "node:process";
import { analyzeProject, createEvinseFile, prepareDB } from "../evinser.js";
import { validateBom } from "../validator.js";
import { printCallStack, printOccurrences, printServices } from "../display.js";
import {
printCallStack,
printOccurrences,
printServices,
printReachables
} from "../display.js";
import { findUpSync } from "find-up";
import { load as _load } from "js-yaml";

Expand Down Expand Up @@ -65,7 +70,18 @@ const args = yargs(hideBin(process.argv))
alias: "l",
description: "Application language",
default: "java",
choices: ["java", "jar", "javascript", "python", "android", "cpp"]
choices: [
"java",
"jar",
"js",
"ts",
"javascript",
"py",
"python",
"android",
"c",
"cpp"
]
})
.option("db-path", {
description: `Atom slices DB path. Default ${ATOM_DB}`,
Expand Down Expand Up @@ -121,6 +137,18 @@ const args = yargs(hideBin(process.argv))
type: "boolean",
description: "Print the evidences as table"
})
.example([
[
"$0 -i bom.json -o bom.evinse.json -l java .",
"Generate a Java SBOM with evidence for the current directory"
],
[
"$0 -i bom.json -o bom.evinse.json -l java --with-reachables .",
"Generate a Java SBOM with occurrence and reachable evidence for the current directory"
]
])
.completion("completion", "Generate bash/zsh completion")
.epilogue("for documentation, visit https://cyclonedx.github.io/cdxgen")
.config(config)
.scriptName("evinse")
.version()
Expand All @@ -129,8 +157,8 @@ const args = yargs(hideBin(process.argv))
const evinseArt = `
███████╗██╗ ██╗██╗███╗ ██╗███████╗███████╗
██╔════╝██║ ██║██║████╗ ██║██╔════╝██╔════╝
█████╗ ██║ ██║██║██╔██╗ ██║███████╗█████╗
██╔══╝ ╚██╗ ██╔╝██║██║╚██╗██║╚════██║██╔══╝
█████╗ ██║ ██║██║██╔██╗ ██║███████╗█████╗
██╔══╝ ╚██╗ ██╔╝██║██║╚██╗██║╚════██║██╔══╝
███████╗ ╚████╔╝ ██║██║ ╚████║███████║███████╗
╚══════╝ ╚═══╝ ╚═╝╚═╝ ╚═══╝╚══════╝╚══════╝
`;
Expand All @@ -151,6 +179,7 @@ console.log(evinseArt);
if (args.print) {
printOccurrences(bomJson);
printCallStack(bomJson);
printReachables(sliceArtefacts);
printServices(bomJson);
}
}
2 changes: 2 additions & 0 deletions bin/verify.js
Expand Up @@ -24,6 +24,8 @@ const args = yargs(hideBin(process.argv))
default: "public.key",
description: "Public key in PEM format. Default public.key"
})
.completion("completion", "Generate bash/zsh completion")
.epilogue("for documentation, visit https://cyclonedx.github.io/cdxgen")
.scriptName("cdx-verify")
.version()
.help("h").argv;
1 change: 1 addition & 0 deletions data/README.md
Expand Up @@ -18,3 +18,4 @@ Contents of data directory and their purpose.
| spdx.schema.json | jsonschema for validation |
| vendor-alias.json | List to correct the group names. Used while parsing .jar files |
| wrapdb-releases.json | Database of all available meson wraps. Generated using contrib/wrapdb.py. |
| frameworks-list.json | List of string fragments to categorize components into frameworks |
128 changes: 128 additions & 0 deletions data/frameworks-list.json
@@ -0,0 +1,128 @@
{
"all": [
"System.Web",
"System.ServiceModel",
"System.Data",
"spring",
"flask",
"django",
"beego",
"chi",
"echo",
"github.com/gin-gonic/gin",
"gorilla",
"rye",
"httprouter",
"akka",
"dropwizard",
"vertx",
"gwt",
"jax-rs",
"jax-ws",
"jsf",
"play",
"spark",
"struts",
"angular",
"react",
"next",
"ember",
"express",
"knex",
"vue",
"aiohttp",
"bottle",
"cherrypy",
"drt",
"falcon",
"hug",
"pyramid",
"sanic",
"tornado",
"vibora",
"koa",
"-sdk",
"org.apache",
"appfuse",
"drools",
"jbpm",
"activiti",
"barracuda",
"birt",
"biojava",
"bluecove",
"bouncycastle",
"cascading",
"deeplearning4j",
"eclipselink",
"geoapi",
"geotools",
"hibernate",
"hsqldb",
"ibatis",
"javassist",
"jersey",
"jetty",
"jfreechart",
"jhipster",
"jmonkeyengine",
"jsf",
"keycloak",
"liquibase",
"lwjgl",
"micronaut",
"mybatis",
"netty",
"neuroph",
"opencv",
"orientdb",
"ormlite",
"payara",
"primefaces",
"quarkus",
"quartz",
"sax",
"slf4j",
"jasper",
"spock",
"thymeleaf",
"vaadin",
"vertx",
"wildfly",
"zkoss",
"org.ow2.asm",
"backbone",
"dojo",
"ember",
"enyo",
"extjs",
"jquery",
"jqwidgets",
"knockout",
"mootools",
"prototypejs",
"qooxdoo",
"openui5",
"solidjs",
"sproutcore",
"svelte",
"wakanda",
"webix",
"github.com/aerogo/aero",
"github.com/aofei/air",
"github.com/go-the-way/anoweb",
"github.com/appist/appy",
"github.com/ungerik/go-rest",
"goa.design/goa",
"github.com/aceld/zinx",
"github.com/dolab/gogo",
"github.com/yarf-framework/yarf",
"github.com/norunners/vert",
"pkg:cargo/rocket",
"pkg:cargo/actix",
"pkg:cargo/nickel",
"pkg:cargo/yew",
"pkg:cargo/azul",
"pkg:cargo/conrod"
]
}
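The data README describes this file as a list of string fragments used to categorize components into frameworks. The diff shown here does not include the matching logic, so the following is only a minimal sketch, assuming a case-insensitive substring match against the component purl. The helper name `looksLikeFramework` and the shortened fragment list are hypothetical.

```javascript
// Hypothetical helper: not taken from the cdxgen source.
// A small excerpt of the fragments listed in data/frameworks-list.json.
const frameworkFragments = ["spring", "express", "pkg:cargo/rocket", "-sdk"];

// Returns true when the purl contains any fragment
// (assumption: case-insensitive substring match).
function looksLikeFramework(purl, fragments) {
  const lowered = purl.toLowerCase();
  return fragments.some((fragment) => lowered.includes(fragment.toLowerCase()));
}

console.log(looksLikeFramework("pkg:npm/express@4.18.2", frameworkFragments));
console.log(looksLikeFramework("pkg:npm/lodash@4.17.21", frameworkFragments));
```

Under plain substring matching, short fragments such as "chi", "next", or "play" would also match unrelated package names (for example, any purl containing "machine"), so a real implementation may need stricter rules.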
34 changes: 34 additions & 0 deletions display.js
@@ -1,3 +1,4 @@
import { existsSync, readFileSync } from "fs";
import { createStream, table } from "table";

// https://github.com/yangshun/tree-node-cli/blob/master/src/index.js
Expand Down Expand Up @@ -277,3 +278,36 @@ const recursePrint = (depMap, subtree, level, shownList, treeGraphics) => {
}
}
};

export const printReachables = (sliceArtefacts) => {
  const reachablesSlicesFile = sliceArtefacts.reachablesSlicesFile;
  // Nothing to print when the reachables slices file was not generated
  if (!existsSync(reachablesSlicesFile)) {
    return;
  }
  // Count how many reachable flows mention each purl
  const purlCounts = {};
  const reachablesSlices = JSON.parse(
    readFileSync(reachablesSlicesFile, "utf-8")
  );
  for (const areachable of reachablesSlices.reachables || []) {
    const purls = areachable.purls || [];
    for (const apurl of purls) {
      purlCounts[apurl] = (purlCounts[apurl] || 0) + 1;
    }
  }
  // Sort the purls in descending order of flow count
  const sortedPurls = Object.fromEntries(
    Object.entries(purlCounts).sort(([, a], [, b]) => b - a)
  );
  const data = [["Package URL", "Reachable Flows"]];
  for (const apurl of Object.keys(sortedPurls)) {
    data.push([apurl, "" + sortedPurls[apurl]]);
  }
  const config = {
    header: {
      alignment: "center",
      content: "Reachable Components\nGenerated with \u2665 by cdxgen"
    }
  };
  // Print the table only when there is at least one data row
  if (data.length > 1) {
    console.log(table(data, config));
  }
};
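The counting step in `printReachables` can be tried in isolation, without the file read or the `table` dependency. The sketch below assumes the same `{ reachables: [{ purls: [...] }] }` shape the function reads; the sample data and the helper name `countReachablePurls` are hypothetical.

```javascript
// Hypothetical, trimmed-down stand-in for the parsed reachables slices file.
const sampleSlices = {
  reachables: [
    { purls: ["pkg:npm/express@4.18.2", "pkg:npm/yargs@17.7.2"] },
    { purls: ["pkg:npm/express@4.18.2"] },
    { purls: [] },
    {} // a flow without a purls key is tolerated
  ]
};

// Count how many reachable flows mention each purl, most frequent first.
function countReachablePurls(slices) {
  const counts = new Map();
  for (const flow of slices.reachables || []) {
    for (const purl of flow.purls || []) {
      counts.set(purl, (counts.get(purl) || 0) + 1);
    }
  }
  // Return [purl, count] pairs sorted by count in descending order.
  return [...counts.entries()].sort(([, a], [, b]) => b - a);
}

console.log(countReachablePurls(sampleSlices));
```

A `Map` is used here instead of a plain object only to make the ordering explicit; the function in the diff relies on object insertion order instead.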
49 changes: 38 additions & 11 deletions docs/ADVANCED.md
Expand Up @@ -158,21 +158,29 @@ evinse -i bom.json -o bom.evinse.json <path to the application>

By default, only occurrence evidences are determined by creating usages slices. To generate callstack evidence, pass either `--with-data-flow` or `--with-reachables`.

#### Reachability-based callstack evidence
#### Reachability-based call stack evidence

atom supports reachability-based slicing for Java applications. Two necessary prerequisites for this slicing mode are that the input SBOM must be generated in deep mode (with --deep argument) and must be placed within the application directory.
atom supports reachability-based evidence generation for Java, JavaScript, and TypeScript applications. Reachability refers to data flows that originate from entry points (sources) and end at sinks (invocations of external libraries). The technique used is called "Forward-Reachability".

This slicing mode has two prerequisites: the input SBOM must be generated with cdxgen (in deep mode for the java and jar types), and it must be placed within the application directory.

```shell
cd <path to the application>
cdxgen -t java --deep -o bom.json .
evinse -i bom.json -o bom.evinse.json --with-reachables .
evinse -i bom.json -o bom.evinse.json -l java --with-reachables .
```

This is because
For JavaScript and TypeScript applications, deep mode is optional.

```shell
cd <path to the application>
cdxgen -t js -o bom.json .
evinse -i bom.json -o bom.evinse.json -l js --with-reachables .
```
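To make the source-to-sink idea concrete, here is a toy illustration of forward reachability as a breadth-first search over a small flow graph. It is not atom's algorithm or data model; the graph, the node names, and the `lib:` prefix used to mark external library calls are all assumptions made for this sketch.

```javascript
// Toy flow graph: node -> nodes it flows into. All names are hypothetical.
const flowGraph = {
  "routes.handleUpload": ["utils.parseBody"],
  "utils.parseBody": ["lib:body-parser.json"],
  "jobs.cleanup": ["lib:rimraf.sync"],
  "lib:body-parser.json": [],
  "lib:rimraf.sync": []
};

// Breadth-first search from the given sources; returns the sinks that were reached.
function reachableSinks(graph, sources, isSink) {
  const visited = new Set(sources);
  const queue = [...sources];
  const found = [];
  while (queue.length > 0) {
    const node = queue.shift();
    if (isSink(node)) {
      found.push(node);
    }
    for (const next of graph[node] || []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push(next);
      }
    }
  }
  return found;
}

// Only routes.handleUpload is treated as an entry point in this example.
const reached = reachableSinks(flowGraph, ["routes.handleUpload"], (node) =>
  node.startsWith("lib:")
);
console.log(reached);
```

In this toy graph the `lib:rimraf.sync` call is not reported, because `jobs.cleanup` is not listed as an entry point. This mirrors why a module with sinks but no sources yields no forward-reachable flows.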

#### Data Flow based slicing
#### Data flow-based call stack evidence

Often reachability cannot be computed reliably due to the presence of wrapper libraries or mitigating layers. In such cases, data-flow based slicing can be used to compute callstack using a reverse reachability algorithm. This is however a time and resource-consuming operation and might even require atom to be run externally in [java mode](https://cyclonedx.github.io/cdxgen/#/ADVANCED?id=use-atom-in-java-mode).
Reachability often cannot be computed reliably due to the presence of wrapper libraries or mitigating layers. Further, the repository being analyzed could be a common module containing only the sink methods without entry points (sources). In such cases, data-flow-based slicing can be used to compute the call stack using a "Reverse-Reachability" algorithm. However, this operation is time- and resource-intensive and might even require atom to be run externally in [java mode](https://cyclonedx.github.io/cdxgen/#/ADVANCED?id=use-atom-in-java-mode).

```shell
evinse -i bom.json -o bom.evinse.json --with-data-flow <path to the application>
Expand Down Expand Up @@ -241,14 +249,13 @@ cdxgen -t docker -o bom.json <image name>
Why not?

```shell
cdxgen -t js -o bom.json -p --no-recurse
evinse -i bom.json -o bom.evinse.json -l javascript
cdxgen -t js -o bom.json -p --no-recurse .
evinse -i bom.json -o bom.evinse.json -l javascript --with-reachables .

# Don't be surprised to see the service endpoint offered by cdxgen!
# Don't be surprised to see the service endpoint offered by cdxgen.
# Review the reachables.slices.json and file any vulnerabilities or bugs!
```

It is currently not possible to generate data-flow evidence for cdxgen in constant time since the graph is too large for pre-computation. If you have experience with source code analysis, please suggest some improvements to the [atom](https://github.com/AppThreat/atom) project.

## Use Atom in Java mode

For large projects (> 1 million lines of code), atom must be invoked separately for the slicing operation. Follow the instructions below.
Expand Down Expand Up @@ -284,3 +291,23 @@ ATOM_DB = join(homedir(), "AppData", "Local", ".atomdb");
// Mac
ATOM_DB = join(homedir(), "Library", "Application Support", ".atomdb");
```

## Customize metadata.authors in BOM

Use the argument `--author` to override the author name. Use double quotes when the name includes spaces. Multiple values are allowed.

```shell
cdxgen --author "OWASP Foundation" --author "Apache Foundation" -t java ...
```
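The diff shown here adds the `--author` option but not the code that writes it into the BOM. As a minimal sketch, assuming each value ends up as an entry with a `name` field under `metadata.authors`, the mapping could look like the following. The helper name `toMetadataAuthors` is hypothetical and is not part of cdxgen.

```javascript
// Hypothetical helper: maps --author values to author entries with a name field.
function toMetadataAuthors(authorOption) {
  // The option may arrive as a single string or an array; normalise to an array.
  const names = Array.isArray(authorOption) ? authorOption : [authorOption];
  return names
    .filter((name) => typeof name === "string" && name.trim().length > 0)
    .map((name) => ({ name: name.trim() }));
}

console.log(toMetadataAuthors(["OWASP Foundation", "Apache Foundation"]));
console.log(toMetadataAuthors("OWASP Foundation"));
```

Accepting both a string and an array keeps the sketch tolerant of the default value, which the option declares as the single string "OWASP Foundation".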

## Generate bash/zsh command completions

Run a command such as cdxgen or evinse with `completion` as the argument.

```shell
cdxgen completion >> ~/.zshrc

# cdxgen completion >> ~/.bashrc

# evinse completion >> ~/.zshrc
```
3 changes: 3 additions & 0 deletions docs/CLI.md
Expand Up @@ -119,6 +119,9 @@ Options:
--only Include components only containing this word in
purl. Useful to generate BOM with first party
components alone. Multiple values allowed. [array]
--author The person(s) who created the BOM. Set this value
if you intend to modify the BOM and claim
authorship. [array] [default: "OWASP Foundation"]
--auto-compositions Automatically set compositions when the BOM was
filtered. Defaults to true
[boolean] [default: true]