Trim science pack. Support for conda (#18)
Conda build support

Signed-off-by: Prabhu Subramanian <prabhu@appthreat.com>
prabhu authored Oct 15, 2023
1 parent c47125d commit 5daa48a
Showing 19 changed files with 312 additions and 655 deletions.
7 changes: 7 additions & 0 deletions .github/workflows/pr.yml
@@ -50,6 +50,13 @@ jobs:
python3 -m poetry config virtualenvs.create false
python3 -m poetry install --no-cache
poetry run flake8 chenpy --count --select=E9,F63,F7,F82 --show-source --statistics
- name: Test conda
if: runner.os == 'Linux'
run: |
# Test conda build
$CONDA/bin/conda update -n base -c defaults conda
$CONDA/bin/conda install anaconda-client conda-build
$CONDA/bin/conda build --output-folder ./conda-out/ ./conda/
- uses: actions/cache@v3
with:
path: |
6 changes: 6 additions & 0 deletions .github/workflows/release.yml
@@ -57,9 +57,15 @@ jobs:
if: startsWith(github.ref, 'refs/tags/')
run: |
poetry publish --build --username $PYPI_USERNAME --password $PYPI_PASSWORD
$CONDA/bin/conda update -n base -c defaults conda
$CONDA/bin/conda token set $ANACONDA_TOKEN
$CONDA/bin/conda config --set anaconda_upload yes
$CONDA/bin/conda build --output-folder ./conda-out/ ./conda/
env:
PYPI_USERNAME: ${{ secrets.PYPI_USERNAME }}
PYPI_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
ANACONDA_TOKEN: ${{ secrets.ANACONDA_TOKEN }}
continue-on-error: true
- run: sha512sum target/chen.zip > target/chen.zip.sha512
- name: Create Release
if: startsWith(github.ref, 'refs/tags/')
1 change: 1 addition & 0 deletions .gitignore
@@ -55,3 +55,4 @@ chen.zip
**/__pycache__/
.vscode/
project/metals.sbt
conda-out/
73 changes: 68 additions & 5 deletions README.md
@@ -13,23 +13,55 @@ Code Hierarchy Exploration Net (chen) is an advanced exploration toolkit for you

- Rust (For rocksdb-py compilation)

-## Installation
+## Getting started

The chen container image has everything needed to get started.

### Interactive console

To start the interactive console, run the `chennai` command.

```shell
docker run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -it ghcr.io/appthreat/chen chennai
```

### Jupyter notebook server

```shell
docker run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -it ghcr.io/appthreat/chen jupyter notebook --ip 0.0.0.0 --port 9000 --no-browser --allow-root
```

### Chennai server mode

`chennai` can also be run as an HTTP server.

```shell
docker run --rm -v /tmp:/tmp -v $HOME:$HOME -v $(pwd):/app:rw -p 8080:8080 -it ghcr.io/appthreat/chen chennai --server
```

**Defaults:**

- Port 8080
- Username chenadmin
- Password chenpassword
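
A minimal client-side sketch of how these defaults might be used. It assumes the server accepts HTTP Basic authentication, which this README does not state, and it deliberately names no endpoint paths. `basic_auth_header` and `server_url` are hypothetical helpers, not part of chenpy.

```python
import base64

# Defaults taken from the list above; change them if the server was
# started with different settings.
DEFAULT_PORT = 8080
DEFAULT_USER = "chenadmin"
DEFAULT_PASSWORD = "chenpassword"


def basic_auth_header(username: str = DEFAULT_USER,
                      password: str = DEFAULT_PASSWORD) -> dict:
    """Build an HTTP Basic Authorization header (assumed auth scheme)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def server_url(host: str = "localhost", port: int = DEFAULT_PORT) -> str:
    """Base URL of the server; endpoint paths are intentionally left out."""
    return f"http://{host}:{port}"
```

The header dictionary can be passed to any HTTP client. The default credentials are best replaced before exposing the port beyond localhost.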

## Local Installation

```shell
-# Install atom
-sudo npm install -g @appthreat/atom
+# Install atom and cdxgen
+sudo npm install -g @appthreat/atom @cyclonedx/cdxgen --omit=optional

# Install chen from pypi
pip install appthreat-chen
```

-To download the chen distribution including the science pack.
+To download the chen distribution.

```shell
chen --download
```

-To generate custom graphs and models with atom for data science, download the scientific pack which installs support for PyTorch ecosystem.
+To generate custom graphs and models with atom for data science, download the scientific pack which installs support for PyTorch ecosystem. [conda](https://docs.conda.io/projects/conda/en/stable/user-guide/install/index.html) is recommended for best experience.

```shell
chen --download --with-science
@@ -40,11 +72,14 @@ Once the download finishes, the command will display the download location along
```shell
[21:53:36] INFO To run chennai console, add the following environment variables to your .zshrc or .bashrc:
export JAVA_OPTS="-Xmx16G"
export JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF-8 -Djna.library.path=<lib dir>"
export SCALAPY_PYTHON_LIBRARY=python3.10
export CHEN_HOME=/home/user/.local/share/chen
export PATH=$PATH:/home/user/.local/share/chen/platform:/home/user/.local/share/chen/platform/bin:
```

It is important to set these environment variables; without them, the console commands fail with errors.
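
One way to confirm the variables are in place before launching the console is a small pre-flight check. This is a sketch, not part of chenpy: `missing_chen_env` and `platform_on_path` are hypothetical helpers, and the variable list simply mirrors the export lines printed above.

```python
import os

# Variable names mirror the export lines printed by `chen --download`.
REQUIRED_VARS = (
    "JAVA_OPTS",
    "JAVA_TOOL_OPTIONS",
    "SCALAPY_PYTHON_LIBRARY",
    "CHEN_HOME",
)


def missing_chen_env(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


def platform_on_path(environ=None):
    """Check that $CHEN_HOME/platform appears as an entry in PATH."""
    env = os.environ if environ is None else environ
    chen_home = env.get("CHEN_HOME", "")
    if not chen_home:
        return False
    platform_dir = os.path.normpath(os.path.join(chen_home, "platform"))
    entries = [
        os.path.normpath(entry)
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry
    ]
    return platform_dir in entries
```

Calling `missing_chen_env()` with no arguments inspects the current process environment; an empty list means all four variables are set.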

## Running the console

Type `chennai` to launch the console.
@@ -110,6 +145,34 @@ Refer to the documentation site to learn more about the commands.
- TypeScript
- Python
## Troubleshooting
### Commands throw errors in chennai console
You might see errors like this in the chennai console.
```shell
chennai> help
-- [E006] Not Found Error: -----------------------------------------------------
1 |help
|^^^^
|Not found: help
|-----------------------------------------------------------------------------
| Explanation (enabled by `-explain`)
|- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| The identifier for `help` is not bound, that is,
| no declaration for this identifier can be found.
| That can happen, for example, if `help` or its declaration has either been
| misspelt or if an import is missing.
-----------------------------------------------------------------------------
1 error found
```
This error is usually caused by a missing Python shared library: a .so (Linux), .dll (Windows) or .dylib (macOS) file. Ensure the two environment variables below are set correctly.
- SCALAPY_PYTHON_LIBRARY - Use values such as python3.10 or python3.11, based on the installed version. On Windows the name has no dot, e.g. python311.
- JAVA_TOOL_OPTIONS - jna.library.path must be set to the python lib directory
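
The two settings can be sanity-checked from Python. The sketch below is illustrative only: `scalapy_library_name` and `jna_library_path` are hypothetical helpers, and the naming rule is taken from the bullet above rather than from ScalaPy's documentation.

```python
import re
import sys


def scalapy_library_name(version_info=None, platform=None):
    """Expected SCALAPY_PYTHON_LIBRARY value for an interpreter.

    python3.11 on Linux/macOS, python311 on Windows (no dot).
    """
    major, minor = (version_info or sys.version_info)[:2]
    plat = platform or sys.platform
    separator = "" if plat == "win32" else "."
    return f"python{major}{separator}{minor}"


def jna_library_path(java_tool_options):
    """Extract the -Djna.library.path value from JAVA_TOOL_OPTIONS.

    Returns None when the option is absent. Paths containing spaces
    are not handled by this simple pattern.
    """
    match = re.search(r"-Djna\.library\.path=(\S+)", java_tool_options or "")
    return match.group(1) if match else None
```

Comparing `scalapy_library_name()` with the exported `SCALAPY_PYTHON_LIBRARY`, and checking that the directory returned by `jna_library_path(os.environ.get("JAVA_TOOL_OPTIONS"))` exists, covers the two most likely misconfigurations.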
## Origin of chen
chen is a fork of the popular [joern](https://github.com/joernio/joern) project. We deviate from the joern project in the following ways:
2 changes: 1 addition & 1 deletion build.sbt
@@ -1,6 +1,6 @@
name := "chen"
ThisBuild / organization := "io.appthreat"
-ThisBuild / version := "0.0.17"
+ThisBuild / version := "0.0.18"
ThisBuild / scalaVersion := "3.3.1"

val cpgVersion = "1.4.22"
114 changes: 91 additions & 23 deletions chenpy/cli.py
@@ -3,15 +3,17 @@

import argparse
import os
import shutil
import subprocess
import sys

import oras.client
from rich.progress import Progress

import chenpy.config as config
from chenpy.client import ChenDistributionRegistry
from chenpy.logger import LOG, console
-from chenpy.utils import USE_SHELL, get_version, max_memory, unzip_unsafe
+from chenpy.utils import USE_SHELL, check_command, get_version, max_memory, unzip_unsafe

try:
os.environ["PYTHONIOENCODING"] = "utf-8"
@@ -75,6 +77,9 @@ def fix_envs():
py_version = py_version.replace(".", "")
platform_dir = os.path.join(config.chen_home, "platform")
platform_bin_dir = os.path.join(config.chen_home, "platform", "bin")
pylib_dir, lpy_version = detect_python_lib_path()
if lpy_version and py_version != lpy_version:
py_version = lpy_version
if sys.platform == "win32":
LOG.info(
"To run chennai console, set the following user environment variables:"
@@ -84,21 +89,19 @@ def fix_envs():
"To run chennai console, add the following environment variables to your .zshrc or .bashrc:"
)
console.print(
-f"""export JAVA_OPTS="-Xmx{max_memory}"\nexport SCALAPY_PYTHON_LIBRARY={py_version}\nexport CHEN_HOME={config.chen_home}\nexport PATH=$PATH{os.pathsep}{platform_dir + os.pathsep + platform_bin_dir + os.pathsep}"""
+f'export JAVA_OPTS="-Xmx{max_memory}"\nexport JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF-8 -Djna.library.path={pylib_dir}"\nexport SCALAPY_PYTHON_LIBRARY={py_version}\nexport CHEN_HOME="{config.chen_home}"\nexport PATH=$PATH{os.pathsep}"{platform_dir + os.pathsep + platform_bin_dir + os.pathsep}"'
)
if not os.getenv("JAVA_HOME"):
LOG.info(
-"Ensure Java >= 17 up to 20 is installed. Set the environment variable JAVA_HOME to point the correct java directory."
-)
-try:
-import networkx
-except Exception:
-LOG.info(
-"Scientific dependencies missing. Please refer to the documentation to install the science pack or use the official chen container image."
+"Ensure Java >= 17 up to 20 is installed. Set the environment variable JAVA_HOME to point the correct "
+"java directory."
)
LOG.info(
"After setting the values, restart the terminal and type chennai to launch the console."
)
os.environ[
"JAVA_TOOL_OPTIONS"
] = f'"-Dfile.encoding=UTF-8 -Djna.library.path={pylib_dir}"'
os.environ["SCALAPY_PYTHON_LIBRARY"] = py_version
os.environ["CHEN_HOME"] = config.chen_home
os.environ["CLASSPATH"] = (
@@ -114,14 +117,17 @@ def fix_envs():
def download_chen_distribution(overwrite=False, science_pack=False):
if os.path.exists(os.path.join(config.chen_home, "platform")):
if not overwrite:
if science_pack:
install_py_modules("science")
fix_envs()
return
LOG.debug(
"Existing chen distribution at %s would be overwritten", config.chen_home
)
-LOG.debug(
+LOG.info(
"About to download chen distribution from %s", config.chen_distribution_url
)
req_files = []
oras_client = oras.client.OrasClient(registry=ChenDistributionRegistry())
paths_list = oras_client.pull(
target=config.chen_distribution_url,
@@ -155,26 +161,88 @@ def download_chen_distribution(overwrite=False, science_pack=False):
os.chmod(os.path.join(dirname, filename), 0o755)
except Exception:
pass
-# Install the science pack
-if req_files:
-install_py_modules("database")
-if science_pack:
-install_py_modules("science")
-fix_envs()
+# Install the science pack
+if req_files:
+install_py_modules("database")
+if science_pack:
+install_py_modules("science")
+fix_envs()


def install_py_modules(pack="database"):
"""
Install the required science modules
"""
-LOG.debug("About to install the science pack using cpu-only configuration")
-req_file = os.path.join(config.chen_home, f"chen-{pack}-requirements.txt")
-if os.path.exists(req_file):
-subprocess.check_call(
-[sys.executable, "-m", "pip", "install", "-r", req_file],
-stdout=subprocess.DEVNULL,
-shell=USE_SHELL,
if check_command("conda"):
LOG.info(
"About to install the science pack using conda. A new environment called 'chenpy-local' will be created."
)
with Progress(transient=True) as progress:
conda_install_script = """conda create --name chenpy-local python=3.11 -y
conda install -n chenpy-local conda-libmamba-solver -y
conda install -n chenpy-local -c conda-forge networkx --solver=libmamba -y
conda install -n chenpy-local -c pytorch pytorch torchtext cpuonly --solver=libmamba -y
pip install pyg_lib -f https://data.pyg.org/whl/torch-2.1.0+cpu.html
conda install -n chenpy-local -c conda-forge packageurl-python nbconvert jupyter_core jupyter_client notebook --solver=libmamba -y
conda install -n chenpy-local -c conda-forge httpx websockets orjson rich appdirs psutil gitpython --solver=libmamba -y"""
for line in conda_install_script.split("\n"):
if line.strip():
task = progress.add_task(line, start=False, total=100)
subprocess.check_call(
line.split(" "),
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
shell=USE_SHELL,
)
progress.stop_task(task)
progress.update(task, completed=100)
LOG.info(
"Activate the conda environment using the command 'conda activate chenpy-local'"
)
else:
LOG.info("About to install the science pack using cpu-only configuration")
req_file = os.path.join(config.chen_home, f"chen-{pack}-requirements.txt")
if os.path.exists(req_file):
subprocess.check_call(
[sys.executable, "-m", "pip", "install", "-r", req_file],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
shell=USE_SHELL,
)
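
A note on the conda script above: each line is split on single spaces and run as its own process, and the `pip install pyg_lib` line names no environment, so it appears to install into whichever `pip` is first on `PATH` rather than into `chenpy-local`. The sketch below shows one way such a script could be tokenised with `shlex` and checked for lines that do not target the new environment. `script_to_commands` and `targets_env` are hypothetical helpers, and nothing here invokes conda.

```python
import shlex


def script_to_commands(script):
    """Split a multi-line script into argv lists.

    Blank lines and comment lines are skipped; shlex handles quoting
    and repeated whitespace, which str.split(" ") does not.
    """
    commands = []
    for line in script.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        commands.append(shlex.split(line))
    return commands


def targets_env(argv, env_name):
    """True if the command names env_name via -n or --name."""
    for flag in ("-n", "--name"):
        if flag in argv:
            index = argv.index(flag)
            return index + 1 < len(argv) and argv[index + 1] == env_name
    return False
```

Filtering the parsed commands with `targets_env(cmd, "chenpy-local")` would flag the `pip` line for review before anything is executed.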


def detect_python_lib_path():
lib_dir = ""
py_version = ""
if sys.platform == "win32":
python_exe_path = shutil.which("python")
if python_exe_path:
lib_dir = os.sep.join(python_exe_path.split(os.sep)[:-1])
if lib_dir.endswith("Scripts") or os.getenv("VIRTUAL_ENV"):
LOG.info(
"Unable to identify the directory containing python3.dll. Set the value for jna.library.path "
"in JAVA_TOOL_OPTIONS environment variable correctly."
)
lib_dir = ""
return lib_dir, py_version
cp = subprocess.run(
["python3-config", "--ldflags", "--embed"],
capture_output=True,
encoding="utf-8",
shell=USE_SHELL,
)
if cp.returncode != 0:
LOG.info("Unable to identify the python lib directory using python3-config")
return lib_dir, py_version
stdout = cp.stdout
if stdout:
tmpA = stdout.split(" ")
if tmpA and len(tmpA) > 2:
if tmpA[0].startswith("-L"):
lib_dir = os.sep.join(tmpA[0].replace("-L", "").split(os.sep)[:-2])
if tmpA[1].startswith("-l"):
py_version = tmpA[1].replace("-l", "")
return lib_dir, py_version
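
`detect_python_lib_path` reads the first two tokens of the `python3-config` output by position. A position-independent variant is sketched below for comparison; it scans every token and takes the first `-lpython*` entry. `parse_ldflags` is a hypothetical helper, and the sample string in the usage note is illustrative rather than captured from a real system.

```python
import shlex


def parse_ldflags(ldflags):
    """Return (lib_dirs, python_lib) parsed from a linker-flags string.

    lib_dirs holds every -L directory in order; python_lib is the
    first -lpython* name with the -l prefix removed, or "" if absent.
    """
    lib_dirs = []
    python_lib = ""
    for token in shlex.split(ldflags):
        if token.startswith("-L") and len(token) > 2:
            lib_dirs.append(token[2:])
        elif token.startswith("-lpython") and not python_lib:
            python_lib = token[2:]
    return lib_dirs, python_lib
```

For an input such as `-L/usr/lib/python3.10/config -L/usr/lib -lpython3.10 -ldl`, this returns both directories and `python3.10`, regardless of how many `-L` flags precede the library name.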


def main():
19 changes: 0 additions & 19 deletions chenpy/graph.py
@@ -16,7 +16,6 @@
DATABASE_PACK_AVAILABLE = False

try:
-import pydotplus
import torch
from torch import Tensor
from torch_geometric.data import Data
@@ -30,24 +29,6 @@
SCIENCE_PACK_AVAILABLE = False


-def convert_dot(data):
-"""
-Function to convert dot data into graph
-"""
-if not data:
-return None
-graph_list = []
-data_list = data
-if isinstance(data, str):
-data_list = [data]
-for d in data_list:
-with tempfile.NamedTemporaryFile(prefix="graph", suffix=".dot") as fp:
-pydotplus.parser.parse_dot_data(d).write(fp.name)
-G = nx.Graph(nx.nx_agraph.read_dot(fp))
-graph_list.append(G)
-return graph_list[0] if len(graph_list) == 1 else graph_list


def _hash_label(label, digest_size):
return calculate_hash(label, digest_size=digest_size)

2 changes: 1 addition & 1 deletion chenpy/utils.py
@@ -14,13 +14,13 @@
import orjson
import pkg_resources
import psutil
+import rich.progress
from packageurl import PackageURL
from packageurl.contrib import purl2url
from psutil._common import bytes2human
from rich.console import Console
from rich.json import JSON
from rich.panel import Panel
-import rich.progress
from rich.progress import Progress
from rich.syntax import Syntax
from rich.table import Table