Merge branch 'main' of github.com:kupl/PyinderArtifact
오원석 authored and 오원석 committed Sep 10, 2024
2 parents cab5eef + 4e22d53 commit 9423187
Showing 12 changed files with 781 additions and 56 deletions.
7 changes: 4 additions & 3 deletions Dockerfile
@@ -1,7 +1,7 @@
FROM ocaml/opam:ubuntu-22.04-ocaml-4.10

USER root
RUN apt-get update && apt-get install -y git python3.9 python3.10 software-properties-common python3-pip
RUN apt-get update && apt-get install -y git python3.10 software-properties-common python3-pip cloc
RUN add-apt-repository ppa:deadsnakes/ppa
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1
ENV HOME /home/opam
@@ -14,7 +14,8 @@ RUN pip3 install --upgrade pip
RUN pip3 install GitPython
RUN (cd Pyinder ; pip3 install -r requirements.txt)

RUN pip3 install pyright==1.1.339 mypy==1.9.0 pytype==2024.4.11 numpy pandas
RUN pip3 install pyright==1.1.339 mypy==1.9.0 pytype==2024.4.11 numpy pandas matplotlib venn
RUN apt-get install -y python3.9

# Set up environment
RUN echo "alias pyinder='PYTHONPATH=${HOME}/Pyinder/..:\$PYTHONPATH python3 -m Pyinder.client.pyre'" >> /home/opam/.bashrc
@@ -28,7 +29,7 @@ RUN eval $(opam config env)
# Copy files
COPY configuration ${HOME}/configuration
COPY run ${HOME}/run
COPY run ${HOME}/eval
COPY eval ${HOME}/eval

RUN chmod +x ${HOME}/run/run.sh
RUN chmod +x ${HOME}/run/run_pyright.sh
276 changes: 248 additions & 28 deletions README.md
@@ -4,12 +4,19 @@

This repository is for the implementation of Pyinder announced in the paper
"Towards Effective Static Type-Error Detection for Python" in ASE 2024.
Our tool, Pyinder, is a static type analysis tool for Python that is built on top of Pyre.

[Towards Effective Static Type-Error Detection for Python](https://drive.google.com/file/d/1t2J4fNyWScao9xwRcORBigKkO8dVJmsB/view?usp=sharing): Accepted Version

## Setup

### Prerequisites

We recommend the following environment for running Pyinder:
- **OS**: A Linux System with [Docker](https://docs.docker.com/get-docker/) support
- **Hardware**: x86 CPU; 32GB RAM; 50GB Storage
- **Hardware**: x86 CPU; 64GB RAM; 50GB Storage

(the minimum RAM requirement has been confirmed to be 32GB)

Before setting up the environment, please make sure that you have [Docker](https://docs.docker.com/get-docker/) installed.

@@ -22,8 +29,8 @@ docker --version

Clone the repository:

```
https://github.com/kupl/PyinderArtifact.git
```bash
git clone https://github.com/kupl/PyinderArtifact.git
cd PyinderArtifact
```

@@ -42,10 +49,10 @@ After building the Docker image, you can run the container:

```bash
# Run docker image
docker run --name pyinder-container --memory-reservation 24G -it pyinder:1.0
docker run --name pyinder-container --memory-reservation 32G -it pyinder:1.0
```

We recommend setting the memory reservation to 24GB for the container to fully run Pyinder.
We recommend setting the memory reservation to 32GB for the container to fully run Pyinder on the large projects.

### Clone Benchmarks and Setting Configuration

@@ -87,37 +94,154 @@ python run/change_core_async.py

Then, you are ready to run Pyinder and other tools!

### Run Tools
## Evaluation

#### Kick the tires
### Kick the tires

You can run all tools with specific project by `-p <project>` option:
Before running experiments on all projects, first run a simple test to check if the tools are working properly.
We provide a simple test with the [luigi](https://github.com/spotify/luigi) project:

```bash
# It will take about 30 minutes
# to run all tools with the airflow project.
# to run all tools with the luigi project.
cd ~
cd run
# Run Pyinder with airflow projects
python pyinder_run.py -p airflow
# Run Mypy with airflow projects
python mypy_run.py -p airflow
# Run Pytype with airflow projects
python pytype_run.py -p airflow
# Run Pyright with airflow projects
python pyright_run.py -p airflow
# Run tools with luigi projects
python pyinder_run.py -p luigi
python mypy_run.py -p luigi
python pytype_run.py -p luigi
python pyright_run.py -p luigi
```

You also run specific version of the project by `-n <number>` option:
---
<details>
<summary>Click to see the output</summary>

You can see the output of each tool in the console like this:

```bash
python pyinder_run.py -p airflow -n 3831
# All tools are the same as above.
# luigi-1836 is analyzed... Finished process in 77.26377391815186 seconds.
# luigi-4 is analyzed... Finished process in 85.76695346832275 seconds.
# luigi-14 is analyzed... Finished process in 74.57044434547424 seconds.
```

Even though you test specific project, you can see the results by [following these steps](#postprocess-and-understanding-the-results).
The result of each tool is stored in `~/result/<each-tool>/<luigi-project>/result.json` (e.g., `~/result/pyinder/luigi-1836/result.json`).
</details>
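
If you want to collect the running times from the console output yourself, the following is a minimal sketch, assuming the log lines have exactly the form shown in the sample output. The names `LINE_RE` and `parse_timings` are hypothetical helpers for illustration and are not part of the artifact's scripts.

```python
import re

# Matches lines such as:
#   "# luigi-1836 is analyzed... Finished process in 77.26377391815186 seconds."
# The line format is an assumption taken from the sample console output.
LINE_RE = re.compile(
    r"(?P<project>[\w.-]+-\d+) is analyzed\.\.\. "
    r"Finished process in (?P<seconds>\d+(?:\.\d+)?) seconds\."
)


def parse_timings(log_text):
    """Return {project: seconds} for every matching line in log_text."""
    return {
        m.group("project"): float(m.group("seconds"))
        for m in LINE_RE.finditer(log_text)
    }


sample = """\
# luigi-1836 is analyzed... Finished process in 77.26377391815186 seconds.
# luigi-4 is analyzed... Finished process in 85.76695346832275 seconds.
# luigi-14 is analyzed... Finished process in 74.57044434547424 seconds.
"""

timings = parse_timings(sample)
for project, seconds in timings.items():
    print(f"{project}: {seconds:.2f}s")
```

Lines that do not match the pattern (for example, failed runs) are simply skipped, so the dictionary only contains projects that finished.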

---
Next, post-process the results to collect the type errors and check the results:

#### Full Evaluation
```bash
cd ~
cd run

# Change the result log file to json file
python pyinder_change_json.py
python mypy_change_json.py
python pytype_change_json.py

# Filter out the type errors from the results
python filter_error.py

# Run cloc to check the per kloc results
cd ~
cd eval
python cloc.py
```

---
<details>
<summary>Click to see the output</summary>

When you run `*_change_json.py`, you can see the output that shows the success on luigi projects:

```bash
airflow-3831 is analyzed... Failed
...
luigi-1836 is analyzed... Done!
...
luigi-4 is analyzed... Done!
luigi-14 is analyzed... Done!
...
sympy-44 is analyzed... Failed
```

After running `python filter_error.py`, you can see the filtered results in `~/result/<each-tool>/<luigi-project>/filter_error.json` (e.g., `~/result/pyinder/luigi-1836/filter_error.json`).

The command `python cloc.py` creates the `~/cloc` directory, which contains the cloc results.
</details>

---
Finally, you can see the results by following these steps:

```bash
cd ~
cd eval
python check_alarm.py # show the number of alarms by each tool
python check_correct.py # show the number of type errors detected by each tool
python check_time.py # show the time taken by each tool
```

---
<details>
<summary>Click to see the output</summary>

> Note: The results can be slightly different from the paper because the tools and [typeshed](https://github.com/python/typeshed) can be updated.

The command `python check_alarm.py` shows the number of alarms by each tool:

```bash
Project Pyinder Mypy Pyre Pytype Pyright
airflow-3831 N/A N/A N/A N/A N/A
luigi-1836 82 85 N/A 0 144
...
luigi-4 104 117 N/A 0 179
luigi-14 79 75 N/A 0 138
...
sympy-44 N/A N/A N/A N/A N/A
Total 265 277 0 0 461
Per 1k LOC 6.73 7.04 N/A 0.0 11.71
```

The command `python check_correct.py` shows the number of type errors detected by each tool:

```bash
Project Pyinder Mypy Pyre Pytype Pyright
airflow-3831 E E E E E
...
luigi-1836 O O E X O
...
luigi-4 X X E X X
luigi-14 O X E X O
...
sympy-44 E E E E E
Correct 2 1 0 0 2
```

The command `python check_time.py` shows the time taken by each tool:

```bash
Project Pyinder Mypy Pyre Pytype Pyright
airflow-3831 N/A N/A N/A N/A N/A
...
luigi-1836 77.26 6.64 N/A 27.55 8.8
...
luigi-4 85.77 4.69 N/A 149.27 10.04
luigi-14 74.57 4.21 N/A 546.52 8.55
...
sympy-44 N/A N/A N/A N/A N/A
Total 237.6 15.54 0 723.34 27.39
Per 1k LOC 6.04 0.39 N/A 18.38 0.7
```
</details>
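
The "Per 1k LOC" rows in the sample outputs appear to be the "Total" row divided by the analyzed code size in thousands of lines. The sketch below illustrates that arithmetic only; it is not the code of `check_alarm.py`. The line count `ASSUMED_TOTAL_LOC` is a hypothetical value back-computed from the sample table, not a measured cloc result, and `per_kloc` is a hypothetical helper.

```python
# Hypothetical total line count for the three luigi snapshots.
# Back-computed from the sample table; NOT a measured cloc value.
ASSUMED_TOTAL_LOC = 39_358


def per_kloc(total, total_loc=ASSUMED_TOTAL_LOC):
    """Normalize a total (alarms or seconds) to a per-1,000-lines figure."""
    return round(total / (total_loc / 1000), 2)


# Totals copied from the sample check_alarm.py output.
alarm_totals = {"Pyinder": 265, "Mypy": 277, "Pyright": 461}
for tool, total in alarm_totals.items():
    print(f"{tool}: {per_kloc(total)} alarms per 1k LOC")
```

In a real run, the line count would come from the cloc results rather than from a constant.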

---
Then, you can see the results in the console or check the csv files in the `~/eval` directory.
- `alarm_result.csv`: the number of alarms by each tool
- `correct_result.csv`: the number of type errors detected by each tool
- `time_result.csv`: the time taken by each tool

### Full Evaluation

You can run all tools with all projects by following the instructions:

@@ -130,23 +254,39 @@ python pytype_run.py
python pyright_run.py
```

- Pyinder: <18 hours
- Mypy: <2 hours
- Pytype: may take more than 3 days...
- Pyright: <4 hours
When we ran full evaluation on our machine (2x Intel(R) Xeon(R) Silver 4214, 128GB), the time was measured as follows:
- Pyinder: about 12 hours
- Mypy: about 2 hours
- Pytype: about 5 days
- Pyright: about 3 hours

Even if you skip specific tools, you can see the results by [following these steps](#postprocess-and-understanding-the-results).

#### Output

The output of each tool is stored in the `~/result/<each-tool>` directory (e.g., `~/result/pyinder/airflow-3831`).

#### Other Options

You can run all tools on a specific project with the `-p <project>` option:

```bash
python pyinder_run.py -p luigi
# Other tools are the same as above.
```

You can also run a specific version of the project with the `-n <number>` option:

```bash
python pyinder_run.py -p luigi -n 1836
# Other tools are the same as above.
```

### Postprocess and Understanding the Results

All tools report warnings other than type errors, so you need to filter the results down to the type errors.
First, convert each tool's result log file to a json file:


```bash
cd ~
cd run
@@ -169,5 +309,85 @@ python filter_error.py

You can see the filtered results in the `~/result/<each-tool>` directory (e.g., `~/result/pyinder/airflow-3831/filter_error.json`).

Before checking the results, you have to run cloc to check the per kloc results:

```bash
cd ~
cd eval
python cloc.py
```

Then, you can see the results by following these steps:

```bash
cd ~
cd eval
python check_alarm.py # show the number of alarms by each tool
python check_correct.py # show the number of type errors detected by each tool
python check_time.py # show the time taken by each tool
```

The csv files are generated in the `~/eval` directory:
- `alarm_result.csv`: the number of alarms by each tool (Table 2)
- `correct_result.csv`: the number of type errors detected by each tool (Figure 4)
- `time_result.csv`: the time taken by each tool (Table 2)
> Note: The results can be slightly different from the paper because the tools and [typeshed](https://github.com/python/typeshed) are updated.

Moreover, you can draw the Venn diagram (Figure 4) of the results with this script:

```bash
# in the eval directory
python draw_venn.py
```

It makes `result_venn.pdf` in the `~/eval` directory, and you can see the venn diagram of the results.

## Install Pyre

Before you install [Pyre](https://github.com/facebook/pyre-check), make sure that Pyinder is not installed: Pyinder is built on top of Pyre, so having both installed causes linking issues.

### Installation

You can install Pyre easily:

```bash
pip install pyre-check
```

You can find detailed instructions in the official documentation ([Installation](https://pyre-check.org/docs/installation/)).

### Run Pyre

You can run Pyre in a similar way to the other tools:

```bash
cd ~
cd run
python pyre_run.py
```

### See the Results

You have to post-process the results in a similar way to the other tools:

```bash
cd ~
cd run
python pyre_change_json.py
python filter_error.py
```

Then, you can see the results:

```bash
cd ~
cd eval
python cloc.py
python check_alarm.py
python check_correct.py
python check_time.py
```

## Contact

## Install Pyre
If you have any questions, please contact us at [wonseok_oh@korea.ac.kr](mailto:wonseok_oh@korea.ac.kr)
4 changes: 2 additions & 2 deletions configuration/config/typebugs/luigi-1836/pytype.cfg
@@ -3,7 +3,7 @@
[pytype]

# Space-separated list of files or directories to process.
inputs = typebugs/luigi-1836/luigi, typebugs/luigi-1836/tests
inputs = typebugs/luigi-1836/luigi, typebugs/luigi-1836/test

# Keep going past errors to analyze as many files as possible.
keep_going = True
@@ -28,4 +28,4 @@ disable =
# Don't report errors.
report_errors = True

strict-primitive-comparisons = True
strict-primitive-comparisons = True