FAIL

FAIL is an LLM-based pipeline that collects and analyzes software failures reported in the news. The paper is available at https://arxiv.org/abs/2406.08221.

A database of software failures from 2010 to 2022 curated by FAIL is available at: http://ec2-44-222-209-185.compute-1.amazonaws.com:8000/dashboard/

Unfortunately, the website is currently broken, and we are working on fixing it. In the meantime, our raw database is available as a JSON dump at: https://drive.google.com/file/d/1Ajwxj2PKVunTJHT98naD9s7lc6kmptHu/view
You can also view our database as a SQL dump at
The JSON data contains a list of articles and incidents. Each incident has a generated postmortem report, as described in Table 1 of the paper. Please note that values in the reports are occasionally missing (N/A). An analysis of this data is described in Section 4 of the paper.
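As a quick sanity check on a downloaded dump, the sketch below reports the top-level shape of the file and counts missing values. It makes two assumptions that are not confirmed by this README: that missing values are stored as the literal string "N/A", and that the dump is a JSON object or list at the top level. The function names and the file name in the usage note are hypothetical.

```python
import json
from pathlib import Path


def count_na(node, marker="N/A"):
    """Recursively count values equal to `marker` in parsed JSON data."""
    if isinstance(node, dict):
        return sum(count_na(value, marker) for value in node.values())
    if isinstance(node, list):
        return sum(count_na(item, marker) for item in node)
    # Leaf value: count it only if it is exactly the marker string.
    return 1 if node == marker else 0


def summarize_dump(path):
    """Return (top-level shape, number of N/A values) for a JSON dump."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # Map each top-level key to the size of the collection under it.
        shape = {
            key: len(value) if isinstance(value, (list, dict)) else 1
            for key, value in data.items()
        }
    else:
        # Assumption: anything that is not an object is a top-level list.
        shape = {"<top-level list>": len(data)}
    return shape, count_na(data)
```

For example, `summarize_dump("fail_db.json")` (a hypothetical file name) returns the shape and the count without assuming any particular field names in the dump.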

About

This repository contains all of the artifacts of the project.

Our artifact includes the following (ordered by sections):

| Item | Description | Corresponding content in the paper |
| --- | --- | --- |
| Source code of FAIL | Source code of the pipeline to automatically analyze failures from the news | Section 3 - FAIL: Design and implementation |
| Manual Analysis and Evaluation | Manual analysis of incidents and evaluation of the pipeline | Section 3.2 - Component Development and Validation |
| Results | Source code to plot figures and gather other results from the paper | Section 4 - Analyzing the FAIL DB |

Citations

For citations, please refer to the following:

APA Citation: Anandayuvaraj, D., Campbell, M., Tewari, A., & Davis, J. C. (2024). FAIL: Analyzing Software Failures from the News Using LLMs. Proceedings of the 2024 IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE.

BibTeX Citation:

@inproceedings{ananday2024fail,
  title={FAIL: Analyzing Software Failures from the News Using LLMs},
  author={Anandayuvaraj, Dharun and Campbell, Matthew and Tewari, Arav and Davis, James C},
  booktitle={Proceedings of the 2024 IEEE/ACM International Conference on Automated Software Engineering (ASE)},
  year={2024},
  organization={IEEE}
}

Setting up the pipeline

Setting up the docker containers for the pipeline

Build and spin up the container:

$ docker compose -f local.yml up --build -d

Apply the migrations in the container:

$ docker compose -f local.yml run --rm django python manage.py makemigrations
$ docker compose -f local.yml run --rm django python manage.py migrate

Make sure the container is running:
```
$ docker compose -f local.yml up
```

Tip

Note: To log stdout/stderr for a pipeline command that runs detached, follow these steps:

Run the command:

$ docker compose -f local.yml run -e OPENAI_API_KEY -d --name failures_run_command django python -m failures classify

Wait for the command to finish running, then export the log:

$ docker logs failures_run_command >> failures_run_command.log 2>&1

Then remove the stopped container:
```
$ docker rm failures_run_command
```
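The three steps in the tip above can be scripted. The following is a minimal sketch, not part of the repository: it reuses the exact commands from the tip and adds `docker wait` so the script blocks until the detached container exits instead of waiting by hand. The function names are hypothetical, and the sketch assumes `OPENAI_API_KEY` is already set in the calling environment.

```python
import subprocess

COMPOSE = ["docker", "compose", "-f", "local.yml"]


def detached_run_steps(subcommand, name="failures_run_command"):
    """Return the docker commands for a logged, detached pipeline run."""
    return [
        # Start the pipeline command in a detached, named container.
        COMPOSE + ["run", "-e", "OPENAI_API_KEY", "-d", "--name", name,
                   "django", "python", "-m", "failures", subcommand],
        # Block until the container exits.
        ["docker", "wait", name],
        # Print everything the container wrote to stdout/stderr.
        ["docker", "logs", name],
        # Remove the stopped container.
        ["docker", "rm", name],
    ]


def run_detached_with_log(subcommand, name="failures_run_command"):
    """Run the steps in order and append the container output to <name>.log."""
    start, wait, logs, remove = detached_run_steps(subcommand, name)
    subprocess.run(start, check=True)
    subprocess.run(wait, check=True)
    # Append both streams to the log file, like `>> file 2>&1` in the shell.
    with open(f"{name}.log", "ab") as log_file:
        subprocess.run(logs, check=True, stdout=log_file,
                       stderr=subprocess.STDOUT)
    subprocess.run(remove, check=True)
```

Calling `run_detached_with_log("classifyfailure")` would then leave the output in `failures_run_command.log`. Because `check=True` raises on the first failing command, a failed run leaves the container in place for inspection.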

Admin Site

Create an admin account:

$ docker compose -f local.yml run --rm django python manage.py createsuperuser

Access the site administration page at /admin/

Running the pipeline

Step 1. Searching and scraping news articles

Option 1. Using the Admin Site and the Command Line

Navigate to the /admin/ page and log in.
Click on Search Queries underneath the Articles section.
Click on ADD SEARCH QUERY +.
Enter a search query and click SAVE.

Run a scrape from the command line:

$ docker compose -f local.yml run --rm django python -m failures scrape

Option 2. Using Only the Command Line

Display the help text:

$ docker compose -f local.yml run --rm django python -m failures scrape --help

Run a scrape:

$ docker compose -f local.yml run --rm django python -m failures scrape --keyword "keyword"
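To scrape several keywords in one go, the command above can be generated once per keyword. This is only a sketch: the keywords shown in the usage note are hypothetical examples, and since this README does not say whether `--keyword` accepts more than one value, the sketch issues a separate scrape for each keyword.

```python
import shlex
import subprocess

SCRAPE = ["docker", "compose", "-f", "local.yml", "run", "--rm",
          "django", "python", "-m", "failures", "scrape"]


def scrape_commands(keywords):
    """Build one scrape command (as an argument list) per keyword."""
    return [SCRAPE + ["--keyword", keyword] for keyword in keywords]


def run_scrapes(keywords, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for command in scrape_commands(keywords):
        # shlex.join quotes arguments that contain spaces.
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

For example, `run_scrapes(["outage", "software bug"])` prints the two commands without running them, which makes it easy to review the list before passing `dry_run=False`.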

Tip

To scrape news for specific years, sources, or keywords, modify and run tests/commands/exp_RunQueriesCommand.py.

Step 2. Classifying articles on whether they report on software failures

Display the help text:

$ docker compose -f local.yml run --rm django python -m failures classifyfailure --help

Classify articles:

$ docker compose -f local.yml run -e OPENAI_API_KEY --rm django python -m failures classifyfailure

Step 3. Classifying articles on whether they contain enough information for failure analysis

Display the help text:

$ docker compose -f local.yml run --rm django python -m failures classifyanalyzable --help

Classify articles:

$ docker compose -f local.yml run -e OPENAI_API_KEY --rm django python -m failures classifyanalyzable

Step 4. Merge articles reporting on the same failure into incidents

Display the help text:

$ docker compose -f local.yml run --rm django python -m failures merge --help

Merge articles:

$ docker compose -f local.yml run -e OPENAI_API_KEY --rm django python -m failures merge

Step 6. Create a postmortem report for each incident

Display the help text:

$ docker compose -f local.yml run --rm django python -m failures postmortemincidentautovdb --help

Create postmortem reports:

$ docker compose -f local.yml run -e OPENAI_API_KEY --rm django python -m failures postmortemincidentautovdb

Note

Step 5 (RAG) from the paper is within this command.
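The steps above can also be chained from a single script. The sketch below is not part of the repository: it simply replays the documented commands in order and passes `OPENAI_API_KEY` only to the steps whose commands include it above. The names `PIPELINE`, `pipeline_commands`, and `run_pipeline` are hypothetical.

```python
import subprocess

# (subcommand, needs OPENAI_API_KEY), in the order of Steps 1 to 6.
PIPELINE = [
    ("scrape", False),
    ("classifyfailure", True),
    ("classifyanalyzable", True),
    ("merge", True),
    ("postmortemincidentautovdb", True),  # Step 5 (RAG) runs inside this one.
]


def pipeline_commands():
    """Return the docker compose command for every step, in order."""
    commands = []
    for subcommand, needs_key in PIPELINE:
        command = ["docker", "compose", "-f", "local.yml", "run"]
        if needs_key:
            # Forward the key from the calling environment to the container.
            command += ["-e", "OPENAI_API_KEY"]
        command += ["--rm", "django", "python", "-m", "failures", subcommand]
        commands.append(command)
    return commands


def run_pipeline():
    """Run every step; check=True stops at the first failing step."""
    for command in pipeline_commands():
        subprocess.run(command, check=True)
```

Stopping at the first failure is a deliberate choice here, since each step consumes the output of the previous one.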

Plot figures and gather statistics used in the paper

Display the help text:

$ docker compose -f local.yml run --rm django python -m failures results --help

Plot figures and gather statistics:

$ docker compose -f local.yml run -e OPENAI_API_KEY --rm django python -m failures results

Manual analysis and evaluation

Step 2 to Step 4: Manual analysis and evaluation: results/manual_analysis/Step2_to_Step4.xlsx

Step 6: Manual analysis and evaluation: results/manual_analysis/Step6.xlsx

To evaluate the steps of the pipeline automatically, follow tests/README.rst
