Missing scipy module #11

Merged
merged 3 commits into from
Oct 9, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -162,3 +162,5 @@ cython_debug/
#.idea/
venv/Lib/site-packages
venv
*.xlsx
*.xlsx.png
20 changes: 7 additions & 13 deletions README.md
@@ -8,7 +8,6 @@ Tool to analyze and visualize dependencies between cells in Excel spreadsheets i

Will generate a graph of the dependencies between cells in an Excel spreadsheet. Data extracted with `openpyxl` (<https://foss.heptapod.net/openpyxl/openpyxl>), the graph is generated with the `networkx` library (<https://networkx.org/>) and is visualized using `matplotlib`.

This is a simple tool and maybe even naïve in its approach - it was hacked together in two evenings and would benefit from some refactoring and more features. It is meant as a starting point for further development.
<br clear="right"/>

## Definitions
@@ -39,10 +38,6 @@ graph TD

```

The way the graph is built is by iterating over all cells in the spreadsheet and extracting the references in the formula of each cell. The references are then added as edges in the graph.

A cell within a range is considered a dependency of the range itself, but not of the other cells in the range.
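
As a rough illustration of the approach described above, here is a minimal, standard-library-only sketch. It is not the project's actual code: the regex, the function names, and the plain-dict graph are all assumptions made for this example, and the input is a hypothetical mapping of cell names to formula strings rather than a workbook loaded with `openpyxl`. Edges are assumed to point from a formula cell to whatever it references.

```python
import re
from collections import defaultdict

# Simplified assumption: A1-style references only, no sheet names and no
# absolute ($) markers. Function names ending in digits (e.g. LOG10) would
# be mistaken for cells by this pattern.
REF_PATTERN = re.compile(r"\b([A-Z]{1,3}[0-9]+)(?::([A-Z]{1,3}[0-9]+))?\b")


def column_to_index(col):
    """Convert a column label such as "A" or "AB" to a 1-based index."""
    index = 0
    for char in col:
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index


def index_to_column(index):
    """Convert a 1-based column index back to its label."""
    label = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


def split_ref(ref):
    """Split "B12" into its column label and row number: ("B", 12)."""
    match = re.fullmatch(r"([A-Z]+)([0-9]+)", ref)
    return match.group(1), int(match.group(2))


def cells_in_range(start, end):
    """List every cell name inside the rectangle spanned by start and end."""
    start_col, start_row = split_ref(start)
    end_col, end_row = split_ref(end)
    columns = range(column_to_index(start_col), column_to_index(end_col) + 1)
    rows = range(start_row, end_row + 1)
    return [f"{index_to_column(c)}{r}" for c in columns for r in rows]


def build_dependency_graph(formulas):
    """Map each node to the set of nodes it depends on.

    A range such as "A1:A3" becomes its own node. The cells inside it are
    dependencies of the range node, but not of each other.
    """
    graph = defaultdict(set)
    for cell, formula in formulas.items():
        for start, end in REF_PATTERN.findall(formula):
            if end:
                range_name = f"{start}:{end}"
                graph[cell].add(range_name)
                graph[range_name].update(cells_in_range(start, end))
            else:
                graph[cell].add(start)
    return dict(graph)


# Hypothetical input: cell name -> formula text.
example = {"C1": "=SUM(A1:A3)+B1", "B1": "=A1*2"}
for node, dependencies in sorted(build_dependency_graph(example).items()):
    print(node, "->", sorted(dependencies))
```

In this sketch `C1` depends on the range node `A1:A3` and on `B1`, while `A1`, `A2` and `A3` hang off the range node only, which mirrors the rule stated above.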

## Installation from pypi package

PyPi project: [graphedexcel](https://pypi.org/project/graphedexcel/)
@@ -66,13 +61,11 @@ pip install -e .
python -m graphedexcel <path_to_excel_file> [--verbose] [--no-visualize] [--keep-direction] [--open-image]
```

Depending on the size of the spreadsheet you might want to adjust the plot configuration in the code to make the graph more readable (remove labels, decrease widths and sizes, etc.)

In [graph_visualizer.py](src/graph_visualizer.py) you will find three configurations for small, medium and large graphs. You can adjust the configuration to your needs.
Depending on the size of the spreadsheet you might want to adjust the plot configuration in the code to make the graph more readable (remove labels, decrease widths and sizes, etc.). You can find the configuration in [graph_visualizer.py](src/graphedexcel/graph_visualizer.py), with settings for small, medium and large graphs. You can adjust the configuration to your needs, but this only works if you run from source.

### Arguments

`--verbose` will dump formula cell contents during (more quiet)
`--verbose` will dump formula cell contents during processing (noisier)

`--no-visualize` will skip the visualization step and only print the summary (faster)

@@ -82,7 +75,7 @@ In [graph_visualizer.py](src/graph_visualizer.py) you will find three configurat

## Sample output

The following is the output of running the script on the provided `docs/Book1.xlsx` file.
The following is the output of running the script on the sample `docs/Book1.xlsx` file.

```bash
=== Dependency Graph Summary ===
@@ -114,14 +107,16 @@ Graph visualization saved to images/.\Book1.xlsx.png

## Sample plot

More in `/images` folder.
More in `docs/images` folder.

![Sample graph](docs/images/simplified_1.xlsx5.png)

## Tests

Just run pytest in the root folder.

```bash
pytest test_cell_reference_extraction.py
pytest
```

## Contribute
@@ -136,4 +131,3 @@ You can help with the following, that I have thought of so far:
- Improve the visualization and the ease of configuration
- Add more examples
- Add more documentation
- Package the script for easier installation and use with PyPi
7 changes: 6 additions & 1 deletion pyproject.toml
@@ -11,7 +11,12 @@ classifiers = [
"Operating System :: OS Independent",
]

dependencies = ["networkx>=3.3", "openpyxl>=3.1", "matplotlib>=3.9"]
dependencies = [
"networkx>=3.3",
"openpyxl>=3.1",
"matplotlib>=3.9",
"scipy>=1.14",
]

[project.optional-dependencies]
test = ["black==21.9b0", "pytest==8.3"]
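
To check whether an environment actually provides the runtime modules listed under `dependencies` above (including the newly added `scipy`), a small script along these lines may help. This is only a sketch: the helper name `missing_modules` is made up for this example, and it assumes each distribution's import name matches the name in the list, which holds for these four packages but not for every package.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names from module_names that cannot be located for import."""
    return [name for name in module_names if find_spec(name) is None]


# Import names taken from the dependencies list in pyproject.toml.
required = ["networkx", "openpyxl", "matplotlib", "scipy"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

Note that `find_spec` only confirms a module can be found. It does not check the version constraints such as `scipy>=1.14`.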