WIP: Packageify & Migrate to Py3 #17

Open
wants to merge 73 commits into
base: master
855934c
rm old releases
Dec 12, 2024
c1a2018
create changelog
Dec 12, 2024
2bc0e63
rm old changelog
Dec 12, 2024
443c19d
init env yaml
Dec 12, 2024
d4950b4
init pyproject toml
Dec 12, 2024
53d4595
update image paths
Dec 12, 2024
6d90354
mv docs
Dec 12, 2024
d89596e
mv images to docs
Dec 12, 2024
316cf76
mv images to docs
Dec 12, 2024
45e2d95
mv old docs
Dec 12, 2024
f0c8765
mv test data to tests
Dec 12, 2024
61727dd
add init
Dec 12, 2024
ac30fb0
migrate main module to py3
Dec 12, 2024
f7885d5
Ignore dynamic _version
Adamtaranto Dec 17, 2024
d951e55
Limit to Py 3.12.8 until next Biopython release
Adamtaranto Dec 17, 2024
7b7cfb3
Manage build with Hatch
Adamtaranto Dec 17, 2024
67a056d
pytest config
Adamtaranto Dec 17, 2024
3e8770a
Update for new args
Adamtaranto Dec 17, 2024
9c31e9c
init automated pytest action
Adamtaranto Dec 17, 2024
7d3afde
init auto ruff formatting
Adamtaranto Dec 17, 2024
52253a8
Archived code for testing
Adamtaranto Dec 17, 2024
6bacb46
update commands
Adamtaranto Dec 17, 2024
0993d4a
init empty test file
Adamtaranto Dec 17, 2024
2c4c6bb
use logging module
Adamtaranto Dec 17, 2024
f57101e
Attempt to modularize
Adamtaranto Dec 17, 2024
97b6c4b
Fix non-LF line endings on test data
Adamtaranto Dec 17, 2024
88c4e76
New entry point and arg handling!
Adamtaranto Dec 17, 2024
8994f60
Style fixes by Ruff
Adamtaranto Dec 17, 2024
e3136a4
archive unreleased flexidot v2.0.1
Adamtaranto Dec 17, 2024
3131d4a
Add plotting module
Adamtaranto Dec 17, 2024
1fcdaa2
rm current release link
Adamtaranto Dec 17, 2024
3c9237e
rm bool arg logging
Adamtaranto Dec 17, 2024
2bbb04a
Fix unicode str encoding and rm logging bool arg
Adamtaranto Dec 17, 2024
470181b
Add TODO notes
Adamtaranto Jan 6, 2025
1197289
linting
Adamtaranto Jan 6, 2025
b41c69c
direct test if true
Adamtaranto Jan 8, 2025
1bb665a
simplify example cases
Adamtaranto Jan 8, 2025
c42f393
increase log source detail
Adamtaranto Jan 8, 2025
abdc3ca
Add arg summary + rm timers
Adamtaranto Jan 8, 2025
284dc63
rm erroneous logging arg
Adamtaranto Jan 8, 2025
107db81
Add input check functions
Adamtaranto Jan 8, 2025
03e9d0c
rm time track
Adamtaranto Jan 8, 2025
064aa6a
Fix colormap selection bugs
Adamtaranto Jan 8, 2025
5c3bb1f
mv argparser to utils
Adamtaranto Jan 13, 2025
a0e4158
mv parser + user to set max_n
Adamtaranto Jan 13, 2025
f722561
clean prints
Adamtaranto Jan 13, 2025
46662ca
add max_n report
Adamtaranto Jan 13, 2025
827d278
Improve error handling when loading or combining fasta input
Adamtaranto Jan 13, 2025
b2f175b
streamline match finding, fix double counting on self-self
Adamtaranto Jan 13, 2025
9dec42d
basic kmer matching tests
Adamtaranto Jan 13, 2025
6aa5616
test set for TIR plotting
Adamtaranto Jan 13, 2025
8edc344
change max_n defaults
Adamtaranto Jan 13, 2025
deb4e20
time track returns delta
Adamtaranto Jan 13, 2025
0a06f70
fix norev
Adamtaranto Jan 13, 2025
448f6c1
Add gap
Adamtaranto Jan 13, 2025
35d2b16
update logging in read_seq
Adamtaranto Jan 13, 2025
72dc493
Standardise logging
Adamtaranto Jan 13, 2025
471bbd7
Tidy up logging
Adamtaranto Jan 13, 2025
021f135
cleanup temp degap or concat fasta files
Adamtaranto Jan 13, 2025
aed7300
seq for test: multi file input, gap, skip N, and ambiguity
Adamtaranto Jan 13, 2025
3ad34f0
unused test file
Adamtaranto Jan 13, 2025
97ec908
rename test data
Adamtaranto Jan 13, 2025
43abbbe
init concat check
Adamtaranto Jan 13, 2025
00c9e8b
fix msg fmt
Adamtaranto Jan 13, 2025
ee15a83
Validate all examples
Adamtaranto Jan 13, 2025
5d694c4
Add example data files
Adamtaranto Jan 13, 2025
dfe983f
link to testing data
Adamtaranto Jan 13, 2025
74aca53
Style fixes by Ruff
Adamtaranto Jan 13, 2025
b954e4c
Add note only_vs_first_seq
Adamtaranto Jan 13, 2025
f940f7a
v2.0.0 changelog draft entry
Adamtaranto Jan 13, 2025
7acac34
Check outdir exists
Adamtaranto Jan 13, 2025
0ae0395
Add outdir opt
Adamtaranto Jan 13, 2025
2fe1674
Add citation message on completion.
Adamtaranto Jan 13, 2025
37 changes: 37 additions & 0 deletions .github/workflows/pytest.yml
@@ -0,0 +1,37 @@
name: Python Tests

on: [pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]

    steps:
      # Checkout the latest commit associated with the PR
      - uses: actions/checkout@v4

      - name: Debug matrix value
        run: echo "Python version is ${{ matrix.python-version }}"

      # Set up Miniconda
      - name: Set up Miniconda
        uses: conda-incubator/setup-miniconda@v2
        with:
          auto-update-conda: true # Optional: update Conda to the latest version
          python-version: ${{ matrix.python-version }}

      # Install any additional dependencies not included in the pyproject.toml file
      - name: Install additional dependencies
        run: |
          pip install '.[tests]' # Install all dependencies, including test-specific ones
        shell: bash -l {0}

      # Run pytest on the specified directory
      - name: Run tests
        run: |
          pytest tests
        shell: bash -l {0}
23 changes: 23 additions & 0 deletions .github/workflows/ruff.yml
@@ -0,0 +1,23 @@
name: Ruff Formatting
on: [pull_request]
jobs:
  ruff:
    if: ${{ github.actor != 'dependabot[bot]' }} # Do not run on commits created by dependabot
    runs-on: ubuntu-latest
    permissions:
      # Give the default GITHUB_TOKEN write permission to commit and push the changed files.
      contents: write # Allows reading and writing repository contents (e.g., commits)
      pull-requests: write # Allows reading and writing pull requests
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.sha }}
          token: ${{ secrets.GITHUB_TOKEN }}
      - uses: chartboost/ruff-action@v1
        with:
          src: './src/flexidot'
          args: 'format --target-version py310'
      - uses: stefanzweifel/git-auto-commit-action@v5
        id: auto-commit-action
        with:
          commit_message: 'Style fixes by Ruff'
70 changes: 70 additions & 0 deletions .gitignore
@@ -0,0 +1,70 @@
# Mac stuff
.DS_Store

# Versioning
src/flexidot/_version.py

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# mypy
.mypy_cache/
35 changes: 34 additions & 1 deletion code/README.md → CHANGELOG.md
@@ -1,6 +1,39 @@
# FlexiDot version changes

![alt text](https://github.com/molbio-dresden/flexidot/blob/master/images/Selfdotplots_banner4.png "FlexiDot self dotplots")
![alt text](https://github.com/molbio-dresden/flexidot/blob/master/docs/images/Selfdotplots_banner4.png "FlexiDot self dotplots")

## Version 2.0.0
*Jan 2025*

This release is a major refactor of the FlexiDot codebase that migrates it to Python 3 and a modern package structure.

**[Faster run time]:**
- When a sequence is compared to an identical sequence, `find_match_pos_diag()` recycles the kmer counts from the first sequence, which saves 33% of the run time.
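
The recycling idea in the entry above can be illustrated with a minimal sketch. This is not the code of `find_match_pos_diag()`; the helper names `index_kmers` and `find_forward_matches` are hypothetical, and the sketch only covers forward-strand matches, on the assumption that matching is driven by a per-sequence kmer position index.

```python
from collections import defaultdict


def index_kmers(seq, k):
    """Map each kmer in seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def find_forward_matches(seq_one, seq_two, k):
    """Return sorted (x, y) start positions of kmers shared by both sequences."""
    index_one = index_kmers(seq_one, k)
    # Hypothetical shortcut: when both inputs are identical, reuse the first
    # index instead of scanning the second sequence again.
    index_two = index_one if seq_one == seq_two else index_kmers(seq_two, k)
    matches = []
    for kmer, positions_one in index_one.items():
        for x in positions_one:
            for y in index_two.get(kmer, ()):
                matches.append((x, y))
    return sorted(matches)
```

In a self-comparison the second indexing pass is skipped entirely, which is where a saving of this kind would come from; the exact figure depends on how many passes the real function makes.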

**[New features]:**

- FlexiDot and its dependencies are now pip installable (uses Hatch and pyproject.toml)
- Versioning is now managed dynamically using git tags
- Some basic tests for the core `find_match_pos_diag()` function have been added
- Command-line options are now managed with argparse
- Repo includes an env yaml to set up a conda env for FlexiDot
- Check that input files exist
- Auto cleanup of temp files
- Add action to run pytest
- Add action to format code with Ruff
- Uses the logging module to manage status logging (removed time logging)
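
As a rough illustration of how argparse, an input-file check and the logging module can fit together, here is a minimal sketch. The program name `demo` and the flags `--infiles` and `--loglevel` are hypothetical placeholders, not FlexiDot's actual options (see `--help` for those).

```python
import argparse
import logging
import os


def existing_file(path):
    """argparse type: accept the path only if it points to an existing file."""
    if not os.path.isfile(path):
        raise argparse.ArgumentTypeError(f"input file not found: {path}")
    return path


def build_parser():
    # 'demo', '--infiles' and '--loglevel' are illustrative names only.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--infiles", nargs="+", type=existing_file, required=True)
    parser.add_argument(
        "--loglevel",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
    )
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Status messages go through logging rather than print().
    logging.basicConfig(
        level=getattr(logging, args.loglevel),
        format="%(asctime)s | %(levelname)s | %(message)s",
    )
    logging.info("Found %d input file(s)", len(args.infiles))
    return args
```

Validating paths inside the parser means a missing file is reported as a usage error before any work starts.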

**[Changed defaults]:**
- Several command-line options have been renamed or now expect differently formatted input. See `--help`.
- If the `--wobble_conversion` option is not set, kmers containing any Ns are skipped by default.
- If `--wobble_conversion` is set, `--max_n` determines the maximum percentage of Ns that is tolerated in a kmer. The default has changed from a hard-coded 49% to 10%.
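
The N-handling rules above can be sketched as a small filter. The function name `kmer_passes_n_filter` is hypothetical, and treating the `max_n` threshold as inclusive is an assumption; the real implementation may differ on both points.

```python
def kmer_passes_n_filter(kmer, wobble_conversion=False, max_n=10.0):
    """Decide whether a kmer is kept, based on its N content.

    Without wobble conversion, any N rejects the kmer. With wobble
    conversion, the kmer is kept if its percentage of Ns does not exceed
    max_n (assumed inclusive here).
    """
    if not kmer:
        return False
    n_count = kmer.upper().count("N")
    if not wobble_conversion:
        return n_count == 0
    percent_n = 100.0 * n_count / len(kmer)
    return percent_n <= max_n
```

With the 10% default, a 10-mer may carry at most one N when wobble conversion is enabled.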

**[Bugfixes]:**
- Fix deprecation issue with numpy creating an ndarray from ragged nested sequences in `find_match_pos_diag()`. Closes issue #15
- Read files in text mode (`r`) instead of binary mode (`rb`)
- Fix unicode issue referenced in #10
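
To show why the file mode matters in Python 3: a file opened with `rb` yields `bytes`, so a string test such as `line.startswith(">")` raises a `TypeError`, whereas text mode yields `str`. The sketch below uses a hypothetical helper, `read_fasta_headers`, and assumes UTF-8 input; it is not taken from the FlexiDot source.

```python
def read_fasta_headers(path):
    """Return the header lines of a FASTA file, without the leading '>'."""
    # Text mode ("r") decodes each line to str, so str methods work as expected.
    with open(path, "r", encoding="utf-8") as handle:
        return [line[1:].strip() for line in handle if line.startswith(">")]
```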

<br>

## Version 1.06
*14.04.2019*