Merge branch 'release/0.18.0'
siebrenf committed Jan 11, 2023
2 parents 6205d50 + 149d79c commit 7451274
Showing 101 changed files with 4,888 additions and 4,819 deletions.
45 changes: 45 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,51 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

## Unreleased

## [0.18.0] - 2023-01-11

### Added

- `gimme scan` and `gimme maelstrom` now accept a random seed for (most) operations
  - for (optimal) deterministic behaviour, delete the cache and then run the command with a seed
- `Scanner` now accepts a `np.random.RandomState` and `progress` on init.
  - `progress=None` (the default) should print progress bars to the command line only, not to file.
- `Scanner.set_genome` now accepts the optional argument `genomes_dir`
- `gimmemotifs.maelstrom.Moap.create` now accepts a `np.random.RandomState`.
- `gimmemotifs.maelstrom.run_maelstrom` now accepts a `np.random.RandomState`.
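
A minimal sketch of why passing a seeded `np.random.RandomState` makes a run repeatable. It uses only NumPy: the helper `sample_background` is hypothetical and stands in for any step that draws random numbers. The keyword under which `Scanner`, `Moap.create` or `run_maelstrom` receive the state is not shown in this changelog, so check their signatures before relying on a particular name.

```python
import numpy as np


def sample_background(n, random_state=None):
    """Hypothetical stand-in for any step that draws random numbers."""
    # Fall back to an unseeded generator when no state is supplied.
    rng = random_state if random_state is not None else np.random.RandomState()
    return rng.randint(0, 1000, size=n)


# Two generators created with the same seed yield identical draws.
first = sample_background(5, np.random.RandomState(42))
second = sample_background(5, np.random.RandomState(42))
assert (first == second).all()
```

Results cached by an earlier, unseeded run would not be affected by the seed, which is presumably why the entry above recommends deleting the cache first.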

### Changed

- `gimme diff` (`diff_plot()` to be exact) will now print to stdout, like all other functions
- now using the logger instead of print/sys.stderr.write in many more places
- string formatting now (mostly) done with f-strings
- refactored Fasta class
- split `scanner.py` into 3 submodules:
  - `scanner/__init__.py` with the exported functions
  - `scanner/base.py` with the Scanner class
  - `scanner/utils.py` with the rest
- `gimmemotifs/maelstrom.py` renamed to `gimmemotifs/maelstrom/__init__.py`
  - `rank.py` and `moap.py` are now submodules of maelstrom.

### Fixed

- `gimme maelstrom` works with or without xgboost (but will give a warning without xgboost)
- fixed warning "in validate_matrix(): Row sums in df are not close to 1. Reormalizing rows..."
- fixed multiprocess.Pool Warnings
- fixed a pandas copywarning (in `gc_bin_bedfile()` to be exact)
- fixed warnings when leaving files open
- fixed deprecation warning in maelstrom (and in tests)
- fixed futurewarning in report.py
- silence warnings from external tools in motif prediction (`pp_predict_motifs()` to be exact)
- updated last references from `Motif.pwm_scan` and `Motif.pwm_scan_all` to `Motif.scan` and `Motif.scan_all` respectively
- typo in `gimme motifs` output ("%matches background" to "% matches background")
- `Scanner` now uses a cheaper method to determine a genome's identity
  - (filesize + name instead of the md5sum of the whole genome's contents)
- `gimme motifs` gives an informative error when `fraction` is not within 0-1.
- `gimme threshold` works again
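
The cheaper genome identity check can be pictured with the sketch below. It is an illustration under assumptions, not the code `Scanner` uses: the function names `full_md5` and `cheap_genome_id` are hypothetical, and hashing the size/name pair is just one way to turn the two values into a key.

```python
import hashlib
import os


def full_md5(path, chunk_size=1 << 20):
    """Hash the entire file: exact, but reads every byte of the genome."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cheap_genome_id(path):
    """Hypothetical fingerprint built from file size and file name only."""
    size = os.path.getsize(path)
    name = os.path.basename(path)
    return hashlib.md5(f"{size}:{name}".encode()).hexdigest()
```

The trade-off is that two different files with the same name and size get the same fingerprint, which a whole-file hash would tell apart.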

### Removed

- removed old python2 code (scanning with MOODS & import shenanigans)

## [0.17.2] - 2022-10-12

61 changes: 28 additions & 33 deletions compile_externals.py
@@ -1,7 +1,7 @@
-import sys
 import os
-from subprocess import Popen, PIPE
+import sys
 from distutils import log
+from subprocess import PIPE, Popen


 def compile_simple(name, src_dir="src"):
@@ -23,37 +23,35 @@ def compile_simple(name, src_dir="src"):
     except Exception:
         return

-    Popen(
-        [gcc, "-o%s" % name, "%s.c" % name, "-lm"], cwd=path, stdout=PIPE
-    ).communicate()
+    Popen([gcc, f"-o{name}", f"{name}.c", "-lm"], cwd=path, stdout=PIPE).communicate()
     if os.path.exists(os.path.join(path, name)):
         return True


-def compile_configmake(name, binary, configure=True, src_dir="src"):
-    path = os.path.join(src_dir, "%s" % name)
-
-    if not os.path.exists(path):
-        return
-
-    if configure:
-        Popen(
-            ["chmod", "+x", "./configure"], cwd=path, stdout=PIPE, stderr=PIPE
-        ).communicate()
-        stdout, stderr = Popen(
-            ["./configure"], cwd=path, stdout=PIPE, stderr=PIPE
-        ).communicate()
-        print(stdout)
-        print(stderr)
-
-    stdout, stderr = Popen(
-        ["make -j 4"], cwd=path, stdout=PIPE, stderr=PIPE, shell=True
-    ).communicate()
-    print(stdout)
-    print(stderr)
-
-    if os.path.exists(os.path.join(path, binary)):
-        return True
+# def compile_configmake(name, binary, configure=True, src_dir="src"):
+#     path = os.path.join(src_dir, str(name))
+#
+#     if not os.path.exists(path):
+#         return
+#
+#     if configure:
+#         Popen(
+#             ["chmod", "+x", "./configure"], cwd=path, stdout=PIPE, stderr=PIPE
+#         ).communicate()
+#         stdout, stderr = Popen(
+#             ["./configure"], cwd=path, stdout=PIPE, stderr=PIPE
+#         ).communicate()
+#         print(stdout)
+#         print(stderr)
+#
+#     stdout, stderr = Popen(
+#         ["make -j 4"], cwd=path, stdout=PIPE, stderr=PIPE, shell=True
+#     ).communicate()
+#     print(stdout)
+#     print(stderr)
+#
+#     if os.path.exists(os.path.join(path, binary)):
+#         return True


 def print_result(result):
@@ -63,10 +61,7 @@ def print_result(result):
         log.info("... ok")


-def compile_all(prefix=None, src_dir="src"):
-    # are we in the conda build environment?
-    conda_build = os.environ.get("CONDA_BUILD")
-
+def compile_all(src_dir="src"):
     sys.stderr.write("compiling BioProspector")
     sys.stderr.flush()
     result = compile_simple("BioProspector", src_dir=src_dir)
10 changes: 5 additions & 5 deletions docs/api.rst
@@ -238,7 +238,7 @@ Now we can use this file for scanning.
f = Fasta("test.fa")
m = motif_from_consensus("TGAsTCA")
-m.pwm_scan(f)
+m.scan(f)
.. code-block:: python
@@ -247,11 +247,11 @@ Now we can use this file for scanning.
This return a dictionary with the sequence names as keys.
The value is a list with positions where the motif matches.
Here, as the AP1 motif is a palindrome, you see matches on both forward and reverse strand.
-This is more clear when we use ``pwm_scan_all()`` that returns position, score and strand for every match.
+This is more clear when we use ``scan_all()`` that returns position, score and strand for every match.

.. code-block:: python
-m.pwm_scan_all(f)
+m.scan_all(f)
.. code-block:: python
@@ -267,7 +267,7 @@ Use ``scan_rc=False`` to only scan the forward orientation.

.. code-block:: python
-m.pwm_scan_all(f, nreport=1, scan_rc=False)
+m.scan_all(f, nreport=1, scan_rc=False)
.. code-block:: python
@@ -295,7 +295,7 @@ If you only want the best match per sequence, is a utility function called ``sca
result = scan_to_best_match("test.fa", motifs)
for motif, matches in result.items():
for match in matches:
-print("{}\t{}\t{}".format(motif, match[1], match[0]))
+print(f"{motif}\t{match[1]}\t{match[0]}")
.. code-block:: python
26 changes: 13 additions & 13 deletions docs/conf.py
@@ -12,13 +12,13 @@

import os
import sys
-import mock

+import mock  # noqa: available on readthedocs

-sys.path.insert(0, os.path.abspath('..'))
+sys.path.insert(0, os.path.abspath(".."))
# c_metrics is created, but rtd can't find it for some reason
# to fix this we do a mock import
-MOCK_MODULES = ['gimmemotifs.c_metrics']
+MOCK_MODULES = ["gimmemotifs.c_metrics"]
for mod_name in MOCK_MODULES:
sys.modules[mod_name] = mock.Mock()

@@ -30,9 +30,9 @@

# -- Project information -----------------------------------------------------

-project = 'GimmeMotifs'
-copyright = '2022, Simon van Heeringen, licensed under CC BY 4.0'
-author = 'Simon van Heeringen, Siebren Frölich, Maarten van der Sande'
+project = "GimmeMotifs"
+copyright = "2022, Simon van Heeringen, licensed under CC BY 4.0"
+author = "Simon van Heeringen, Siebren Frölich, Maarten van der Sande"

# Major, minor and hotfix versions
version = __version__.split("+")[0]
@@ -46,20 +46,20 @@
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
-'sphinx.ext.autodoc', # automatic documentation from docstrings
-'sphinx.ext.coverage', # gather documentation coverage stats
-'sphinx.ext.napoleon', # recognize numpy & google style docstrings
-'sphinx.ext.autosummary', # Create neat summary tables
+"sphinx.ext.autodoc", # automatic documentation from docstrings
+"sphinx.ext.coverage", # gather documentation coverage stats
+"sphinx.ext.napoleon", # recognize numpy & google style docstrings
+"sphinx.ext.autosummary", # Create neat summary tables
# 'numpydoc',
]

# Add any paths that contain templates here, relative to this directory.
-templates_path = ['_templates']
+templates_path = ["_templates"]

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
-exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
+exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]


# -- Options for HTML output -------------------------------------------------
@@ -72,7 +72,7 @@
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ['_static']
+html_static_path = ["_static"]


# -- Extension configuration -------------------------------------------------
12 changes: 6 additions & 6 deletions docs/notebooks/api_examples.ipynb
@@ -387,7 +387,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"To convert a motif to an image, use `to_img()`. Supported formats are png, ps and pdf."
+"To convert a motif to an image, use `plot_logo()`. Supported formats are png, ps and pdf."
]
},
{
@@ -397,7 +397,7 @@
"outputs": [],
"source": [
"m = motif_from_consensus(\"NTGASTCAN\")\n",
-"m.to_img(\"ap1.png\", fmt=\"png\")"
+"m.plot_logo(\"ap1.png\", fmt=\"png\")"
]
},
{
@@ -465,14 +465,14 @@
"f = Fasta(\"test.small.fa\")\n",
"m = motif_from_consensus(\"TGAsTCA\")\n",
"\n",
-"m.pwm_scan(f)"
+"m.scan(f)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
-"This return a dictionary with the sequence names as keys. The value is a list with positions where the motif matches. Here, as the AP1 motif is a palindrome, you see matches on both forward and reverse strand. This is more clear when we use `pwm_scan_all()` that returns position, score and strand for every match."
+"This return a dictionary with the sequence names as keys. The value is a list with positions where the motif matches. Here, as the AP1 motif is a palindrome, you see matches on both forward and reverse strand. This is more clear when we use `scan_all()` that returns position, score and strand for every match."
]
},
{
@@ -497,7 +497,7 @@
}
],
"source": [
-"m.pwm_scan_all(f)"
+"m.scan_all(f)"
]
},
{
@@ -526,7 +526,7 @@
}
],
"source": [
-"m.pwm_scan_all(f, nreport=1, scan_rc=False)"
+"m.scan_all(f, nreport=1, scan_rc=False)"
]
},
{
56 changes: 38 additions & 18 deletions docs/release_checklist.md
@@ -4,26 +4,39 @@ This is mainly for personal use at the moment.

##

1. Create release candidate with `git flow`.

```
$ git flow release start ${new_version}
1. Make sure all tests pass.

```shell
mamba env update -f environment.yml
pytest -vvv
```

2. Update version in `gimmemotifs/config.py`
2. Create release candidate with `git flow`:

```shell
new_version=0.0.0
echo ${new_version}

git flow release start ${new_version}
```

3. Make sure `CHANGELOG.md` is up-to-date.

* add the new version & date to the header
* link to the diff in the footer
* add & commit the changes, but do not push

4. Test install using pip in fresh conda environment

```shell
python setup.py sdist
mamba create -n test python=3.9 pytest
mamba activate test
pip install dist/gimmemotifs*.tar.gz
pytest -vvv
```
$ cd ${test_dir} # Not the gimmemotifs git directory
$ conda create -n testenv python=3 --file requirements.yaml
$ conda activate testenv
$ pip install -e git+https://github.com/simonvh/gimmemotifs.git@release/${version}#egg=gimmemotifs
$ cd src/gimmemotifs
$ python run_tests.py
```

5. Upload to pypi testing server

```
@@ -37,20 +50,27 @@ $ twine upload -r testpypi dist/gimmemotifs-${version}.tar.gz

6. Finish release

```shell
git flow release finish ${new_version}
```
$ git flow release finish ${version}
```

7. Upload to PyPi.

7. Push everything to github, including tags:

```shell
git push --follow-tags origin develop master
```
$ python setup.py sdist
$ twine upload dist/gimmemotifs-${version}.tar.gz

8. Upload to PyPi.

```shell
python setup.py sdist
twine upload dist/gimmemotifs-${new_version}.tar.gz
```

8. Finalize the release on Github.
9. Finalize the release on Github.

Create a release. Download the tarball and then edit the release and attach the
tarball as binary.

8. Create bioconda package
10. Update the Bioconda package recipe