Releases: zellerlab/GECCO

0.4.2

11 Jan 15:59
v0.4.2

Fixed

  • TypeClassifier.predict_types using inverse type probabilities when
    given several clusters to process.

0.4.1

11 Jan 15:59
v0.4.1

Fixed

  • gecco run command crashing on input sequences not containing any genes.

0.4.0

11 Jan 15:59
v0.4.0

Added

  • gecco.model.ProductType enum to model the biosynthetic class of a BGC.
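
As a rough illustration of modelling a biosynthetic class with an enum, here is a minimal sketch. The member names and the `parse_product_type` helper are assumptions made for this example and may not match the actual gecco.model.ProductType definition.

```python
import enum


class ProductType(enum.Enum):
    """Hypothetical stand-in for gecco.model.ProductType."""

    # Member names are illustrative assumptions, not the library's list.
    POLYKETIDE = "Polyketide"
    TERPENE = "Terpene"
    UNKNOWN = "Unknown"


def parse_product_type(label: str) -> ProductType:
    """Map a text label to a member, falling back to UNKNOWN (hypothetical helper)."""
    try:
        return ProductType(label)
    except ValueError:
        return ProductType.UNKNOWN
```

An enum keeps the set of allowed classes closed, so a misspelled label is caught at parse time instead of propagating as a free-form string.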

Removed

  • pandas interaction from internal data model.
  • ClusterCRF code specific to cross-validation.

Changed

  • pandas, fisher and statsmodels dependencies are now optional.
  • gecco train command expects a cluster table in addition to the feature
    table to know the types of the input BGCs.

0.3.0

11 Jan 15:59
v0.3.0

Changed

  • Replaced Nearest-Neighbours classifier with Random Forest to perform type
    prediction for candidate BGCs.
  • Renamed the gecco.knn module to the implementation-agnostic name gecco.types.

Fixed

  • Extraction of domain composition taking a long time in gecco train command.

Removed

  • --metric argument to the gecco run CLI command.

0.2.2

11 Jan 15:59
v0.2.2

Changed

  • Domain and Gene can now carry qualifiers that are used when they
    are translated to a sequence feature.

Added

  • InterPro names, accessions, and HMMER e-value for each annotated domain
    in GenBank output files.

0.2.1

11 Jan 15:59
v0.2.1

Fixed

  • Various potential crashes in ClusterRefiner code.

Removed

  • Unneeded feature dictionary filtering in ClusterCRF for models with
    Fisher Exact Test feature selection.

0.2.0

11 Jan 15:59
v0.2.0

Fixed

  • pandas warning about unsorted columns in gecco run.

Removed

  • Gene.probability property, replaced by Gene.maximum_probability and
    Gene.average_probability properties to be explicit.
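
The difference between the two replacement properties can be sketched as follows. This is not the GECCO implementation: the `Domain` and `Gene` classes below are simplified stand-ins, and it is an assumption that both properties aggregate per-domain probabilities and return `None` when no domain has one.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class Domain:
    """Simplified stand-in for a domain annotation."""
    name: str
    probability: Optional[float] = None  # BGC probability, if predicted


@dataclass
class Gene:
    """Simplified stand-in for a gene carrying domain annotations."""
    id: str
    domains: List[Domain] = field(default_factory=list)

    def _probabilities(self) -> List[float]:
        # Ignore domains that have not been given a probability yet.
        return [d.probability for d in self.domains if d.probability is not None]

    @property
    def maximum_probability(self) -> Optional[float]:
        probabilities = self._probabilities()
        return max(probabilities) if probabilities else None

    @property
    def average_probability(self) -> Optional[float]:
        probabilities = self._probabilities()
        return mean(probabilities) if probabilities else None
```

With two domains at 0.25 and 0.75, the maximum is 0.75 and the average is 0.5, which shows why a single `probability` property was ambiguous.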

Changed

  • Internal model now uses Pfam and Tigrfam with the top 35% features
    selected with Fisher's Exact Test.
  • ClusterRefiner now removes genes on Cluster edges if they do not
    contain any domain annotation.
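
The edge-trimming behaviour described above can be sketched on a plain list of per-gene domain counts. The function name and its input format are hypothetical and are not taken from the ClusterRefiner code.

```python
from typing import List, Sequence


def trim_unannotated_edges(domain_counts: Sequence[int]) -> List[int]:
    """Return the indices of genes kept after dropping domain-free genes at both edges.

    `domain_counts[i]` is the number of domain annotations on gene `i`
    (hypothetical input format). Genes without domains that sit between
    annotated genes are kept.
    """
    start, end = 0, len(domain_counts)
    # Advance past unannotated genes on the left edge.
    while start < end and domain_counts[start] == 0:
        start += 1
    # Retreat past unannotated genes on the right edge.
    while end > start and domain_counts[end - 1] == 0:
        end -= 1
    return list(range(start, end))
```

For counts `[0, 2, 0, 1, 0, 0]`, genes 1 to 3 are kept: the interior gene without domains stays, while the edge genes are removed.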

0.1.1

11 Jan 15:59
v0.1.1

Added

  • ClusterCRF.predict_probabilities to annotate a list of Gene objects.

Changed

  • BGC probability is now stored at the Domain level instead of at the Gene
    level, independently of the feature extraction level used by the CRF.
  • ClusterKNN will use the model path provided to gecco run if any.

Docs

  • Added this changelog file to document changes in the code.
  • Added documentation to gecco submodules that were missing it.
  • Included the CHANGELOG.md file in the generated docs.

0.1.0

11 Jan 15:59
v0.1.0

Initial release.

0.0.1

11 Jan 15:59
v0.0.1

Proof-of-concept.