- Add: minor improvements in web-interface.
- Add: make web-interface remember nomenclatural code picked in the previous GET query.
- Add #265: add optional nomenclatural code option to parse names with an ambiguity according to a particular code.
- Add #269: switch to slog from zerolog.
- Fix #271: distinguish between `ex` and `in`.
- Fix #270: missing verbatim authorship for names that look similar to combination uninomial in ICZN.
- Fix #268: if botanical author looks like a combination uninomial, make sure its characters are not normalized.
v1.10.3 - 2024-10-11 Fri
- Fix #266: remove author from species detail for named hybrids.
v1.10.2 - 2024-07-30 Tue
- Add #264: more exceptions.
v1.10.1 - 2024-06-05 Wed
- Add #263: add more exceptions with specific epithets like "complex", "do", "spec".
v1.10.0 - 2024-06-04 Tue
- Add #260: add `candidatus` field for parsed data.
- Add #232: parse names like `subgen. Psammophrynopsis Koch, 1953`.
v1.9.2 - 2024-05-01 Wed
- Add #261: a constructor that returns a pool of gnparsers of a given size.
v1.9.1 - 2023-10-13 Fri
- Add: update modules.
- Fix #259: allow diacritics in any UTF-8 normalization form.
- Fix #258: allow authors with 2 dashes in the name.
- Fix #256: fix normalization where a misplaced year changes the year of original authors.
v1.9.0 - 2023-10-12 Thu
- Add: restore backward compatibility by creating a new flag `--species-group-cut`.
v1.8.0 - 2023-10-11 Wed
- Add #255: normalize stemmed canonical of `Aus bus bus` to `Aus bus`. WARNING: this creates some backward incompatibility.
- Add: sorting uses `slices` package.
v1.7.5 - 2023-09-26 Tue
- Add: CSV and TSV files now provide verbatim authorship instead of the normalized one.
- Add: a few more "termination words"
- Fix #254: treat `fa` as forma.
- Fix #253: process `dem` as an author word for `Von dem Bush` and the like.
- Fix #251: do not process `y` as `and` for `Rafael Arango y Molina`.
- Fix #249: allow `cf` at the end of the strings, cf for infraspecies.
- Fix #248: do not escape double quotes for TSV output.
- Fix #246: ignore `ms` at the end of the strings.
v1.7.4 - 2023-08-22 Tue
- Fix #243: parse correctly `Nassa pagoda var. acuta P. P. Carpenter, 1857`.
v1.7.3 - 2023-06-17 Sat
- Add #241: allow comma before `ex` authors.
v1.7.2 - 2023-03-09 Thu
- Add #240: add `tr.`, `subtr.` as ranks for combo-uninomials.
v1.7.1 - 2023-03-07 Tue
- Add: upgrade all modules.
v1.7.0 - 2023-03-07 Tue
- Add #238: stem takes into account the -ii suffix, `macdonaldii` -> `macdonald`.
v1.6.9 - 2022-11-10 Thu
- Add #237: detect and normalize non-breaking hyphens. If other non-typical hyphens appear, they will be handled the same way.
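
The hyphen normalization described in the entry above can be illustrated with a minimal Go sketch. This is not gnparser's actual code: the helper name `normalizeHyphens` and the choice to handle only U+2011 (non-breaking hyphen) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// hyphenReplacer maps the non-breaking hyphen (U+2011) to the ASCII
// hyphen-minus. Other non-typical hyphens could be added as extra pairs;
// which characters gnparser really handles is not stated in this changelog.
var hyphenReplacer = strings.NewReplacer("\u2011", "-")

// normalizeHyphens is a hypothetical helper, not part of the gnparser API.
// It returns s with every non-breaking hyphen replaced by "-".
func normalizeHyphens(s string) string {
	return hyphenReplacer.Replace(s)
}

func main() {
	// The epithet below contains U+2011 instead of an ASCII hyphen.
	fmt.Println(normalizeHyphens("Aus bus\u2011cus"))
}
```

Running such a replacement before parsing means the grammar only needs to know about one hyphen character.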
v1.6.8 - 2022-10-01 Sat
- Add: update all modules.
v1.6.7 - 2022-08-22 Mon
v1.6.6 - 2022-05-15 Sun
- Add #224: Creation of Nix packages for gnparser.
v1.6.5 - 2022-03-21 Mon
- Add #223: Use PEG parser for preprocess instead of RegEx. This approach gives 15-17% speed increase.
v1.6.4 - 2022-03-19 Sat
- Add #224: Parse correctly Italian authors with `degli`.
v1.6.3 - 2022-02-08 Tue
- Add #222: Improve logs for NSQ, switch to zerolog library.
v1.6.2 - 2022-02-04 Fri
- Fix #221: No parsing for names with `cyanobacterium`.
- Fix #220: `Crenarchaeote enrichment culture clone` should stop parsing at `enrichment`.
- Fix #219: filter out `complex` word during preprocessing for names like `Aegla uruguayana complex`.
v1.6.1 - 2022-02-01 Sat
- Add: use NSQ logger from sfgrp/lognsq
v1.6.0 - 2022-01-22 Sat
- Add #218: enable/disable logs for web-services, allow logs aggregation with NSQd.
v1.5.7 - 2021-11-26 Fri
- Fix: parsed.NormalizeByType preserves period char.
v1.5.6 - 2021-11-21 Sun
- Add #212: Set year from 'ex' authorship as a year of a name. Add 'ex' authors to list of all authors.
- Add #211: PR #214 by @tobymarsden, general approach for `non-...` specific epithets.
- Add #208: PR #210 by @tobymarsden, option to preserve diaereses.
- Fix #213: Stop generating space between `Mc`, `Mac` and the rest of an author name.
v1.5.5 - 2021-11-17 Wed
v1.5.4 - 2021-11-14 Sun
- Add: different approach for normalize-by-type for words.
- Add #205: allow genera starting with De-, Eu-, Le-, Ne- (by @tobymarsden).
- Add #203: allow up to 2 dashes in genera (by @tobymarsden).
v1.5.3 - 2021-11-13 Sat
- Add #202: add NormalizeMore function for Word.
v1.5.2 - 2021-11-10 Wed
- Add #200: support for 'div.' rank in uninomial combinations.
- Add #199: fixes for several names that were not parsed correctly.
- Add #198: parse "Solanum tuberosum wila-k`oyu".
- Add #97: do not parse "Cyanophage".
- Add #85: parse names with a dagger character.
- Add #84: parse "Muscicapa randi Amadon & duPont, 1970".
- Add #83: parse authors like 'Laverde-R.'.
v1.5.1 - 2021-11-01 Mon
- Add #191: support for ambiguous specific epithets
v1.5.0 - 2021-10-22 Fri
- Add #194: support for cultivars' graft-chimeras (courtesy of @tobymarsden)
v1.4.2 - 2021-10-21 Thu
- Add #196: parse authors with prefix 'ver'
v1.4.1 - 2021-10-07 Thu
- Fix #195: parse multinomials where authorship is not separated by space.
v1.4.0 - 2021-09-04 Sat
- Add #193: add TSV format for output.
- Add #190: support prefixes `do` and `de los` for authors.
- Add #187: support `ter` suffix for authors.
- Add #186: support non-ASCII apostrophe in authors.
v1.3.3 - 2021-08-11 Wed
- Add #176: refactoring of hybrid sign treatment (use PEG instead of RegEx for normalizing `x`, `X`, and `×`).
- Add #183: stop parsing after `nec`, `non`, `fide`, `vide`, treat `ms in` as `in` or `ex` for exAuthors.
- Add #182: support for authors with prefixes `ten`, `delle`, `dos`.
v1.3.2 - 2021-08-02 Mon
- Add #182: support `Do`, `Oo`, `Nu` 2-letter genera.
- Add #53: exceptions to annotations (`Bottaria nudum` for example).
- Fix: names where sp epithet starts with `cf` can be parsed now.
v1.3.1 - 2021-07-17 Sat
- Add #180: Zenodo DOI.
v1.3.0 - 2021-06-29 Tue
- Add #179: cultivars info to README.
- Add #178: parse cultivars via REST API.
- Add #177: parse botanical cultivars via web.
- Add #173: cultivars parsing #174 @tobymarsden.
- Add #172: authors initials with a dash like "B.-E.van Wyk".
- Add: tests for cultivars (Toby Marsden)
- Fix #174: Hybrid character is missed or wrong in details' `Words` section.
v1.2.0 - 2021-04-08 Thu
v1.1.0 - 2021-03-21 Sun
- Add #163: support bacterial `Candidatus` names.
- Add #162: show PEG AST tree for debugging.
- Add #161: add automatic tools dependency.
- Add #160: use embed feature of Go v1.16.
v1.0.13 - 2021-02-23 Tue
- Add: limit nightly builds to master only.
- Fix #159: POST method contains wrong URL.
v1.0.12 - 2021-02-21 Sun
- Add #154: parse names with ambiguous `f.` as forma if there is a space between author and `f.`. If there is no space, parse as `filius`. Give ambiguity warning in both cases.
- Add: PHP example from @barotto about using pipes with gnparser.
v1.0.11 - 2021-02-20 Sat
- Fix #153: flags `csv=false` and `with_details=false` trigger opposite behavior.
v1.0.10 - 2021-02-19 Fri
- Add #152: change auto-prereleases from nightly to on master submit.
- Add #151: do not parse names with `(endo|ecto)?symbiont`.
- Add #150: ignore serovar/serotype in bacterial names.
- Add #149: support abbreviated subgenus (`Aus (B.) cus`).
v1.0.9 - 2021-02-17 Wed
- Add #146: unordered flag.
- Add #145: better CI/build actions, add nightly binaries.
- Fix #144: remove configuration file as it creates more problems than it solves.
v1.0.8 - 2021-02-15 Mon
- Add: remove config message for CLI app.
- Add: ldflags `-s -w` to decrease binary size.
- Fix: header does not show in CSV format for stream.
v1.0.7 - 2021-02-14 Sun
- Add #143: `quiet` flag to suppress showing progress output.
- Fix #142: stream waits until the number of names is equal to the batch size.
- Fix #141: config file is not created.
v1.0.6 - 2021-02-04 Thu
- Add: update version handling, readme.
v1.0.5 - 2021-02-01 Mon
- Add: remove gnlib package.
- Add #140: remove config package.
v1.0.4 - 2021-01-23 Sat
- Add: clean up constructor method names.
v1.0.3 - 2021-01-23 Sat
- Add #139: make package names less abstract.
v1.0.2 - 2021-01-22 Fri
- Fix #137: add correct VerbatimID for HTML-containing names.
v1.0.1 - 2021-01-20 Wed
- Add #136: Man page
- Add #100: Switch continuous integration to use GitHub Actions.
- Add #129: Make c-binding usable for biodiversity parser.
- Fix #135: Changes: SubGenus->Subgenus, InfraSpecies->Infraspecies
v1.0.0 - 2021-01-19 Tue
- Add #127: Update documentation to v1.0.0.
- Add #122: Implement parsing as a stream in addition to batch parsing.
- Add #126: Update c-binding to v1.0.0.
- Add #131: Add parameters "with_details" and "csv" to REST API.
- Add #134: Transform "positions" section to "words" section.
- Add #128: Add more examples to OpenAPI specification.
- Add #125: Describe changes from v0.x to 1.x.
- Add #132: Add context.Context to control lifespan of `go routines`.
- Add #115: Migrate tests from ginkgo to plain tests.
- Add #109: Move `web` package to `io`.
- Add #124: Document warnings for each quality category.
- Add #121: Convert `package` parser to use interfaces.
- Add #120: CLI app for newly created functionality.
- Add #119: Formatted output for `output.Parsed`.
- Add #117: Convert failed parsing results to `output.Parsed`.
- Add #114: Convert parsing result to `output.Parsed`.
- Add #118: Add `Verbatim` and `Year` fields to the root of `Authorship`.
- Add #107: Move `grammar` package to `entity` and rename to `parser`.
- Add #110: Move `stemmer` to `entity`.
- Add #113: Move `str` package to `entity`.
- Add #112: Move `preprocess` package to `entity`.
- Add #105: Move `fs` package to `io`.
- Add #111: Move `dict` package to `io`.
- Add #106: Describe main use-case via interface.
- Add #104: Add configuration package.
- Add #103: Create an output.Parsed object that can be used in Go and as JSON.
- Add #101: Start using gnlib where it makes sense.
- Add #99: Move code to GitHub and change links accordingly.
- Add #95: Remove dependency on gRPC and protobuf.
v0.14.4 - 2020-12-15 Tue
- Add #96: Do not parse names starting with "Candidatus".
- Add #93: Parse 'y' (Spanish '&') as an author separator.
v0.14.3 - 2020-12-13 Sun
- Add #95: Remove make dependency on gRPC tooling.
- Add #94: Do not parse names with "bacterium" epithet.
v0.14.2 - 2020-05-12 Tue
- Add #90: Allow `ß` in names.
- Add #89: Support `subspec.` as a rank.
- Add #82: Support authors with prefix `zu`.
v0.14.1 - 2020-05-07 Thu
- Fix: Change web API from default to Compact format to get correct API output.
v0.14.0 - 2020-05-07 Thu
- Add #81: Add year range in format "1888/89".
- Add #80: Add Cardinality to parser outputs.
- Add #79: Make CSV the default format for CLI.
- Add #78: Take into account `non-virus` names that look like virus names.
v0.13.1 - 2020-03-05 Thu
- Fix #77: Memory leak when used as clib.
- Fix #76: Non ASCII apostrophe does not show up in canonical.
v0.13.0 - 2020-02-12 Wed
- Add #74: Simple format output is now in CSV format.
- Add #73: Improve speed by using ragel's FSM instead of regex.
- Fix #75: Normalize subspecies to `subsp.` instead of `ssp.`.
- Fix #72: Surrogate detection by `gnparser.ParseToObject` method.
v0.12.0 - 2019-11-18 Mon
- Add #71: do not parse 'Unnamed clade...'.
- Add #69: gnparser as a shared C library.
- Add: Make dynamic version using ldflags.
- Fix #70: parse 'Remera cvancarai' correctly.
v0.11.0 - 2019-10-24 Thu
- Add #68: add stemmed version of canonical form to outputs.
- Add: benchmarks to gnparser_test.go
v0.10.0 - 2019-09-10 Tue
- Add #67: field `authorship` of the name for JSON output
- Add #66: remove HTML tags during parsing instead of a separate step.
- Add #61: handle authors that end with a word "bis".
- Add #60: handle correctly deprecated ranks with Greek letters.
- Fix #62: parser breaks on `Drepanolejeunea (Spruce) (Steph.)`.
v0.9.0 - 2019-08-16 Fri
- Add #65: gRPC is now able to return a protobuf object instead of a JSON string (only for the ParseArray function so far). The same protobuf object is now also used by the gnparser.ParseToObject function.
- Add #64: gRPC method ParseArray that cleans and parses an input from an array of names instead of a stream.
- Add #63: abbreviation for `form` or `forma` is now `f.` instead of `fm.`.
v0.8.0 - 2019-04-10 Wed
- Add #51: strings like `Aus (Bus)` are parsed differently for ICN and ICZN names. If the string inside the parentheses matches a known ICN author name, it is parsed as `Uninomial (Author)`; otherwise it is parsed as `Aus subgen. Bus`.
v0.7.5 - 2019-03-31 Sun
- Add #59: method `ParseToObject` to avoid JSON in Go programs.
- Add #58: parse `Aus (Bus)` as `Uninomial (Author)` to prevent botanical authors from appearing as subgenera. We need a better solution for this.
- Add #57: warning in cases of an ambiguous `filius`.
- Fix #56: bug `Ambrysus-Stål, 1862` breaks parser.
v0.7.4 - 2019-02-12 Tue
- Add #48: transliteration of diacriticals.
- Add #43: notho- (hybrids) rank supported.
- Add #52: genera with hyphens with lower or upper char after hyphen.
- Add #49: multiple hyphens in specific epithet.
v0.7.3 - 2019-02-04 Mon
- Add #54: add cleaning functions to gRPC
- Add #46: add `supg.` rank
- Add #45: add `natio` rank (deprecated ICZN rank)
- Add #44: documentation for canonicalName fields
- Add #42: tests for command line app
v0.7.2 - 2019-02-01 Fri
- Add #41: parse/clean multiple names from standard input.
v0.7.1 - 2019-01-24 Thu
- Add #40: add names with missing parenthesis for combination authors.
- Fix: remove typo for Scala parser URL on the parser web-page.
v0.7.0 - 2019-01-23 Wed
- Add #38: docker image can do gRPC, REST, CLI
- Add #37: flag for cleanup HTML entities and tags, underscores are part of parsing.
- Add #39: documentation for contributors.
- Add #31: continuous integration.
- Add #36: substitute underscores to spaces for Newick format.
- Add #34: escape HTML entities, remove common tags.
- Add #33: Web-based user interface and REST API.
v0.6.0 - 2019-01-16 Wed
- Add #35: gRPC method to preserve order in output according to input
- Add #30: write inline and README documentation.
- Add #29: docker and dockerhub support.
- Add #26: get all parser rules to CamelCase format.
v0.5.1 - 2019-01-15 Tue
- Add: fix Makefile
- Add #28: non-ASCII apostrophe support.
- Add #27: agamosp. agamossp. agamovar. ranks.
- Add #25: reorganize output to be more readable and logical.
- Add #24: gRPC server for receiving name-strings and streaming back the parsed results.
- Add #23: Remove multiple years. Now name can have only one year.
- Add #22: Run the parser against 24 million names from global names index and fix found problems.
- Add #21: Rebuilds tests into `test_data_new.txt` file. It is important for making global changes in tests.
- Add #20: Pass all tests made for Scala gnparser. Tickets 1-19 are about approaching #20.
This document follows changelog guidelines