Releases: mrjleo/fast-forward-indexes

Fast-Forward Indexes v0.7.0

19 Dec 13:46
e922090

Encoders

  • TransformerEncoder class has been made more modular to make extending it easier.
  • TASBEncoder, ContrieverEncoder, and BGEEncoder classes have been added.
  • Defaults for the model argument have been added to Transformer encoder classes.

Misc

  • The python-terrier dependency has been bumped to version 0.12.
  • PyTerrier transformers now call pyterrier.model.add_ranks.

Fast-Forward Indexes v0.6.0

13 Dec 12:58
30a2c37

Toolchain

  • Now uses uv for dependency management.
  • Linting and format checking using ruff has been enabled and configured.
  • Type checking using pyright has been enabled and configured.

Codebase

  • Minimum supported Python version is now 3.10.
  • Type hints have been modernized.
  • Repository has been converted to src-layout.
  • Docstrings have been converted to ReST format.
  • Library is now pyright-compliant (standard mode).
  • py.typed marker has been added.

API changes

  • All indexes and Mode are now imported from fast_forward.index.
  • All quantizers are now imported from fast_forward.quantizer.
  • Indexer is now imported from fast_forward.util.
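
For code written against earlier releases, the import changes listed above amount to the following sketch. InMemoryIndex, OnDiskIndex, Mode, NanoPQ, NanoOPQ, and Indexer are the names mentioned elsewhere in these release notes; the try/except guard and the new_layout_available flag are illustrative additions (not part of the library) so that the snippet also runs where the package is missing or older than v0.6.0.

```python
# Import paths as described in the v0.6.0 API changes.
# The guard is only there so this snippet runs without the package installed.
try:
    from fast_forward.index import InMemoryIndex, Mode, OnDiskIndex
    from fast_forward.quantizer import NanoOPQ, NanoPQ
    from fast_forward.util import Indexer

    new_layout_available = True
except ImportError:
    # Package not installed, or an older version with different module paths.
    new_layout_available = False

print("v0.6.0 import layout available:", new_layout_available)
```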

Fast-Forward Indexes v0.5.1

04 Nov 09:43
9f7e636
  • Ranking.interpolate and Ranking.__add__ (+ operator) now treat missing scores in either ranking as zero.
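
The effect of treating missing scores as zero can be illustrated with plain dictionaries. This is a minimal sketch and not the library's implementation: the helper names interpolate_scores and add_scores are hypothetical, each ranking is reduced to a {doc_id: score} mapping for a single query, and the interpolation is assumed to be a convex combination weighted by alpha.

```python
def interpolate_scores(first, second, alpha):
    """Interpolate two {doc_id: score} mappings; a missing score counts as 0.0.

    Assumption: the result is alpha * first + (1 - alpha) * second.
    """
    doc_ids = set(first) | set(second)
    return {
        d: alpha * first.get(d, 0.0) + (1 - alpha) * second.get(d, 0.0)
        for d in doc_ids
    }


def add_scores(first, second):
    """Add two {doc_id: score} mappings; a missing score counts as 0.0."""
    doc_ids = set(first) | set(second)
    return {d: first.get(d, 0.0) + second.get(d, 0.0) for d in doc_ids}


sparse = {"d1": 10.0, "d2": 6.0}  # "d3" is missing here
dense = {"d2": 2.0, "d3": 4.0}    # "d1" is missing here

combined = interpolate_scores(sparse, dense, alpha=0.5)
summed = add_scores(sparse, dense)
print(sorted(combined.items()))  # [('d1', 5.0), ('d2', 4.0), ('d3', 2.0)]
print(sorted(summed.items()))    # [('d1', 10.0), ('d2', 8.0), ('d3', 4.0)]
```

Documents that appear in only one of the two rankings are kept in the result and contribute a zero from the other side.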

Fast-Forward Indexes v0.5.0

18 Oct 15:55
42722aa

Ranking operations

  • Rankings now implement the + and * operators.
  • Rankings can now be normalized via Ranking.normalize (min-max normalization).
  • Reciprocal rank fusion is now supported via Ranking.rr_scores.
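
The two score transformations named above can be sketched independently of the library. The functions below are hypothetical stand-ins operating on a {doc_id: score} mapping for one query; they are not the code behind Ranking.normalize or Ranking.rr_scores. The constant k=60 is a commonly used value for reciprocal rank fusion and an assumption here, as is the handling of the case where all scores are equal.

```python
def min_max_normalize(scores):
    """Rescale scores linearly so the lowest becomes 0.0 and the highest 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: with identical scores, map everything to 0.0.
        return {d: 0.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def reciprocal_rank_scores(scores, k=60):
    """Replace each score by 1 / (k + rank), where rank 1 is the best document."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {d: 1.0 / (k + rank) for rank, d in enumerate(ordered, start=1)}


scores = {"d1": 12.0, "d2": 8.0, "d3": 4.0}
normalized = min_max_normalize(scores)
rr = reciprocal_rank_scores(scores)
print(normalized)  # {'d1': 1.0, 'd2': 0.5, 'd3': 0.0}
print(rr["d1"] > rr["d2"] > rr["d3"])  # True
```

In this sketch, fusing two rankings would mean computing the reciprocal rank scores of each and summing them per document.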

Misc

  • Index.__call__ now accepts a batch_size argument.

Fast-Forward Indexes v0.4.1

11 Oct 09:54
7c15966
  • Fixed a bug where OnDiskIndex would not respect the resize_min_val argument properly.
  • Fixed a bug where Indexer.from_dicts would ignore the encoder batch size in some cases.
  • Minor updates in the documentation for Indexer.

Fast-Forward Indexes v0.4.0

02 Oct 08:18
c6b0173

Indexer

  • Now supports transferring vectors from one index to another.
  • Now supports automatically training a quantizer during indexing.
  • The encoder has been made optional.

API changes

  • Indexer.index_dicts has been renamed to Indexer.from_dicts.
  • Indexer now takes a batch_size and an encoder_batch_size.
  • Index.__call__: early_stopping_intervals has been renamed to early_stopping_depths.
  • OnDiskIndex: ds_buffer_size has been renamed to max_indexing_size.
  • OnDiskIndex.to_memory: buffer_size has been renamed to batch_size.
  • util.create_coalesced_index: buffer_size has been renamed to batch_size.

Fast-Forward Indexes v0.3.1

05 Sep 09:53
b7db728
  • Optimized product quantization has been implemented via fast_forward.quantizer.nanopq.NanoOPQ.
  • The Index.quantizer property has been added, which allows attaching a quantizer to an existing empty index.
  • Some outdated code snippets in the documentation have been fixed.

Fast-Forward Indexes v0.3.0

25 Aug 14:09
de69cdc

Index operations

  • When calling Index.add, the sequences doc_ids and psg_ids can now contain None elements, as long as each vector has at least one ID.
  • Indexes (vectors and corresponding IDs) can now be iterated over using Index.batch_iter and Index.__iter__.
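
The ID rule for Index.add stated above (either ID may be None, but not both for the same vector) can be expressed as a small check. The function check_ids below is a hypothetical illustration of that constraint, not the validation code used by the library.

```python
def check_ids(num_vectors, doc_ids=None, psg_ids=None):
    """Raise ValueError unless every vector has a document ID or a passage ID."""
    # A sequence that is omitted entirely is treated as all None.
    doc_ids = list(doc_ids) if doc_ids is not None else [None] * num_vectors
    psg_ids = list(psg_ids) if psg_ids is not None else [None] * num_vectors
    if len(doc_ids) != num_vectors or len(psg_ids) != num_vectors:
        raise ValueError("ID sequences must have one entry per vector")
    for i, (doc_id, psg_id) in enumerate(zip(doc_ids, psg_ids)):
        if doc_id is None and psg_id is None:
            raise ValueError(f"vector {i} has neither a document ID nor a passage ID")


# Valid: each of the three vectors has at least one ID.
check_ids(3, doc_ids=["d1", None, "d2"], psg_ids=[None, "p1", "p2"])

# Invalid: the second vector has no ID at all.
try:
    check_ids(2, doc_ids=["d1", None], psg_ids=["p1", None])
    rejected = False
except ValueError:
    rejected = True
print("second call rejected:", rejected)  # second call rejected: True
```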

Vector quantization

  • Indexes now support vector quantization via the fast_forward.quantizer.Quantizer interface.
  • fast_forward.quantizer.nanopq.NanoPQ implements product quantization based on nanopq.
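
For readers unfamiliar with product quantization, the idea can be sketched in a few lines of plain Python: a vector is split into sub-vectors, and each sub-vector is replaced by the index of its nearest centroid in a per-subspace codebook. The codebooks below are hand-picked for illustration rather than trained, and the function names are hypothetical; this sketch uses neither nanopq nor the Quantizer interface.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vector, codebooks):
    """Encode a vector as one centroid index per subspace."""
    sub_len = len(vector) // len(codebooks)
    codes = []
    for m, codebook in enumerate(codebooks):
        sub = vector[m * sub_len:(m + 1) * sub_len]
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(sub, codebook[i]),
        )
        codes.append(nearest)
    return codes


def pq_decode(codes, codebooks):
    """Reconstruct an approximate vector by concatenating the chosen centroids."""
    approx = []
    for code, codebook in zip(codes, codebooks):
        approx.extend(codebook[code])
    return approx


# Two subspaces of dimension 2, each with two hand-picked centroids.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
vector = [0.9, 1.2, 0.8, 0.1]
codes = pq_encode(vector, codebooks)
approx = pq_decode(codes, codebooks)
print(codes)   # [1, 1]
print(approx)  # [1.0, 1.0, 1.0, 0.0]
```

Only the codes need to be stored per vector, which is where the reduction in index size comes from; the reconstruction is lossy.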

Misc

  • The default ranking mode has been changed to MAXP.
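
Assuming MAXP has its usual meaning in passage-based retrieval, a document's score is the maximum score among its passages. The sketch below illustrates that aggregation with plain dictionaries; the function maxp_scores and its arguments are hypothetical and do not reflect how the library stores or scores passages.

```python
def maxp_scores(passage_scores, passage_to_doc):
    """Aggregate passage scores to document scores by taking the maximum."""
    doc_scores = {}
    for psg_id, score in passage_scores.items():
        doc_id = passage_to_doc[psg_id]
        if doc_id not in doc_scores or score > doc_scores[doc_id]:
            doc_scores[doc_id] = score
    return doc_scores


passage_scores = {"p1": 0.2, "p2": 0.9, "p3": 0.5}
passage_to_doc = {"p1": "d1", "p2": "d1", "p3": "d2"}
doc_scores = maxp_scores(passage_scores, passage_to_doc)
print(doc_scores)  # {'d1': 0.9, 'd2': 0.5}
```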

API changes

  • The dim argument has been removed from OnDiskIndex and InMemoryIndex.
  • The dtype argument has been removed from OnDiskIndex.

Fast-Forward Indexes v0.2.1

13 Aug 14:06
26ee1cf
  • Transformer-based encoders now use torch.no_grad
  • Requirements have been made more precise by fixing the major versions
  • Minor optimizations for early stopping
  • Minor fixes in the documentation

Fast-Forward Indexes v0.2.0

10 Mar 20:36
4fbca6e

Index structures

  • New: OnDiskIndex is based on HDF5 and can be accessed on-demand from disk
  • Indexes can now grow dynamically in size

Performance

  • Data is now represented using pandas data frames internally
  • Many operations have been vectorized to improve performance
  • Early stopping now works in batches rather than per query

Misc

  • New: Indexer class for indexing corpora
  • New: PyTerrier transformers are provided for scoring and interpolation using Fast-Forward indexes

API changes

Many parts of the API have changed. Some of the most important breaking changes:

  • Scores are now computed using Index.__call__
  • Queries are no longer provided explicitly; they are attached to the ranking instead
  • InMemoryIndex objects can no longer be saved to or loaded from disk