This repository has been archived by the owner on May 3, 2023. It is now read-only.

v0.23.0

MartinBernstorff released this 26 Apr 10:16 · 25 commits to main since this release

Feature

  • Add logging and choose sfi types (d5f8e23)
  • Create example scripts (76e063a)
  • Initial text model pipelines (1934db0)
  • Add tests (d7a8bab)
  • Initial simple preprocessing pipeline for all sfis (f941a4d)
  • Add include_sfi_name in load_text_split (4605c88)
  • include_sfi_name arg (58baf9a)
  • Fit and load tfidf, bow, and lda models (3d33d9b)

Fix

  • Preprocess to one regex (c716653)
  • Remove symbols again (1210b7e)
  • Changes based on HLasse's comments (32da48f)
  • Insert model type in filename (1457387)
  • Add docstrings to preprocessing functions (4e27650)
  • Remove log.info and small fixes (84f3cc3)
  • Ruff fixes (ea9c564)
  • Return vectorizer and matrix + clean-up (e1c48a0)
  • Query string (cb7424c)
  • Naming and docstring update (141e52a)
  • General clean-up and change corpus in fit functions to list (22b6a9e)
  • Change ngram default and clean-up (387f845)
  • Small fixes to logging (c3a3f53)
  • Remove old comments (4b88514)
  • Change view name (a9bb0fc)
  • Move save_text_model_to_dir to utils (469df3b)
  • Move save_text_model_to_dir to utils (26a80d2)
  • Renaming in preprocessing (c381768)
  • Remove stop_words arg and return models (3d29012)
  • Change arg path to path_str (f781a74)
  • Enable multiple splits when loading data + add n_rows arg (8ae2d2e)
  • Remove Path from arg (29b442b)