Releases: dsfsi/textaugment

2.0.0 16-11-2023

16 Nov 20:54
  • now supports gensim >= 4

  • now supports fastText models

  • enhanced code to allow user to select top n words (synonyms/most similar words)

  • added punctuation insertion
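
As an illustration of what punctuation insertion can look like, here is a minimal, library-independent sketch. It assumes an AEDA-style scheme, where a few punctuation marks are inserted at random word positions; the function name `insert_punctuation`, the `ratio` parameter, and the punctuation set are hypothetical and are not taken from the TextAugment API.

```python
import random

# Assumed punctuation set for this sketch (not taken from TextAugment).
PUNCTUATION = [".", ";", "?", ":", "!", ","]


def insert_punctuation(sentence, ratio=0.3, rng=random):
    """Insert random punctuation marks before randomly chosen words.

    `ratio` (expected to be in (0, 1]) caps the number of insertions
    at a fraction of the sentence length.
    """
    words = sentence.split()
    if not words:
        return sentence
    # Insert at least one mark, at most ratio * len(words) marks.
    n_insert = rng.randint(1, max(1, int(ratio * len(words))))
    positions = set(rng.sample(range(len(words)), n_insert))
    out = []
    for i, word in enumerate(words):
        if i in positions:
            out.append(rng.choice(PUNCTUATION))
        out.append(word)
    return " ".join(out)
```

The original words keep their order; only the inserted marks differ between calls, so the label of a training example is normally left unchanged.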

1.3.4 05-11-2020

05 Nov 15:04
  • Fixed minor issues

1.3.3 21-10-2020

21 Oct 14:16
7f8af41
  • Added support for fastText augmentation
  • Added example notebook for fastText augmentation

1.3.2 10-06-2020

10 Jun 12:46
  • minor updates

1.3.1 29-05-2020

30 May 22:14
  • fix minor issues

1.3 29-05-2020

28 May 22:21
  • added mixup augmentation algorithm for NLP
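
For readers unfamiliar with mixup, the core idea is to blend two training examples and their labels with a weight drawn from a Beta distribution. The sketch below shows that idea on plain Python lists; the function name `mixup` and its parameters are hypothetical, and this is not the implementation shipped in TextAugment.

```python
import random


def mixup(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Blend two (features, one-hot label) pairs.

    The mixing weight lam is drawn from Beta(alpha, alpha), so it
    always lies in [0, 1].
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam
```

For text, the feature vectors would typically be embeddings or other numeric representations of the sentences, since raw tokens cannot be interpolated directly.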

1.2 23-05-2020

23 May 19:34
  • Added support for EDA algorithm
  • Added examples using Jupyter notebook
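
EDA (Easy Data Augmentation) combines four simple operations: synonym replacement, random insertion, random swap, and random deletion. The sketch below illustrates the two operations that need no external word list, random swap and random deletion. The function names and parameters are hypothetical and may differ from the TextAugment API.

```python
import random


def random_swap(words, n=1, rng=random):
    """Swap two randomly chosen words, n times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p=0.2, rng=random):
    """Drop each word with probability p, always keeping at least one."""
    words = list(words)
    if not words:
        return words
    kept = [w for w in words if rng.random() >= p]
    # Fall back to a single random word if everything was deleted.
    return kept if kept else [rng.choice(words)]
```

Both functions work on token lists, so a sentence would be split before augmentation and joined again afterwards.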

1.1, 16-07-2019

16 Jul 07:47

Updated ReadMe and icons.

  • Added licence icon.
  • Added release icon.
  • Added wheel icon.
  • Added Python version icon.

Added pre-print paper citation.

Initial release, 16-07-2019

16 Jul 07:20

TextAugment is a Python 3 library for augmenting text for natural language processing applications. TextAugment stands on the giant shoulders of NLTK, Gensim, and TextBlob and plays nicely with them.

Requirements

  • Python 3

The following software packages are dependencies and will be installed automatically.

$ pip install numpy nltk gensim textblob googletrans

The following code downloads the WordNet corpus, the Punkt tokenizer, and the part-of-speech tagger model.

import nltk

nltk.download('wordnet')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

Install from pip [Recommended]

$ pip install textaugment

How to use

>>> from textaugment import Word2vec
>>> t = Word2vec(model='path/to/gensim/model')  # or pass the loaded gensim model itself
>>> t.augment('The stories are good')
The films are good

Citation

@article{marivate2019improving,
  title={Improving short text classification through global augmentation methods},
  author={Marivate, Vukosi and Sefara, Tshephisho},
  journal={arXiv preprint arXiv:1907.03752},
  year={2019}
}

Built with ❤ on Python