Phrase Seeker is a Python library that searches for phrases in a text, regardless of their form or intervening words. It was developed in February 2019 to perform text analysis for a local scientific conference.
- Search texts for phrases.
- Search for multiple phrases at once.
- Find phrases even if they aren't in their normalized forms.
- Find phrases even if they have extra words in between (e.g. adjectives).
- Get the sentence where the phrase was found.
- Get the location of the sentence in the text.
- Python 3.7
$ git clone git@github.com:kirillgashkov/phrase-seeker.git
$ cd phrase-seeker
$ pip install -r requirements.txt
Note: by default, the seeking function does not leave a cache behind. To keep the cache, pass should_delete_cache=False as an additional argument to the function. However, if the phrases change, you must delete the cache before using the function again (call phrase_seeker.delete_cache() to do so).
from phrase_seeker import seek_phrases_in_text
text = "Insert your awesome text here"
phrases = ["inserted text"]
matches = seek_phrases_in_text(phrases, text)
for match in matches:
    # Each match exposes the phrase that was found and the sentence containing it.
    print(match.phrase.text)
    print(match.sentence.start, match.sentence.end, '-', match.sentence.text)
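To make the idea of matching "regardless of form or intervening words" concrete, here is a minimal, self-contained sketch. It is not Phrase Seeker's implementation: the names ToyMatch, toy_normalize, toy_seek and the max_gap parameter are hypothetical, and the crude suffix stripping is only a stand-in for real normalization (lemmatization).

```python
import re
from dataclasses import dataclass


@dataclass
class ToyMatch:
    phrase: str
    sentence: str
    start: int  # offset of the sentence's first character in the text
    end: int    # offset just past the sentence's last character


def toy_normalize(word):
    """Crude stand-in for lemmatization: lowercase and strip one common suffix."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def _contains(words, target, max_gap):
    """Greedy check: target words appear in order, with at most max_gap
    extra words between each consecutive pair."""
    for i, word in enumerate(words):
        if word != target[0]:
            continue
        pos, found = i, True
        for wanted in target[1:]:
            window = words[pos + 1 : pos + 2 + max_gap]
            if wanted not in window:
                found = False
                break
            pos += 1 + window.index(wanted)
        if found:
            return True
    return False


def toy_seek(phrases, text, max_gap=2):
    """Return a ToyMatch for every (phrase, sentence) pair that matches."""
    matches = []
    # Naive sentence splitting on '.', '!' and '?', keeping character offsets.
    for m in re.finditer(r"[^.!?]+[.!?]*", text):
        raw = m.group()
        sentence = raw.strip()
        if not sentence:
            continue
        start = m.start() + (len(raw) - len(raw.lstrip()))
        words = [toy_normalize(w) for w in re.findall(r"[A-Za-z']+", sentence)]
        for phrase in phrases:
            target = [toy_normalize(w) for w in phrase.split()]
            if target and _contains(words, target, max_gap):
                matches.append(ToyMatch(phrase, sentence, start, start + len(sentence)))
    return matches


text = "We inserted some awesome texts here. Nothing else matters."
for match in toy_seek(["insert text"], text):
    print(match.phrase, "|", match.start, match.end, "-", match.sentence)
```

The sketch finds "insert text" in the first sentence even though the text says "inserted" and "texts" with two words in between. Lowering max_gap makes the match stricter; a real implementation would likely rely on a proper NLP pipeline rather than suffix stripping.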
Distributed under the MIT License. See LICENSE.md for details.