MSc-Y1-S1-W10-Thu-Lang-Eng-Python-12-2h-Lecture

Description

MSc-Y1-S1-W10-Thu-Lang-Eng-Python-12-2h-Lecture | Summary attempt

Content

NLTK

Tokenisation

To find page:

Step 1:

Step 2: Click through to "lexical analysis", which links to the relevant section of that page.
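As a rough illustration of tokenisation, the sketch below splits a sentence into word and punctuation tokens. It assumes NLTK's regex-based `wordpunct_tokenize`, which needs no extra data downloads. If NLTK is not installed, it falls back to a plain regular expression with the same pattern. The example sentence is made up for illustration and is not taken from the lecture.

```python
import re

try:
    # Regex-based tokeniser shipped with NLTK; no corpus download required.
    from nltk.tokenize import wordpunct_tokenize
except ImportError:
    # Fallback (assumption: NLTK is unavailable) using the same pattern:
    # runs of word characters, or runs of non-space punctuation.
    def wordpunct_tokenize(text):
        return re.findall(r"\w+|[^\w\s]+", text)

# Hypothetical example sentence.
sentence = "The cats aren't running."
tokens = wordpunct_tokenize(sentence)
print(tokens)
# ['The', 'cats', 'aren', "'", 't', 'running', '.']
```

Note that this simple tokeniser splits the contraction "aren't" into three tokens. Other NLTK tokenisers, such as `word_tokenize`, handle contractions differently but require additional data to be downloaded first.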

Stemming - "reducing inflected (or sometimes derived) words to their word stem, base or root form—generally a written word form." Stemming | Wikipedia

Lemmatization - "the process of grouping together the inflected forms of a word so they can be analysed as a single item, identified by the word's lemma, or dictionary form." Lemmatization | Wikipedia
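The difference between the two definitions above can be sketched with a toy example. The suffix-stripping function and the lookup table below are hypothetical and deliberately naive. They are not the Porter algorithm or a real dictionary. In practice NLTK provides `PorterStemmer` and `WordNetLemmatizer` in `nltk.stem`, and the latter requires the WordNet data to be downloaded.

```python
# Toy suffix list (assumption: illustration only, far cruder than Porter).
SUFFIXES = ("ing", "ed", "es", "s")

# Toy lemma dictionary (assumption: hand-written for this example).
LEMMAS = {
    "running": "run",
    "studies": "study",
    "mice": "mouse",
    "better": "good",
    "cats": "cat",
}


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word):
    """Look up the dictionary form; fall back to the lowercased word."""
    word = word.lower()
    return LEMMAS.get(word, word)


for w in ["running", "studies", "mice", "better", "cats"]:
    print(f"{w:10} stem={naive_stem(w):10} lemma={toy_lemmatize(w)}")
```

The stemmer works purely on the written form, so it can produce non-words such as "runn" and "studi", and it leaves irregular forms like "mice" unchanged. The lemmatizer maps each inflected form to a dictionary entry, which is why it can relate "mice" to "mouse" and "better" to "good".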

References

Language Engineering Module