NLP Related
Tags: Machine Learning, Neural Networks, NLP, Text Classification
Let's look at the main approaches to NLP tasks that we have at our disposal, and then at the concrete NLP tasks we can tackle with them.
Although NLP and text mining are not the same thing, they are closely related, deal with the same raw data type, and have some crossover in their uses. Let's discuss the steps in approaching these types of tasks.
Tags: Modeling, Natural Language Processing, NLP, Text Analytics, Text Mining
When performing a natural language processing task, our text data transformation proceeds more or less in this manner:
raw text corpus → processed text → tokenized text → corpus vocabulary → text representation
Keep in mind that this all happens prior to the actual NLP task even beginning.
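The stages above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the function names (`preprocess`, `tokenize`, `build_vocabulary`, `bag_of_words`) are hypothetical, the cleaning rule is deliberately crude, and a bag-of-words count vector stands in for whatever text representation a given task actually calls for.

```python
import re

def preprocess(text):
    # Processed text: lowercase, and replace anything that is not a
    # letter, digit, or whitespace with a space (a simplistic assumption).
    return re.sub(r"[^a-z0-9\s]", " ", text.lower())

def tokenize(text):
    # Tokenized text: split on runs of whitespace.
    return text.split()

def build_vocabulary(tokenized_docs):
    # Corpus vocabulary: map each distinct token to an integer index,
    # in order of first appearance.
    vocab = {}
    for doc in tokenized_docs:
        for tok in doc:
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab

def bag_of_words(tokens, vocab):
    # Text representation: one count per vocabulary entry.
    vec = [0] * len(vocab)
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec

raw_corpus = ["The cat sat.", "The dog sat, too!"]
processed = [preprocess(doc) for doc in raw_corpus]
tokenized = [tokenize(doc) for doc in processed]
vocab = build_vocabulary(tokenized)
vectors = [bag_of_words(tokens, vocab) for tokens in tokenized]
```

Every step here runs before any model sees the data, which is the point of the pipeline: the downstream task only ever consumes `vectors`.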
The corpus vocabulary is a holding area for processed text before it is transformed into some representation for the impending task, whether that is classification, language modeling, or something else.
The vocabulary serves a few primary purposes:
- help in the preprocessing of the corpus text
- serve as a storage location in memory for the processed text corpus
- collect and store metadata about the corpus
- allow for pre-task munging, exploration, and experimentation
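As a rough sketch of how one object could cover these purposes, consider the small class below. It is an assumption-laden illustration rather than any particular library's vocabulary: the class name `Vocabulary`, its attributes, and the `trim` method are all hypothetical names chosen for this example.

```python
from collections import Counter

class Vocabulary:
    """Hypothetical in-memory vocabulary for a tokenized corpus."""

    def __init__(self):
        self.token_to_id = {}          # storage: token -> integer index
        self.id_to_token = []          # storage: integer index -> token
        self.token_counts = Counter()  # metadata: corpus-wide frequencies
        self.num_documents = 0         # metadata: number of documents seen
        self.longest_document = 0      # metadata: longest document, in tokens

    def add_document(self, tokens):
        # Record one tokenized document and update the metadata.
        self.num_documents += 1
        self.longest_document = max(self.longest_document, len(tokens))
        for tok in tokens:
            self.token_counts[tok] += 1
            if tok not in self.token_to_id:
                self.token_to_id[tok] = len(self.id_to_token)
                self.id_to_token.append(tok)

    def trim(self, min_count):
        # Pre-task munging: drop tokens seen fewer than min_count times
        # and reassign indices to the tokens that remain.
        kept = [t for t in self.id_to_token if self.token_counts[t] >= min_count]
        self.id_to_token = kept
        self.token_to_id = {t: i for i, t in enumerate(kept)}
        self.token_counts = Counter({t: self.token_counts[t] for t in kept})

vocab = Vocabulary()
for doc in [["the", "cat", "sat"], ["the", "dog", "sat", "too"]]:
    vocab.add_document(doc)
```

Calling `vocab.trim(2)` afterwards would keep only the tokens that occur at least twice, which is the kind of exploration and experimentation the vocabulary makes cheap before the task itself begins.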
Tags: NLP, Representation, Text Mining, Word Embeddings, word2vec