Natural Language Processing (MADE S02E02)

This repository contains materials for the Natural Language Processing course.

Tip #1:

Downloading the entire repository can take a considerable amount of time. A single folder can be downloaded on its own via DownGit.

Tip #2:

Sometimes GitHub fails to render a notebook. In that case, use nbviewer — it works like a charm!

Tip #3:

If nbviewer fails to find a notebook that GitHub renders just fine, try appending ?flush_cache=false to the end of the nbviewer link.

Each week's materials include slides, code, and a video recording.

| Week | What | When |
| --- | --- | --- |
| 1 | Tasks in NLP; text preprocessing: tokenization, normalization (stemming, lemmatization); feature extraction: Bag-of-Words, Bag-of-N-grams, TF-IDF; word embeddings: one-hot, matrix factorization, word2vec (CBOW, Skip-gram), GloVe (see the Bag-of-Words/TF-IDF sketch after the table). | 10.03.2021 |
| 2 | Embeddings recap (word2vec) and their use in unsupervised translation; cosine distance; RNNs, CNNs, n-grams, and usage examples. | 17.03.2021 |
| 3 | Recap: RNN; LSTM and its gates; RNNs as encoders for sequential data; vanishing and exploding gradient problems. | 24.03.2021 |
| 4 | Neural Machine Translation (NMT): problem statement, historical overview, statistical MT, beam search, BLEU and perplexity scores; encoder-decoder architecture, attention. | 31.03.2021 |
| 5 | Recap: attention in seq2seq; Transformer architecture, self-attention (see the self-attention sketch after the table). | 07.04.2021 |
| 6 | Recap: self-attention; positional encoding, layer normalization, the Transformer decoder. | 14.04.2021 |
| 7 | OpenAI Transformer (pre-training a decoder for language modeling), ELMo (deep contextualized word representations), BERT. | 21.04.2021 |
| 8 | ULMFiT, Transformer-XL, Question Answering (SQuAD, SberQuAD, ODQA), GPT. | 28.04.2021 |
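As a quick illustration of the week 1 feature-extraction topics, here is a minimal Bag-of-Words / TF-IDF sketch using scikit-learn. It is not part of the course materials; the toy corpus and variable names are invented for the example.

```python
# Minimal Bag-of-Words / TF-IDF sketch (illustrative only; toy corpus invented for the example).
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are friends",
]

# Bag-of-Words: raw token counts per document.
bow = CountVectorizer()
X_bow = bow.fit_transform(corpus)
print(bow.get_feature_names_out())
print(X_bow.toarray())

# TF-IDF: counts reweighted by inverse document frequency,
# so terms that appear in every document get downweighted.
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(corpus)
print(X_tfidf.toarray().round(2))
```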
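For the weeks 5–6 material, here is a minimal NumPy sketch of scaled dot-product self-attention. Again, this is illustrative rather than the course's reference implementation; the shapes, seed, and weight matrices are arbitrary.

```python
# Minimal single-head scaled dot-product self-attention sketch (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence X of shape (seq_len, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project inputs to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # scaled dot-product similarities
    weights = softmax(scores, axis=-1)        # attention distribution per position
    return weights @ V                        # weighted sum of value vectors

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```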
