This repository contains the homework assignments and the final project of the DeepPavlov Deep Learning for NLP course.
The purpose of this assignment was to get familiar with the PyTorch library. The first sub-part was to implement a flatten function: given an input tensor of shape N x C x H x W, return a tensor of shape N x (C*H*W). The second sub-part was to implement the forward pass of a two-layer fully-connected ReLU network, and the third part was to explore the different types of the PyTorch Module API.
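A condensed sketch of the first two sub-parts (the exact assignment signatures may differ; biases are omitted for brevity):

```python
import torch

def flatten(x):
    # Reshape (N, C, H, W) -> (N, C*H*W), keeping the batch dimension.
    return x.view(x.shape[0], -1)

def two_layer_fc(x, params):
    # Forward pass of a two-layer fully-connected ReLU network.
    # params = (w1, w2) are weight matrices.
    w1, w2 = params
    x = flatten(x)
    h = torch.relu(x.mm(w1))   # hidden layer with ReLU non-linearity
    scores = h.mm(w2)          # output scores, no activation
    return scores
```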
Assignment 2 involved the exploration of a Word2Vec model of our choice (CBOW in my case). After 10 epochs my Negative Log-Likelihood loss was around 6706. Below is a visualization of 300 word-vector representations.
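A condensed sketch of the CBOW architecture (embedding size and other hyper-parameters are illustrative assumptions):

```python
import torch
import torch.nn as nn

class CBOW(nn.Module):
    # Predict a centre word from the averaged embeddings of its context words.
    def __init__(self, vocab_size, emb_dim=100):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, emb_dim)
        self.linear = nn.Linear(emb_dim, vocab_size)

    def forward(self, context):                      # context: (batch, 2*window)
        emb = self.embeddings(context).mean(dim=1)   # (batch, emb_dim)
        return torch.log_softmax(self.linear(emb), dim=-1)

# Trained with nn.NLLLoss() on (context words, centre word) pairs.
```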
The task was composed of two parts. In part one, we had to implement a character-based language model with a recurrent (GRU) architecture. Part two consisted of exploring different model architectures for a text-classification task (models used: bidirectional LSTM, bidirectional two-layer LSTM, and a few convolutional neural networks).
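A condensed sketch of the part-one character-level GRU model (layer sizes are illustrative assumptions):

```python
import torch.nn as nn

class CharRNN(nn.Module):
    # Character-level language model: embed -> GRU -> per-step vocabulary logits.
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, hidden=None):      # x: (batch, seq_len) of character ids
        emb = self.embedding(x)
        out, hidden = self.gru(emb, hidden)
        return self.head(out), hidden        # logits for the next character
```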
1. Data
For my experiments I have taken two datasets: RuTweetCorp and Kaggle: Twitter Sentiment Analysis. The Russian dataset consists of 114 911 positive and 111 923 negative tweets; the English dataset consists of 56 457 positive and 43 532 negative tweets.
2. Experiment setup
The purpose of these experiments was to explore whether the multilingual BERT (M-BERT) model can transfer knowledge from one language to another on a classification task. The first fine-tuning step was therefore done on the English dataset (99 989 samples), followed by fine-tuning on the train split of the Russian data (181 467 samples). The table below shows classification metrics on the Russian test split (22 684 samples).
Model | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
Raw M-BERT | 49.5% | 49.5% | 99.9% | 66.2% |
MBERT-Ru | 99.9% | 99.9% | 99.9% | 99.9% |
MBERT-En | 75.7% | 73.4% | 79.8% | 76.5% |
MBERT-En-Ru | 99.9% | 100% | 99.9% | 99.9% |
Raw RuBERT | 50.0% | 70.8% | 1% | 3% |
RuBERT | 99.9% | 100% | 99.8% | 99.9% |
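As an illustration of the fine-tuning procedure described above, here is a minimal sketch using the Hugging Face transformers library; the model name, hyper-parameters, and training loop are assumptions rather than the exact code used for these experiments.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def finetune(texts, labels, epochs=1, batch_size=32):
    # One pass of supervised fine-tuning on (tweet, sentiment label) pairs.
    model.train()
    for _ in range(epochs):
        for i in range(0, len(texts), batch_size):
            enc = tokenizer(texts[i:i + batch_size], padding=True,
                            truncation=True, return_tensors="pt")
            y = torch.tensor(labels[i:i + batch_size])
            loss = model(**enc, labels=y).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

# MBERT-En-Ru corresponds to calling finetune() first on the English tweets
# and then on the Russian training split with the same model instance.
```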
Another task that we tried to explore was Boolean Question Answering (BoolQ). The obvious solution was to formulate it as a binary classification task.
1. Data
Lacking Russian BoolQ training data, we took data from a related task, Natural Language Inference [Data source]. Since NLI has three target classes ('contradiction', 'neutral', 'entailment'), we tried different mappings to binary labels: the first approach was to map 'contradiction' and 'neutral' to 0 and 'entailment' to 1; the second approach was to remove the 'neutral' samples from the dataset.
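Both mappings can be expressed in a few lines (the sample format and helper below are illustrative, not code from this repository):

```python
# Two ways of turning 3-class NLI labels into binary BoolQ-style targets.
NLI_TO_BINARY = {"contradiction": 0, "neutral": 0, "entailment": 1}  # approach 1

def map_nli(samples, drop_neutral=False):
    # samples: list of (premise, hypothesis, label) triples
    mapped = []
    for premise, hypothesis, label in samples:
        if drop_neutral and label == "neutral":   # approach 2
            continue
        mapped.append((premise, hypothesis, NLI_TO_BINARY[label]))
    return mapped
```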
2. Experiment setup
We have tried to train and evaluate the following setups (a sketch of the two-stage fine-tuning and evaluation is shown after the list):
- Fine-tune Multilingual-BERT on English Bool Questions and evaluate on Russian Bool Questions
- Fine-tune Multilingual-BERT on English Bool Questions first, then fine-tune on Russian Bool Questions
- Fine-tune Multilingual-BERT on the Russian NLI dataset first, then fine-tune on Russian Bool Questions
- Fine-tune Multilingual-BERT on the Russian NLI dataset and evaluate on Russian Bool Questions
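Reusing the hypothetical `finetune()` helper, `model`, and `tokenizer` from the sentiment sketch above, the two-stage setups reduce to consecutive calls; the `load_boolq()` data helpers are placeholders for illustration, not code from this repository.

```python
from sklearn.metrics import accuracy_score, f1_score

# Placeholder loaders: each returns parallel lists of "question [SEP] passage"
# strings and 0/1 answers. They are assumptions, not part of this repo.
en_texts, en_labels = load_boolq("en")
ru_texts, ru_labels = load_boolq("ru", split="train")
test_texts, test_labels = load_boolq("ru", split="test")

# "M-BERT En-BoolQ; Ru-BoolQ": English fine-tuning followed by Russian.
finetune(en_texts, en_labels)
finetune(ru_texts, ru_labels)

# Evaluate on the Russian BoolQ test split.
model.eval()
with torch.no_grad():
    enc = tokenizer(test_texts, padding=True, truncation=True, return_tensors="pt")
    preds = model(**enc).logits.argmax(dim=-1).tolist()

print(accuracy_score(test_labels, preds), f1_score(test_labels, preds))
```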
Results of the experiments are presented in the table below.
Model | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
M-BERT En-BoolQ | 31.9% | 31.9% | 100% | 48.3% |
M-BERT En-BoolQ; Ru-BoolQ | 73.8% | 74.5% | 98.6% | 84.9% |
M-BERT Ru-NLI; Ru-BoolQ | 74.5% | 74.5% | 100% | 85.4% |
Ru-BERT Ru-NLI | 39.2% | 32.6% | 84.9% | 47.1% |
From the table above we can see that the models that were not fine-tuned on Russian Boolean Questions perform poorly compared to those that were.