Siamese-LSTM-for-Semantic-Similarity-PyTorch

This repository contains a PyTorch implementation of a deep learning pipeline for evaluating the semantic similarity of two sentences. The model of choice is a Siamese LSTM neural network.

It consists of two modules:
- a dataset module that handles data preparation and data loading
- a model module that handles model configuration, training, evaluation, and prediction
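
To make the Siamese idea concrete, here is a minimal sketch of such a network in PyTorch. It is not the repository's actual model: the class name `SiameseLSTM`, the layer sizes, and the `exp(-L1 distance)` similarity are assumptions chosen for illustration. The key point is that both sentences pass through the same embedding and LSTM weights, and the two resulting vectors are then compared. The sketch also assumes fixed-length batches and ignores padding.

```python
import torch
import torch.nn as nn


class SiameseLSTM(nn.Module):
    """Hypothetical minimal Siamese LSTM (illustrative, not the repo's model)."""

    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        # One embedding and one LSTM, shared by both input sentences.
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def encode(self, token_ids):
        # token_ids: (batch, seq_len) tensor of integer token indices.
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        # Final hidden state of the last layer: (batch, hidden_dim).
        return hidden[-1]

    def forward(self, left_ids, right_ids):
        left = self.encode(left_ids)
        right = self.encode(right_ids)
        # Assumed similarity: exp of the negative L1 distance,
        # which yields a score between 0 and 1 (1 for identical encodings).
        return torch.exp(-torch.sum(torch.abs(left - right), dim=1))


torch.manual_seed(0)
model = SiameseLSTM(vocab_size=100)
q1 = torch.randint(1, 100, (4, 12))  # 4 sentences, 12 token ids each
q2 = torch.randint(1, 100, (4, 12))
scores = model(q1, q2)               # one similarity score per pair
```

Because the encoder is shared, feeding the same sentence on both sides gives a distance of zero and therefore a score of 1. A training loop would typically compare `scores` to the duplicate labels with a loss such as mean squared error or binary cross-entropy; which loss the repository uses is not stated here.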

The dataset used for this task was downloaded from https://www.kaggle.com/quora/question-pairs-dataset. To simplify the setup, the dataset can be reduced to only 50k examples with a question length of 30 to 50 characters. The modified dataset is included in the repository.
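
The reduction described above could be sketched as follows, using only the Python standard library. This is an illustration under assumptions, not the script used to build the included dataset: the function name `filter_pairs` is hypothetical, the column names `question1`, `question2`, and `is_duplicate` are assumed to match the Kaggle CSV, and the length limit is assumed to apply to both questions in a pair.

```python
import csv
import io


def filter_pairs(rows, min_len=30, max_len=50, limit=50_000):
    """Keep rows where both questions have min_len..max_len characters.

    Hypothetical helper: stops once `limit` rows have been collected.
    """
    kept = []
    for row in rows:
        q1, q2 = row["question1"], row["question2"]
        if min_len <= len(q1) <= max_len and min_len <= len(q2) <= max_len:
            kept.append(row)
            if len(kept) == limit:
                break
    return kept


# Tiny in-memory stand-in for the real CSV file (assumed column names).
sample_csv = io.StringIO(
    "question1,question2,is_duplicate\n"
    "How do I learn Python quickly and well?,"
    "What is the fastest way to learn Python?,1\n"
    "Too short?,What is the fastest way to learn Python?,0\n"
)
subset = filter_pairs(csv.DictReader(sample_csv))
```

With the real file, the `io.StringIO` object would be replaced by an opened CSV file handle, and `subset` could be written back out with `csv.DictWriter`.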