Ki-Seki/IMDB-review-classification

IMDB-review-classification

This project solves the IMDB review classification problem, a case study from Deep Learning with Python (see section 6.1.3).

The book provides an implementation in Keras. I re-implemented it in PyTorch.

.
├── models                       # Custom models
│   └── SimpleModel.py           # Model introduced by Deep Learning with Python
├── resources                    # Downloaded resources
│   ├── aclImdb
│   └── glove.6B
├── tests                        # Unit tests
│   ├── test_utils_dataset.py
│   ├── test_utils_embedding.py
│   └── test_utils_tokenizer.py
├── utils
│   ├── Dataset.py               # IMDBDataset class
│   ├── Embedding.py             # GloVe embedding class
│   ├── plotting.py              # Plotting metrics history during training
│   ├── training.py              # Train and evaluate loops
│   └── Tokenizer.py             # A simple tokenizer
├── .gitignore
├── LICENSE
├── main.ipynb
├── README.md
├── requirements.txt
└── setup.sh

Usage

  1. Install the dependencies listed in requirements.txt.
  2. Run setup.sh to download the required resources (the IMDB dataset and the GloVe vectors).
  3. Run main.ipynb.
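
Before opening the notebook, it can help to confirm that step 2 actually populated the resources folder. The following is a minimal sketch, not part of the repository: the helper name `missing_resources` is hypothetical, and the two directory names are taken from the project tree above.

```python
from pathlib import Path

# Directory names come from the project tree; adjust if setup.sh changes them.
REQUIRED_DIRS = ("resources/aclImdb", "resources/glove.6B")


def missing_resources(project_root="."):
    """Return the required resource directories that do not exist yet."""
    root = Path(project_root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


missing = missing_resources()
if missing:
    print("Run setup.sh first; missing:", ", ".join(missing))
else:
    print("All resources found.")
```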

What I've learned

  • PyTorch development life-cycle
  • Test-driven development (TDD) practice
  • Tokenizer implementation (PyTorch has no built-in tokenizer as convenient as Keras' tokenizer)
  • IMDB dataset preprocessing
  • GloVe embedding usage
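
To illustrate the tokenizer and GloVe items above, here is a rough sketch of how a word-level tokenizer and a GloVe lookup can fit together. It is not the code in utils/Tokenizer.py or utils/Embedding.py: the names `SimpleTokenizer`, `parse_glove_lines`, and `build_embedding_matrix` are hypothetical, the tokenizer is only loosely modeled on the Keras one, and the two sample vectors are made up. The only format assumption is that each GloVe text line holds a word followed by space-separated floats.

```python
import re
from collections import Counter


class SimpleTokenizer:
    """Hypothetical word-level tokenizer; not the repository's Tokenizer.py."""

    def __init__(self, num_words=None):
        self.num_words = num_words  # keep only the most frequent words if set
        self.word_index = {}        # word -> integer id, ids start at 1

    @staticmethod
    def _split(text):
        # Lowercase, then keep runs of letters, digits, and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def fit_on_texts(self, texts):
        counts = Counter(w for text in texts for w in self._split(text))
        ranked = counts.most_common(self.num_words)
        # Id 0 is left unused so it can serve as a padding value later.
        self.word_index = {w: i for i, (w, _) in enumerate(ranked, start=1)}

    def texts_to_sequences(self, texts):
        # Words never seen during fitting are silently dropped.
        return [
            [self.word_index[w] for w in self._split(text) if w in self.word_index]
            for text in texts
        ]


def parse_glove_lines(lines):
    """Parse lines of the form 'word v1 v2 ... vn' into a dict of float lists."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector for the word with id i; missing words stay zero."""
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        if word in vectors:
            matrix[idx] = list(vectors[word])
    return matrix


tokenizer = SimpleTokenizer(num_words=1000)
tokenizer.fit_on_texts(["A great movie", "A terrible movie"])
sequences = tokenizer.texts_to_sequences(["a great great film"])

# Two made-up 3-dimensional vectors standing in for real GloVe rows.
sample_lines = ["movie 0.1 0.2 0.3", "great 0.4 0.5 0.6"]
vectors = parse_glove_lines(sample_lines)
matrix = build_embedding_matrix(tokenizer.word_index, vectors, dim=3)
```

In a PyTorch model, a matrix like this could be converted with torch.tensor and passed to torch.nn.Embedding.from_pretrained, with padding_idx=0 so that the padding row stays at zero.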

TODO

  • The tokenizer and sequence padding should be separated.
  • The tokenizer should support delegating to an existing tokenization library.
  • Implement some baseline models for comparison.
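
As one possible direction for the first item, padding could live in a small standalone function that knows nothing about the tokenizer. This is a sketch under assumptions, not the repository's code: the name `pad_sequences` and its parameters are hypothetical, and the choice to pad and truncate at the front mirrors the Keras defaults.

```python
def pad_sequences(sequences, maxlen, pad_value=0):
    """Truncate or left-pad each sequence to exactly maxlen items."""
    if maxlen <= 0:
        raise ValueError("maxlen must be positive")
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]  # keep the last maxlen tokens
        padded.append([pad_value] * (maxlen - len(kept)) + kept)
    return padded


batch = pad_sequences([[5, 2], [7, 1, 4, 9, 3]], maxlen=4)
print(batch)  # [[0, 0, 5, 2], [1, 4, 9, 3]]
```

With padding factored out like this, the tokenizer only has to map text to integer ids, and either piece can be unit-tested or replaced on its own.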
