A PyTorch implementation of the BERT-based embedding model for SMILES molecule representation from the paper "Self-Attention Based Molecule Representation for Predicting Drug-Target Interaction" (Shin et al., 2019).
Extract the SMILES column (column 2) of the tab-separated PubChem CID-SMILES file:

awk -F '\t' '{print $2}' CID-SMILES > CID-SMILES.txt
Build the vocabulary and the preprocessed training file:

python preprocess.py -i CID-SMILES.txt -o CID-SMILES_train.txt -v vocab.voc
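The exact preprocessing logic lives in preprocess.py; as a rough sketch of the idea only (the tokenization scheme and file formats below are assumptions, not read from the script), character-level vocabulary building could look like:

from collections import Counter

def tokenize(smiles):
    # Naive character-level split; a real SMILES tokenizer would keep
    # multi-character atoms such as Cl and Br together.
    return list(smiles.strip())

def build_vocab(smiles_path, vocab_path):
    # Count token frequencies over the whole corpus, then write
    # "token<TAB>count" lines, most frequent first (format assumed).
    counts = Counter()
    with open(smiles_path) as f:
        for line in f:
            counts.update(tokenize(line))
    with open(vocab_path, "w") as f:
        for token, n in counts.most_common():
            f.write(f"{token}\t{n}\n")

build_vocab("CID-SMILES.txt", "vocab.voc")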
Pretrain the model for five epochs without class weighting:

python main.py -i CID-SMILES_train.txt -e 5 --lossWeight none -v vocab.voc
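The pretraining objective is BERT-style masked-token recovery over SMILES; a heavily simplified, self-contained sketch of one such training step follows (model sizes, the mask id, and the 15% masking rate are illustrative assumptions, not values from main.py):

import torch
import torch.nn as nn

vocab_size, hidden = 70, 128                      # toy sizes
embed = nn.Embedding(vocab_size, hidden)
layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
head = nn.Linear(hidden, vocab_size)
criterion = nn.CrossEntropyLoss(ignore_index=-100)

tokens = torch.randint(1, vocab_size, (8, 32))    # a batch of token ids
labels = tokens.clone()
mask = torch.rand(tokens.shape) < 0.15            # mask ~15% of positions
tokens[mask] = 0                                  # assume id 0 is [MASK]
labels[~mask] = -100                              # score only masked positions

logits = head(encoder(embed(tokens)))             # (batch, seq, vocab)
loss = criterion(logits.reshape(-1, vocab_size), labels.reshape(-1))
loss.backward()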
usage: main.py [-h] [-i INPUT] [-e EPOCHS] [--lossWeight {none,log,sqrt,raw}]
               [-v VOCAB]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Input training SMILES file.
  -e EPOCHS, --epochs EPOCHS
                        Number of training epochs.
  --lossWeight {none,log,sqrt,raw}
                        The type of class weights for the cross-entropy loss.
  -v VOCAB              Vocabulary file produced by preprocess.py.
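The mapping from these option names to concrete weights is not spelled out here; a plausible sketch (my reading of the names, not taken from this code) of deriving class weights from per-token frequencies and handing them to PyTorch's cross-entropy loss:

import torch
import torch.nn as nn

def class_weights(token_counts, scheme="none"):
    # token_counts: 1-D tensor of per-token-class frequencies.
    counts = token_counts.float().clamp(min=1)
    if scheme == "none":
        return None                        # plain unweighted loss
    if scheme == "raw":
        w = 1.0 / counts                   # inverse frequency
    elif scheme == "sqrt":
        w = 1.0 / counts.sqrt()           # dampened inverse frequency
    elif scheme == "log":
        w = 1.0 / counts.log1p()          # logarithmic dampening
    else:
        raise ValueError(scheme)
    return w * len(w) / w.sum()            # normalize to mean weight 1

counts = torch.tensor([50000, 1200, 30])   # toy per-token frequencies
criterion = nn.CrossEntropyLoss(weight=class_weights(counts, "log"))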
Results without loss weights (--lossWeight none):
- 1 epoch: 0.9471
- 3 epochs: 0.9554