
Generation of code-mixed Hinglish sentences from English using LSTM, mT5 and BART models.


kolubex/Conditional_Code-mixed_Generation


Intro to NLP Project


Contributors:

Aryan Chandramania | Lakshmipathi Balaji | Akshit Kumar


Code

The code is written in Python and lives in the code folder.


.
├── code
│   ├── data_analysis.py
│   ├── Indicbart.py
│   ├── LSTM.py
│   └── mT5.py
├── data
│   ├── dev.txt
│   ├── test.txt
│   ├── train_pre_analysis.txt
│   └── train.txt
├── Presentation.pdf
├── README.md
└── Report.pdf

Data

Google Drive

LinCE


Saved Model Checkpoints

Same Google Drive link as above


Running

You can run the Python scripts directly. The scripts install any libraries they need and download the pretrained models themselves. All you need is an internet connection and enough compute power.

The input format for training is an English sentence and its Hinglish counterpart, separated by a tab:

<english> \t <hinglish>

The output for testing is the generated Hinglish sentence.
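
As a minimal sketch of reading the tab-separated format described above, the snippet below splits each line into an (english, hinglish) pair. The function name parse_pairs, the choice to skip blank or malformed lines, and the sample sentence are assumptions made for illustration; they are not taken from the scripts in the code folder.

```python
from typing import Iterable, Iterator, Tuple


def parse_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (english, hinglish) pairs from lines of '<english>\\t<hinglish>'."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # assumption: lines without exactly one tab are skipped
        english, hinglish = (part.strip() for part in parts)
        yield english, hinglish


# Hypothetical sample data, not taken from the repository's data files.
sample = [
    "I am going home\tMain ghar ja raha hoon\n",
    "\n",
    "no tab on this line\n",
]
pairs = list(parse_pairs(sample))
print(pairs)
```

To read a real file, an open file handle can be passed in place of the sample list, for example one opened on data/train.txt (assuming the file is UTF-8 encoded).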
