Releases: ksjae/KoGPT
HF Transformers compatible KoGPT2
Someone asked me to make a Transformers-compatible version of KoGPT2.
Well, here you go.
Sorry, this is not a full version; I couldn't get Transformers to train properly (ugh).
https://nas.ksjit.com:5001/sharing/AcmROoopD
If the link is down, don't hesitate to email me; chances are I don't know it's down.
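If you want a quick sanity check after downloading, here is a minimal loading/generation sketch with HF Transformers. It assumes the archive is extracted to a local directory and includes tokenizer files; the path and prompt below are placeholders, not part of the release.

```python
# Minimal sketch: load the Transformers-compatible KoGPT2 checkpoint and sample text.
# Assumes the download was extracted to ./kogpt2-transformers (hypothetical path)
# and ships tokenizer files that AutoTokenizer can pick up.
from transformers import AutoTokenizer, GPT2LMHeadModel

model_dir = "./kogpt2-transformers"  # adjust to wherever you unpacked the archive
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = GPT2LMHeadModel.from_pretrained(model_dir)

prompt = "오늘 아침"  # example Korean prompt ("this morning")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=64,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```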
Version 0.2 final release
Will not be updated until I get my hands on more data.
Couldn't train the 2.7B model on TPUs because it ran out of memory (OOM).
Available from my server
v0.2 for SNS article generation
Do note that this model was originally intended for book content generation, but experiments show it is better suited to SNS content generation.
Available from my server and Google Drive
v0.2 optimized for plain written text
All releases will be available from my server. If Google Drive is too much hassle (e.g. on a headless Linux box), just wget the files from my server instead.
v0.2-news is not trained on the book dataset (or any dataset that includes books), but on everything else.
Trained on v3-8 TPUs for 5 days (about 205K steps).
My server: here
Google Drive: uploading
Starting with this release, I'm switching to home-brew code, which means the model can no longer be loaded in HF Transformers. I'll make a pull request later this year (lots of paperwork involved).
For demo/generation, check out the train code (ksjae/KoGPT2-train).
1.5B is finally here!
All releases will be available from my server. If Google Drive is too much hassle (e.g. on a headless Linux box), just wget the files from my server instead.
v0.2-beta is trained only on a ~700 MB web corpus, on v3-8 TPUs for 36 hours (about 80K steps).
My server: here
Google Drive: here
Starting with this release, I'm switching to home-brew code, which means the model can no longer be loaded in HF Transformers. I'll make a pull request later this year (lots of paperwork involved).
For demo/generation, check out the train code.
v0.1-beta
KoGPT2-large
KoGPT2-large, trained on:
- Kookmin Uni. Dataset
- Sejong corpus
- Namuwiki dump
Trained for 3 weeks on 2x V100 GPUs.
Sorry for that. Get it from Google Drive.