### generate_stories.py

- Generates stories using the model specified in `model_name_or_path`. A `prompt` can be given to seed the stories or left blank. `length` is the number of sentences that are generated.
- Two stories are generated for each prompt: one using logical connectives (specified in `log_connectives`) and one without.
- `no_cuda`, if set to `True`, forces usage of the CPU.
- `seed` can be fixed for reproducibility (see the sketch after this list).
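A minimal sketch of what the `seed` and `no_cuda` parameters typically control; the exact code in the script may differ:

```python
import random

import numpy as np
import torch

seed = 42        # example value; any fixed seed makes runs reproducible
no_cuda = False  # set to True to force CPU usage

random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)

# Use the GPU only if one is available and no_cuda is not set
device = torch.device("cpu" if no_cuda or not torch.cuda.is_available() else "cuda")
if device.type == "cuda":
    torch.cuda.manual_seed_all(seed)
```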
### generate_interactive_story.py

- Generates stories using the model specified in `model_name_or_path`. A `prompt` can be given to seed the story or left blank. `introduction_sentences` is the number of sentences that are generated before the user is asked for input (see the sketch after this list).
- `no_cuda`, if set to `True`, forces usage of the CPU.
- `seed` can be fixed for reproducibility.
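A rough sketch of the interactive flow described above; `generate_sentences` is a hypothetical helper standing in for the script's actual sampling code:

```python
def interactive_story(model, tokenizer, prompt, introduction_sentences):
    # Generate the introduction before asking the user for input
    story = prompt
    for _ in range(introduction_sentences):
        story += generate_sentences(model, tokenizer, story, n=1)  # hypothetical helper
    # Then alternate between user input and model continuations
    while True:
        user_input = input("Your continuation (empty line to stop): ")
        if not user_input:
            break
        story += " " + user_input
        story += generate_sentences(model, tokenizer, story, n=1)
    return story
```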
### wordvectors.py

- Generates word vectors for the words given in `emotions`. In our case we used the emotions specified in `data/emotions.txt`.
- Only words that are encoded as single tokens are considered, since the probabilities of multi-token words are not comparable with those of single-token words (see the sketch after this list).
- Word vectors are then computed for every single entry in `prompts` and saved in a pickle file whose path is specified in line 172.
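A sketch of the single-token filter, assuming the `pytorch_transformers` tokenizer API; the pickle path is a placeholder for the one hard-coded in line 172:

```python
import pickle

from pytorch_transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

with open("data/emotions.txt") as f:
    emotions = [line.strip() for line in f if line.strip()]

# Keep only words that GPT-2 encodes as a single token (encoded with a
# leading space, as words usually appear mid-sentence); multi-token words
# are dropped because their probabilities are not comparable.
single_token_emotions = [w for w in emotions if len(tokenizer.encode(" " + w)) == 1]

# Later, load the computed word vectors ("wordvectors.p" is a placeholder):
with open("wordvectors.p", "rb") as f:
    wordvectors = pickle.load(f)
```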
### nextWordPrediction.py

- Outputs, for a given `context`, the top `numWords` words with their probabilities (see the sketch after this list).
- The model specified in `model_name_or_path` is used.
- Optional: set `filter` to `True` to filter words out. In our case we used the emotions specified in `data/emotions.txt`.
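A sketch of the top-`numWords` lookup with the standard `pytorch_transformers` API; the model name, context, and `numWords` below are example values:

```python
import torch
from pytorch_transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name_or_path = "gpt2"  # example; a converted/fine-tuned model path also works
tokenizer = GPT2Tokenizer.from_pretrained(model_name_or_path)
model = GPT2LMHeadModel.from_pretrained(model_name_or_path)
model.eval()

context = "I opened the door and felt"
numWords = 10

input_ids = torch.tensor([tokenizer.encode(context)])
with torch.no_grad():
    logits = model(input_ids)[0]              # shape: (1, seq_len, vocab_size)
probs = torch.softmax(logits[0, -1], dim=-1)  # distribution over the next token

top_probs, top_ids = torch.topk(probs, numWords)
for p, i in zip(top_probs.tolist(), top_ids.tolist()):
    print(f"{tokenizer.decode([i]).strip()}: {p:.4f}")
```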
### A) Without Docker

- Clone this repository
- Have Python installed
- Install the requirements of the `requirements.txt`
  - This can be done with Anaconda or pip (e.g. `pip install tqdm`). (I used a conda environment `gpt-2` that was a clone of the basic Python env: `conda create --name gpt-2 --clone base`)
- Install pytorch-transformers (`pip install pytorch-transformers`)
  - Note: On the cluster we don't have permission to install packages for all users, but we can install them for our own user: use `pip3 install --user pytorch-transformers`
  - Note 2: I advise using a virtual environment (like conda)
### B) With Docker

More info on how to use Docker here

- Install Docker
- Clone this repository
- Build the Docker image: `docker build --tag=transformers .`
- Run an interactive, detached container: `docker run -it -d transformers`
- To list the running containers: `docker ps` (`-a` also shows stopped containers)
  - To copy files to the running container: `docker cp <folder/file-to-copy> <container-name>:/gpt-2`
  - To copy files from the running container to the host: `docker cp <container-name>:/gpt-2 .`
- To enter the running container: `docker exec -it <container-name> bash` (`exec` needs a command, e.g. `bash`)
### Converting a TensorFlow checkpoint to PyTorch

- Clone the repository https://github.com/huggingface/pytorch-transformers.git
- Enter the repository and run: `pytorch_transformers gpt2 $OPENAI_GPT2_CHECKPOINT_PATH $PYTORCH_DUMP_OUTPUT [OPENAI_GPT2_CONFIG]` (a loading sketch follows after this list)
  - e.g. `python pytorch_transformers gpt2 ..\gpt-2\checkpoint\writingprompts117M ..\nextWordPrediction\models`
  - Note: I needed to remove the `.` before the import in `__main__.py` line 72 to make it work
  - Note 2: Converting a 345M or larger model requires the config file of the model
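After the conversion, the dumped checkpoint can be loaded with `from_pretrained`; the paths below mirror the example above:

```python
from pytorch_transformers import GPT2LMHeadModel, GPT2Tokenizer

# The dump directory contains pytorch_model.bin and config.json; the
# vocabulary still comes from the original gpt2 tokenizer files.
model = GPT2LMHeadModel.from_pretrained("../nextWordPrediction/models")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
```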
### Fine-tuning

Note: It was not tested whether this works, but it requires pytorch-transformers > 1.1.0.

`python run_lm_finetuning.py --train_data_file data\wpdump.txt --output_dir models\117M_wp --model_type gpt2 --model_name_or_path gpt2 --do_train --per_gpu_train_batch_size 4 --save_steps 1000`
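Once training has finished, the model saved in `--output_dir` can be loaded and sampled from. A minimal sketch using plain multinomial sampling (not the script's own decoding):

```python
import torch
from pytorch_transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("models/117M_wp")  # the --output_dir above
model.eval()

generated = tokenizer.encode("The knight drew his sword")
with torch.no_grad():
    for _ in range(50):  # sample 50 continuation tokens
        logits = model(torch.tensor([generated]))[0]
        probs = torch.softmax(logits[0, -1], dim=-1)
        generated.append(torch.multinomial(probs, 1).item())
print(tokenizer.decode(generated))
```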
Please refer to my other Readme for this here