Pipeline Overview
Here, we are collecting an overview of the processing pipeline and data types across the whole cookbook project. This is just a sketch for now, but I'm uploading it already so people can see what I'm doing. I'll also add some examples and Wiki pages for the different types of recipe graphs (action graphs, etc.).
%%{init: { 'securityLevel':'loose', 'startOnLoad': 'true' } }%%
graph TD;
corpus[L'20 corpus] -- input --> tp[Tagger & Parser];
tp -- input --> alignment[Alignment Model];
tp -- action graphs --> crowd[Crowdsourcing];
alignment -- top k predictions --> crowd;
crowd -- training data --> alignment;
crowd -- ARA --> alignment;
yama[Y'20 corpus] -- training data --> tp;
alignment -.-> generation;
style generation stroke-dasharray: 5 5;
Repository: processing_microsoft_corpus
graph TD;
A[ ] -- 'common_crawl_text_recipes.json' --> process['process_raw_json_corpus.py'];
B[ ] -- 'recipe_pairwise_alignment' --> original['original_recipe_groupings.py'];
original -- 'original_grouping_of_recipes.json' --> process;
process -- dir structure with raw data --> j2t['recipe_json2tagger.py'];
C[ ] -- 'ara2urls_mappings.json' --> j2t;
j2t -.-> |tagger input files| tagger[Tagger];
style tagger stroke-dasharray: 5 5;
Steps
All recipes of the Microsoft corpus with raw text instructions split up into individual sentences.
Source: Download at
File format: json file with one "entry" for each dish which contains "entries" for all the recipes of that dish (at least one recipe per dish).
Example:
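A guess at the structure described above (all field names and values are invented for illustration; the real file may differ):

```
{
  "pancakes": [
    {"url": "http://example.com/recipe1", "text": ["Mix flour and milk.", "Fry until golden."]},
    {"url": "http://example.com/recipe2", "text": ["Whisk the eggs.", "Add sugar."]}
  ]
}
```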
Part of the Microsoft corpus; relevant for its file structure, i.e. for the information about train-dev-test splits and how recipes are grouped into dishes.
Source: download corpus as per these instructions; the relevant data is located at multimodal-aligned-recipe-corpus\recipe-pairwise-alignment
Example:
Read in the structural information in the recipe-pairwise-alignment directory.
Run: python original_recipe_groupings.py after adjusting CORPUS_PATH to point to recipe-pairwise-alignment.
Output: json file original_grouping_of_recipes.json with one entry for each recipe in the recipe-pairwise-alignment corpus that specifies to which dish and which data split the recipe belongs.
Example:
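A guess at what one entry might look like, based on the description above (field names and values invented for illustration):

```
"recipe_0001": {"dish": "pancakes", "split": "train"}
```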
Use structural information and raw corpus to create directory structure with train-dev-test splits and dishes.
Example prompt: python process_raw_json_corpus.py --raw common_crawl_text_recipes.json --grouping original_grouping_of_recipes.json --out raw-structured
Output: Creates directory raw-structured with one sub-directory per data split. Each data split has one sub-directory per dish containing exactly one file. Each line in this file is a json object corresponding to one recipe, with the same information as in common_crawl_text_recipes.json.
Example file structure:
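A guess at the resulting layout, based on the description above (dish and file names invented):

```
raw-structured/
    train/
        pancakes/
            pancakes.json
        ...
    dev/
        ...
    test/
        ...
```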
Example dish file:
Create files that can be used as tagger input. Either uses recipe IDs from ARA (the file ara2urls_mappings.json needs to be in the same directory as the script) or assigns random new integer IDs in order to distinguish between recipes of the same dish.
Example prompt: python recipe_json2tagger_input.py --source raw-structured --target tagger-input (optional: --ara True)
Output: Creates directory tagger-input with one file for each recipe (no sub-directories!). Each file has one 'sentence' json object per line.
Example file:
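A sketch of what such a file might look like (the sentence text is invented; the 'sentence' key is the one described above):

```
{"sentence": "Preheat the oven to 180 degrees."}
{"sentence": "Mix flour, sugar and butter in a bowl."}
```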
Repository: tagger-parser
graph TD;
A[ ] -- train.conll03 --> trained[tagger];
B[ ] -- bert-large-eng.jsonnet --> trained;
C[ ] -- train.conllu --> parser;
D[ ] -- parser.jsonnet --> parser;
trained -.-> |prediction pipeline| parser;
parser -.-> |predicted action graphs| am[Alignment Model];
parser -.-> |predicted action graphs| cr[Crowdsourcing];
style am stroke-dasharray: 5 5;
style cr stroke-dasharray: 5 5;
Steps
Set up the environment from requirements.txt or use the environment on the Coli servers at /proj/cookbook.shadow/conda/envs/allennlp2.8.
The recommended tagger (as well as additional trained taggers) trained with AllenNLP 2.8 can be found on the Coli servers at /proj/cookbook.shadow/Models_2022.05/bert-large-tagger.
To train a new tagger with AllenNLP 2.8, navigate to the dev branch of the repository.
Input:
- training data can be found at data/English/Tagger/train.conll03 or on the Coli servers at /proj/cookbook.shadow/Yamakata/Data/yamakata_train_240.conll03.
- configuration file; e.g. tagger/bert-large-eng.jsonnet.
Example prompt: allennlp train bert-large-eng.jsonnet -s trained-tagger after adapting the paths in the configuration file.
Output: creates a folder (e.g. trained-tagger) containing the trained model, training logs, etc.
The recommended parser trained with AllenNLP 2.8 can be found on the Coli servers at /proj/cookbook.shadow/Models_2022.05/parser.
To train a new parser with AllenNLP 2.8, navigate to the dev branch of the repository.
Input:
- training data can be found at data/English/Parser/train.conllu or on the Coli servers at /proj/cookbook.shadow/Yamakata/Data/yamakata_train_240.conllu.
- configuration file parser/parser.jsonnet.
Example prompt: allennlp train parser.jsonnet -s trained-parser after adapting the paths in the configuration file.
Output: creates a folder (e.g. trained-parser) containing the trained model, training logs, etc.
Repository: tagger-parser
For our paper, we used the AllenNLP 0.8 versions of the tagger and parser (main branch on GitHub); we are currently using the AllenNLP 2.8 versions (dev branch). For both, there are trained models ready to use on the Coli servers.
graph TD;
A[ ] --> |input.json| tagger[Trained Tagger];
tagger --> |tagged_recipe.json| 2conll[json_to_conll.py];
2conll --> |tagged_recipe.conllu| parser[Trained Parser];
parser --> |parsed_recipe.json| conll[json_to_conll.py];
conll --> |parsed_recipe.conllu| red[reduce_graph.py];
red -.-> |action_graph.conllu| alignment[Alignment Model];
red -.-> |action_graph.conllu| crowd[Crowdsourcing];
red -.-> |fat_graph.conllu| B[ ];
style alignment stroke-dasharray: 5 5;
style crowd stroke-dasharray: 5 5;
style B stroke-dasharray: 5 5;
Steps
You can find trained models here:
- AllenNLP 0.8 tagger (the one we used for the paper): /proj/cookbook.shadow/Models/yamakata_eng_elmo_300_0
- AllenNLP 2.8 tagger (most recent): /proj/cookbook.shadow/Models_2022.05/bert-large-tagger/model.tar.gz
- AllenNLP 0.8 parser (the one we used for the paper): /proj/cookbook.shadow/Models/yamakata_deps_300_0.tar.gz
- AllenNLP 2.8 parser (most recent): /proj/cookbook.shadow/Models_2022.05/parser/model.tar.gz
The tagger input is typically one recipe per file. Input files are either
a) a file with json 'sentence' objects containing the raw text of one recipe sentence.
Example:
or
b) a file in CoNLL-2003 format (which is also the output format of the tagger post-processing): a tsv file whose first column contains the tokens of the recipe text (one token per line). [empty line between sentences?] Note: the recipe text needs to be tokenized if you want to use this option.
Example:
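A sketch of such an input file (tokens invented; separating sentences with an empty line is an assumption based on standard CoNLL-2003 conventions):

```
Preheat
the
oven
.

Mix
the
flour
.
```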
Run conda: source /proj/cookbook.shadow/run_conda.sh
Activate conda environment: conda activate /proj/cookbook.shadow/conda/envs/allennlp2.8
Run tagger: allennlp predict /proj/cookbook.shadow/Models_2022.05/bert-large-tagger/model.tar.gz [input file as described above] --output-file tagged_recipe.json (use the --use-dataset-reader flag if the input is in CoNLL-2003 format)
Output: json file with one line per sentence. Tokens and predicted tags are stored under the respective keywords.
Example:
Convert tagger output into CoNLL-U format.
Example prompt: python data-scripts/json_to_conll.py -m tagger -p tagged_recipe.json
(see data-scripts/README.md for additional arguments)
CoNLL-U file (e.g. tagged_recipe.conllu) for one recipe: tab-separated values, first column ID, second column TOKEN, fourth column predicted TAG, seventh column (HEAD) with default value 0, eighth column (DEPREL) with default value root.
Example:
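A sketch of what such a file might look like (tokens and tag labels are invented for illustration; underscores mark unused columns):

```
1	Preheat	_	B-Ac	_	_	0	root	_	_
2	the	_	O	_	_	0	root	_	_
3	oven	_	B-T	_	_	0	root	_	_
4	.	_	O	_	_	0	root	_	_
```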
- data-scripts/error_analysis.py: creates files for manual error analysis
Input: CoNLL-U file (output of the tagger post-processing).
Run conda: source /proj/cookbook.shadow/run_conda.sh
Activate conda environment: conda activate /proj/cookbook.shadow/conda/envs/allennlp2.8
Run parser: allennlp predict /proj/cookbook.shadow/Models_2022.05/parser/model.tar.gz tagged_recipe.conllu --output-file parsed_recipe.json --use-dataset-reader
Output: json file with one line per sentence. Tokens and predicted tags are stored under the respective keywords.
Example:
Convert parser output into a recipe graph (CoNLL-U format).
Run: python data-scripts/json_to_conll.py -m parser -p parsed_recipe.json
(see data-scripts/README.md for additional arguments)
CoNLL-U file (e.g. parsed_recipe.conllu) containing a recipe graph: tab-separated values, first column ID, second column TOKEN, fourth column TAG, seventh column predicted HEAD, eighth column predicted RELATION type to head, additional (HEAD, RELATION) pairs in the ninth column if applicable.
Example:
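A sketch of one sentence of such a file (tokens, tags and relation labels are all invented for illustration):

```
1	Preheat	_	B-Ac	_	_	0	root	_	_
2	the	_	O	_	_	3	dep	_	_
3	oven	_	B-T	_	_	1	t	_	_
```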
- data-scripts/error_analysis.py: creates files for manual error analysis
- data-scripts/parser_evaluation.py: performs labelled evaluation on parser outputs
For some tasks, we need simpler graphs than the full recipe graphs generated by the parser. We reduce the set of relevant tags, thus deleting some nodes from the original graph (e.g. only action nodes for action graphs, only food, action and tool nodes for FAT graphs). There is an edge (directed but not labelled) between each pair of nodes where there was a path between these nodes in the original graph.
Run: python data-scripts/reduce_graph.py -m ['a' or 'fat'] -f parsed_recipe.conllu
Creates CoNLL-U file with the simplified graph at the default output path data/dishname/<recipe_name>.conllu. All edge labels are 'edge'.
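A minimal Python sketch of the reduction idea described above, assuming a simple (id -> tag, edge set) graph representation; this is an illustration only, not the actual reduce_graph.py (which may differ in details, e.g. by only connecting kept nodes whose path avoids other kept nodes):

```python
# Sketch of the reduction: keep only nodes with relevant tags and add an
# unlabelled directed edge between two kept nodes iff the original graph
# contained a path between them.

def reduce_graph(nodes, edges, keep_tags):
    """nodes: dict node_id -> tag; edges: set of (source, target) pairs."""
    adj = {}
    for src, tgt in edges:
        adj.setdefault(src, set()).add(tgt)

    def reachable(start):
        """All nodes reachable from start via directed edges."""
        seen, stack = set(), [start]
        while stack:
            for nxt in adj.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    kept = {n for n, tag in nodes.items() if tag in keep_tags}
    reduced_edges = {(u, v) for u in kept for v in reachable(u) & kept}
    return kept, reduced_edges

# Toy example: action graph mode ('a') keeps only action nodes;
# FAT mode would keep food, action and tool tags instead.
kept, edges = reduce_graph(
    nodes={1: "Ac", 2: "F", 3: "Ac"},   # invented toy recipe graph
    edges={(1, 2), (2, 3)},
    keep_tags={"Ac"},
)
print(kept, edges)  # {1, 3} {(1, 3)}
```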
- data-scripts/reduce_dir_to_action_graphs.py: calls data-scripts/reduce_graph.py for all files in a directory.
Repository: alignment-models
! Under construction - no details for now, changes ahead !
Steps
Input: Folder structure where the main directory, called recipes-for-training, contains one sub-directory for each dish; each dish directory contains
- a subdirectory recipes with action graphs in .conllu format
- a file alignments.tsv with gold alignments in the four columns file1, token1, file2, token2.
Activate or create conda environment (/proj/cookbook.shadow/conda/envs/alignment).
Run: python main.py Alignment-with-feature
! Not implemented yet !
Input:
- Folder structure where the main directory, called recipes-input-model, contains one sub-directory for each dish; each dish directory contains
  - a subdirectory recipes with action graphs in .conllu format
  - a dummy alignment file alignments.tsv to provide recipe pairing information (won't elaborate further as this is currently changing).
- Trained alignment model (soon to be found on the Coli servers).
- Empty target directory results1.
Activate or create conda environment (/proj/cookbook.shadow/conda/envs/alignment).
Run: something like python main.py Alignment-with-feature -k 1
Output: Probably one alignment file per dish, called predicted_alignments_<dishname>.tsv, with all the predicted alignments between recipe pairs of that dish.
Repository: crowdsourcing
For each action in a source recipe, use alignment model to predict top k (we've been using k=7) possible aligned actions in a target recipe. Create lists to display in crowdsourcing where the workers can choose between these k options and "None".
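A toy sketch of the option-list idea (function name and data invented for illustration; the actual conllu2crowd_topk.py does much more, e.g. formatting slides):

```python
# Toy sketch: for each action in the source recipe, keep the top-k
# predicted target actions and append "None" as a final option.
def build_options(ranked_predictions, k=7):
    """ranked_predictions: dict source_action -> ranked target actions."""
    return {source: candidates[:k] + ["None"]
            for source, candidates in ranked_predictions.items()}

options = build_options({"fry the onions": ["saute onions", "cook onions", "brown onions"]})
print(options)
# {'fry the onions': ['saute onions', 'cook onions', 'brown onions', 'None']}
```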
graph TD;
A[Alignment Model] -.-> |predictions.tsv| 2crowd[conllu2crowd_topk.py];
B[Parser] -.-> |CoNLL-U recipe graphs| 2crowd;
2crowd --> |Lists| lingo[LingoTurk];
C[ ] --> |round3-instructions| lingo;
lingo --> |results.csv| post[postprocess_crowdsourcing.py];
post --> |results.tsv| stats[extract_annotations_and_stats.py];
stats -.-> |alignments.tsv| A;
stats -.-> |plots and statistics| plot[ ];
post --> |results.tsv| r2[generate_non_majority_questionnaire.py];
2crowd --> |Lists| r2;
r2 --> |List| lingo;
style A stroke-dasharray: 5 5;
style plot stroke-dasharray: 5 5;
style B stroke-dasharray: 5 5;
Steps
Input: A dir structure, e.g. crowdsourcing-input, with one sub-directory per dish containing
- one file predictions.tsv per dish with the top k alignment predictions
- a recipes sub-directory with recipe graphs in individual CoNLL-U format files.
Example dir structure:
Example prediction file:
Pair up recipes from longest to shortest (within each dish) and create lists with formatted experiment slides. Actions in the target recipes are coloured and indexed. In the source recipes, actions are bolded and only the (up to two) experiment items per slide are coloured.
Example prompt: python topk-alignments/conllu2crowd_topk.py crowdsourcing-input --target-indexed
Output: creates directory Lists with all the lists in tsv format. A list has one slide per row and the following columns: recipe1, recipe2, question1, question2, options1, options2, indices1, indices2, q1_id, q2_id, documentid1, documentid2 (the document ID of recipe2), dish_name, slideid.
Example list:
Upload slides to LingoTurk and publish the experiment.
- Go to https://multitude.coli.uni-saarland.de:8080/ and log in (log-in data on the crowdsourcing Wiki).
- Click create experiment, then click CookBookPPZhai20201220.
- Copy the instructions (instructions/round2-instructions) and upload the lists. Click save in database.
- In view existing experiments, find the experiment and click publish.
Run experiment.
Results: Experiment results come in csv files with the following columns: filename, listnumber, assignmentid, hitid, workerid, origin, timestamp, partid, questionid, answer, recipe1, recipe2, question1, options1, question2, options2, indices1, indices2, q1, q2, documentid1, documentid2, dishname, id.
Column names might be swapped; use Post_Processing_maynotwork.ipynb to fix the data.
Input: Prolific result files.
Example prompt: python postprocess_crowdsourcing.py results.csv
Output: for each input csv file in input, creates a corresponding tsv file with the following columns: worker ID, question ID, question token ID, question doc ID, answer token ID, answer doc ID, numanswers.
Input: post-processed crowdsourcing results
Example prompt: python extract_annotations_and_stats.py results.tsv
Output: printed to command line and displayed in plots.
(Additional statistics might be computed with Post_Processing_maynotwork.ipynb and action_distributions.py.)
We haven't found a script for this, but the class AnswerExtractor implements a method data_to_majority(dataset) that may be used.
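A guess at what such a majority-vote step might look like (hypothetical sketch; the actual data_to_majority(dataset) signature and data layout may well differ):

```python
from collections import Counter

# Hypothetical sketch of a majority vote over crowd answers.
# dataset: list of (question_id, answer) pairs collected from all workers.
def data_to_majority(dataset):
    answers = {}
    for question_id, answer in dataset:
        answers.setdefault(question_id, []).append(answer)
    majority = {}
    for question_id, votes in answers.items():
        top_answer, count = Counter(votes).most_common(1)[0]
        # accept an answer only if more than half of the workers agree
        majority[question_id] = top_answer if count > len(votes) / 2 else None
    return majority

print(data_to_majority([("q1", "3"), ("q1", "3"), ("q1", "None"),
                        ("q2", "5"), ("q2", "7")]))
# {'q1': '3', 'q2': None}
```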
If there are too many questions without a majority vote, a follow-up round of experiments needs to be conducted.
Input: lists and post-processed result files.
Example prompt: python generate_non_majority_questionnaire.py results.tsv after adapting QUESTIONNAIRE_LOCATION in the script to point to the lists used for the experiment.
Output: a list file missing_majority_questionnaire.tsv suitable as experiment input.