bert

Files obtained from the original BERT (TensorFlow) repository.

Annotated with named shapes and compacted using tensor shorthand operators.

  • all dimension variables declared once
  • shorthand notation (TSN) and warp used extensively

Benefits:

  • Several cryptic shape-wrangling functions (reshape_from_matrix, reshape_to_matrix, transpose_for_scores) turn into convenient, lucid one-liners
  • The flow of shapes becomes far more apparent in the code (courtesy of both the shape annotations and the TSN arguments to warp)
  • Dimension sizes no longer need to be copied around as arguments (use get_dim_vars to look them up by name at any location)
  • Annotating exposed inconsistencies between documented and runtime shapes, as well as duplicate definitions, in the original code
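
To make the first and third points concrete, here is a minimal sketch in plain NumPy of how a helper like transpose_for_scores can collapse to a one-liner once dimension names are declared in one place. It does not use the annotation library's actual API: the DIMS registry and the shape_of helper are hypothetical stand-ins for declaring dimension variables once and looking them up by name, and the sizes are made up for illustration.

```python
import numpy as np

# Hypothetical registry (not the library's API): dimension names declared once.
# B = batch, T = sequence length, N = attention heads, H = size per head.
DIMS = {"B": 2, "T": 5, "N": 4, "H": 8}

def shape_of(spec: str) -> tuple:
    """Turn a shorthand such as 'B*T,N*H' into a concrete shape tuple."""
    shape = []
    for term in spec.split(","):
        size = 1
        for name in term.strip().split("*"):
            size *= DIMS[name.strip()]
        shape.append(size)
    return tuple(shape)

def transpose_for_scores(x: np.ndarray) -> np.ndarray:
    # x: (B*T, N*H) -> (B, T, N, H) -> (B, N, T, H)
    return x.reshape(shape_of("B,T,N,H")).transpose(0, 2, 1, 3)

# Build a (B*T, N*H) matrix and reshape it for attention scores.
flat_shape = shape_of("B*T,N*H")
x = np.arange(np.prod(flat_shape)).reshape(flat_shape)
scores_input = transpose_for_scores(x)
```

Because the sizes are resolved by name inside shape_of, transpose_for_scores needs no batch_size, seq_length, or num_attention_heads arguments, and the comment on its body documents the full shape flow.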

Code size is reduced throughout. For example, the attention_layer function shrinks from ~200 to ~175 lines.

The code can be simplified and cleaned up further.


The TSA-annotated PyTorch version of BERT is available in a separate repository here.