Jagged Sequences Cross Entropy Aggregation #8

knowlen · 2021-04-13T10:38:56Z

I believe event entropy is being computed incorrectly in the simple_lm and tiered_lm models for jagged sequences (e.g., the multi-stream data case). Note that jagged sequences were not used in any of the publications; the option was only included as an experimental feature for follow-up work.

Overview

In the case of jagged arrays, the code appears to compute event entropy as

∑ mask * s

but I think it should be

1/seq_len * ∑ mask * s

where

seq_len: is the true (unpadded) sequence length of this line

mask: is a D dimensional binary vector where every index beyond seq_len is 0
      E.g., [1,1,1,1,0,0]

s: is a D dimensional vector of token level cross entropy scores
   E.g., [0.18, 0.23, 0.08, 0.87, 0.06, 0.18]
                                  ----  ---- unusable, zeroed out by the mask

D: is the maximum sequence length set during training
   E.g., max([len(line) for line in data])

∑: is a summation over a vector
   E.g., sum([0.18, 0.23, 0.08, 0.87, 0.0, 0.0])

Without dividing by the true sequence length, the line loss (and consequently the anomaly score) becomes a function of sequence length, and the batch losses end up on different scales determined by their mean sequence lengths. This appears to be a typo, as reduce_mean is used along the sequence axis when lengths are not jagged.
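To make the length dependence concrete, here is a small NumPy sketch (not the repository's code). It assumes two padded lines with the same per-token cross entropy of 0.5 but different true lengths; the names `token_losses`, `summed`, and `averaged` are illustrative only.

```python
import numpy as np

# Two padded lines (D = 6) with identical per-token loss but different true lengths.
token_losses = np.full((2, 6), 0.5)
mask = np.array([[1, 1, 1, 1, 0, 0],   # seq_len = 4
                 [1, 1, 0, 0, 0, 0]],  # seq_len = 2
                dtype=np.float64)

masked = token_losses * mask            # zero out the padded positions
seq_len = mask.sum(axis=1)              # true length of each line

summed = masked.sum(axis=1)             # current behaviour: sum over tokens
averaged = summed / seq_len             # proposed behaviour: mean over real tokens

print("sum :", summed)
print("mean:", averaged)
```

Under these assumptions the summed loss of the longer line is twice that of the shorter one, even though both lines are equally "surprising" per token, while the length-normalized loss is the same for both.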

Trace:

token losses defined in lm_rnn and batch_softmax_dist_loss
token loss aggregation here

Proposed Fix

Change this line in simple_lm.py and tiered_lm.py

line_losses = tf.reduce_sum(token_losses, axis=1)  # batch_size X 1

to

true_seq_len = tf.reduce_sum(ph_dict['mask'], axis=-1)
line_losses = tf.reduce_sum(token_losses, axis=1) / true_seq_len
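
Two details may be worth checking before applying the change, depending on how the mask placeholder is defined (I have not verified either against the repo): if the mask is stored as an integer or boolean type it needs a cast to float before the division, and a line made up entirely of padding would divide by zero. The NumPy mirror below sketches both guards; `length_normalized_line_losses` is a hypothetical helper name, and in TensorFlow the same steps would map onto `tf.cast` and `tf.maximum`.

```python
import numpy as np

def length_normalized_line_losses(token_losses, mask):
    """Mean token loss per line over real (unmasked) tokens only.

    Hypothetical helper; mirrors the proposed fix with two extra guards.
    """
    token_losses = np.asarray(token_losses, dtype=np.float32)
    mask = np.asarray(mask).astype(np.float32)    # guard 1: int/bool mask -> float
    true_seq_len = mask.sum(axis=-1)
    safe_len = np.maximum(true_seq_len, 1.0)      # guard 2: avoid 0/0 on empty lines
    # Re-applying the mask is harmless if token_losses is already masked.
    return (token_losses * mask).sum(axis=1) / safe_len

token_losses = [[0.5, 0.5, 0.5, 0.5],
                [0.9, 0.1, 0.7, 0.3]]
mask = np.array([[1, 1, 1, 1],
                 [0, 0, 0, 0]], dtype=np.int32)   # second line is all padding

line_losses = length_normalized_line_losses(token_losses, mask)
print(line_losses)
```

With the clamp, an all-padding line gets a loss of 0 instead of NaN; whether such lines can occur at all in the data pipeline is an assumption here.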


knowlen changed the title ~~Bug for jagged-sequences training~~ Jagged Sequences Cross Entropy Aggregation Apr 13, 2021
