Let's take an example: "Allie drove to Boston for a meeting."
When I pretrain UCTopic, the model takes input_ids as [0, 50264, 324, 4024, 7, 2278, 13, 10, 529, 4, 2] ("Allie" may also be left unchanged, with the unchanged-token probability) and entity_ids as [2] (the mask token of the entity embedding).
Then, the model computes the contrastive losses using the hidden state of the entity_ids token [2].
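For reference, this is how I reproduce those inputs; a minimal sketch using Hugging Face's LukeTokenizer (I assume UCTopic's preprocessing behaves like the stock tokenizer, and the word-level masking comment is only my reading of the pre-training setup):

```python
from transformers import LukeTokenizer

# Assumption: UCTopic's preprocessing is close to the stock LukeTokenizer.
tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")

text = "Allie drove to Boston for a meeting."
entity_spans = [(0, 5)]  # character span of "Allie"

# When no `entities` argument is given, the tokenizer fills entity_ids with
# the [MASK] entity (id 2 in the LUKE entity vocab).
encoding = tokenizer(text, entity_spans=entity_spans)
print(encoding["input_ids"])   # word-piece ids for the sentence (RoBERTa vocab)
print(encoding["entity_ids"])  # [2] -> the [MASK] entity embedding

# Pre-training (my understanding): the word tokens inside the entity span may
# additionally be replaced by <mask> (id 50264), or kept unchanged with some
# probability, giving input_ids like the example above.
```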
However, the entity embedding from LUKE doesn't cover all entities.
Also, the entity embedding only carries information about entities, not about general noun phrases.
Therefore, I guess that this hidden state is weak when an unseen entity or a general noun phrase is given as input. Is that right? ("Allie" also doesn't appear in LUKE's entity vocab.)
However, when I analyzed your code, the model always takes entity_ids as [2] at the inference phase (clustering or topic mining) as well as at the training phase.
So, just as the CLS token of BERT represents all tokens in a sentence, does the token [2] (the mask token) represent the entity tokens in input_ids?
Also, since the model only uses the mask token from the entity vocab, can the model deal with unseen entities or general noun phrases? (So we don't need to worry about the first question?)
Thank you.
I guess the entity embedding table is only used for the LUKE pre-training stage. In the Hugging Face implementation, we don't use that table; instead, the model uses entity positions to represent entities. Seen or unseen entities are not a problem. For more details, please refer to the Hugging Face docs and the LUKE paper.
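For example, with the Hugging Face LUKE API, the span representation is computed from the [MASK] entity id plus the span's position ids, so no per-entity table lookup is needed (the checkpoint name and span below are just for illustration):

```python
import torch
from transformers import LukeTokenizer, LukeModel

tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeModel.from_pretrained("studio-ousia/luke-base")

# "Allie" is not in LUKE's entity vocab, but that doesn't matter here:
# entity_ids is just the [MASK] entity, and entity_position_ids tells the
# model which word tokens the span covers.
text = "Allie drove to Boston for a meeting."
encoding = tokenizer(text, entity_spans=[(0, 5)], return_tensors="pt")

with torch.no_grad():
    outputs = model(**encoding)

# Contextualized representation of the span, computed from its positions
# rather than looked up in the entity embedding table.
span_repr = outputs.entity_last_hidden_state  # shape: (1, 1, hidden_size)
```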
The token used in your example represents only the entity 'Allie'.
In UCTopic, only entity tokens are masked during pre-training. For other usages like inference, we don't mask any tokens in the sentences. Hence, UCTopic is not restricted to the mask tokens of the entity vocab. The model can deal with unseen entities because of the masking strategy during pre-training.
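Conceptually, that masking step looks roughly like the sketch below; the probability value and the per-span keep/mask decision are placeholders, not the exact UCTopic code:

```python
import random

MASK_WORD_ID = 50264  # RoBERTa <mask> token id
KEEP_PROB = 0.15      # placeholder "unchanged" probability, not the paper's value

def mask_entity_span(input_ids, span_positions, training=True):
    """Pre-training: replace the entity's word tokens with <mask>, except that
    with some probability the span is left unchanged. Inference: no masking."""
    ids = list(input_ids)
    if training and random.random() > KEEP_PROB:
        for pos in span_positions:
            ids[pos] = MASK_WORD_ID
    return ids
```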