Fix notebook failure with Keras 3.
PiperOrigin-RevId: 613736720
MarkDaoust authored and tf-text-github-robot committed Mar 12, 2024
1 parent 8d6b5e0 commit 41613bf
Showing 1 changed file with 10 additions and 12 deletions.
22 changes: 10 additions & 12 deletions docs/tutorials/word2vec.ipynb
@@ -86,7 +86,7 @@
"id": "xP00WlaMWBZC"
},
"source": [
"## Skip-gram and negative sampling "
"## Skip-gram and negative sampling"
]
},
{
@@ -95,7 +95,7 @@
"id": "Zr2wjv0bW236"
},
"source": [
"While a bag-of-words model predicts a word given the neighboring context, a skip-gram model predicts the context (or neighbors) of a word, given the word itself. The model is trained on skip-grams, which are n-grams that allow tokens to be skipped (see the diagram below for an example). The context of a word can be represented through a set of skip-gram pairs of `(target_word, context_word)` where `context_word` appears in the neighboring context of `target_word`. "
"While a bag-of-words model predicts a word given the neighboring context, a skip-gram model predicts the context (or neighbors) of a word, given the word itself. The model is trained on skip-grams, which are n-grams that allow tokens to be skipped (see the diagram below for an example). The context of a word can be represented through a set of skip-gram pairs of `(target_word, context_word)` where `context_word` appears in the neighboring context of `target_word`."
]
},
{
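For context, pairs like these can be generated with `tf.keras.preprocessing.sequence.skipgrams`; a minimal sketch, using a made-up integer-encoded sentence and vocabulary size:

```python
import tensorflow as tf

# Toy integer-encoded sentence and vocabulary size, made up for illustration.
example_sequence = [1, 2, 3, 4, 5, 1]
vocab_size = 8
window_size = 2

# Generate positive (target, context) skip-gram pairs within the window.
# negative_samples=0 because negative candidates are drawn separately.
positive_skip_grams, _ = tf.keras.preprocessing.sequence.skipgrams(
    example_sequence,
    vocabulary_size=vocab_size,
    window_size=window_size,
    negative_samples=0)

for target, context in positive_skip_grams[:5]:
  print(f"(target={target}, context={context})")
```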
@@ -189,7 +189,7 @@
"id": "Y5VWYtmFzHkU"
},
"source": [
"The [noise contrastive estimation](https://www.tensorflow.org/api_docs/python/tf/nn/nce_loss) (NCE) loss function is an efficient approximation for a full softmax. With an objective to learn word embeddings instead of modeling the word distribution, the NCE loss can be [simplified](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) to use negative sampling. "
"The [noise contrastive estimation](https://www.tensorflow.org/api_docs/python/tf/nn/nce_loss) (NCE) loss function is an efficient approximation for a full softmax. With an objective to learn word embeddings instead of modeling the word distribution, the NCE loss can be [simplified](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) to use negative sampling."
]
},
{
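One common way to draw the negative samples is `tf.random.log_uniform_candidate_sampler`, which samples candidate context words from an approximately Zipfian distribution over the vocabulary; a sketch with hypothetical sizes:

```python
import tensorflow as tf

num_ns = 4       # negative samples per positive (target, context) pair
vocab_size = 8   # hypothetical vocabulary size
SEED = 42

# The positive context word for one skip-gram, shaped (batch_size, num_true).
context_class = tf.reshape(tf.constant(2, dtype="int64"), (1, 1))

# Draw num_ns candidate negative words; unique=True samples without replacement.
negative_sampling_candidates, _, _ = tf.random.log_uniform_candidate_sampler(
    true_classes=context_class,  # the word treated as the positive class
    num_true=1,                  # one positive context word per example
    num_sampled=num_ns,          # how many negative words to sample
    unique=True,
    range_max=vocab_size,        # sample ids from [0, vocab_size)
    seed=SEED)

print(negative_sampling_candidates)
```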
@@ -447,7 +447,7 @@
"id": "_ua9PkMTISF0"
},
"source": [
"### Negative sampling for one skip-gram "
"### Negative sampling for one skip-gram"
]
},
{
@@ -630,7 +630,7 @@
"id": "iLKwNAczHsKg"
},
"source": [
"### Skip-gram sampling table "
"### Skip-gram sampling table"
]
},
{
@@ -639,7 +639,7 @@
"id": "TUUK3uDtFNFE"
},
"source": [
"A large dataset means larger vocabulary with higher number of more frequent words such as stopwords. Training examples obtained from sampling commonly occurring words (such as `the`, `is`, `on`) don't add much useful information for the model to learn from. [Mikolov et al.](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) suggest subsampling of frequent words as a helpful practice to improve embedding quality. "
"A large dataset means larger vocabulary with higher number of more frequent words such as stopwords. Training examples obtained from sampling commonly occurring words (such as `the`, `is`, `on`) don't add much useful information for the model to learn from. [Mikolov et al.](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) suggest subsampling of frequent words as a helpful practice to improve embedding quality."
]
},
{
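The subsampling described above is typically implemented with `tf.keras.preprocessing.sequence.make_sampling_table`; a small sketch with a hypothetical vocabulary size:

```python
import tensorflow as tf

# Entry i is the probability of keeping the i-th most frequent word when
# sampling skip-grams; very common words (small i) get low keep probabilities.
sampling_table = tf.keras.preprocessing.sequence.make_sampling_table(size=10)
print(sampling_table)

# The table can be passed to skipgrams(..., sampling_table=sampling_table)
# so frequent words are subsampled while building training pairs.
```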
@@ -815,7 +815,7 @@
"id": "sOsbLq8a37dr"
},
"source": [
"Read the text from the file and print the first few lines: "
"Read the text from the file and print the first few lines:"
]
},
{
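A minimal sketch of that step, assuming the corpus has already been downloaded to a local path (`path_to_file` below is a placeholder):

```python
# Placeholder path; in the notebook the text file is downloaded beforehand.
path_to_file = "shakespeare.txt"

with open(path_to_file) as f:
  lines = f.read().splitlines()

for line in lines[:5]:
  print(line)
```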
@@ -1178,11 +1178,9 @@
" super(Word2Vec, self).__init__()\n",
" self.target_embedding = layers.Embedding(vocab_size,\n",
" embedding_dim,\n",
" input_length=1,\n",
" name=\"w2v_embedding\")\n",
" self.context_embedding = layers.Embedding(vocab_size,\n",
" embedding_dim,\n",
" input_length=num_ns+1)\n",
" embedding_dim)\n",
"\n",
" def call(self, pair):\n",
" target, context = pair\n",
@@ -1222,7 +1220,7 @@
" return tf.nn.sigmoid_cross_entropy_with_logits(logits=x_logit, labels=y_true)\n",
"```\n",
"\n",
"It's time to build your model! Instantiate your word2vec class with an embedding dimension of 128 (you could experiment with different values). Compile the model with the `tf.keras.optimizers.Adam` optimizer. "
"It's time to build your model! Instantiate your word2vec class with an embedding dimension of 128 (you could experiment with different values). Compile the model with the `tf.keras.optimizers.Adam` optimizer."
]
},
{
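A sketch of that instantiate-and-compile step, assuming the `Word2Vec` class from the earlier sketch and that `vocab_size` was defined when vectorizing the corpus:

```python
import tensorflow as tf

embedding_dim = 128  # the value suggested in the text above

# vocab_size is assumed to have been set earlier when vectorizing the corpus.
word2vec = Word2Vec(vocab_size, embedding_dim)
word2vec.compile(
    optimizer=tf.keras.optimizers.Adam(),
    # Either the custom sigmoid loss above or a built-in loss works here;
    # categorical cross-entropy over (1 positive + num_ns negative) classes:
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
```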
@@ -1424,8 +1422,8 @@
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "word2vec.ipynb",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
Expand Down
