From 373a33480c8587c8d84eaae3957f2df13f20e4bb Mon Sep 17 00:00:00 2001
From: Harish Subramony <81822986+hsubramony@users.noreply.github.com>
Date: Mon, 23 Sep 2024 09:26:21 -0700
Subject: [PATCH 1/2] recommend jemalloc for gpt-neox-20b 8x

modifications to gpt-neox-20b documentation
recommendation to use jemalloc instead of tcmalloc for this model execution
---
 examples/language-modeling/README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/examples/language-modeling/README.md b/examples/language-modeling/README.md
index 57cac19713..5e1b9e7fb4 100644
--- a/examples/language-modeling/README.md
+++ b/examples/language-modeling/README.md
@@ -137,6 +137,8 @@ The following command triggers the fine-tuning of [GPT-NeoX-20B](https://hugging
 
 Fine-tuning on 16 HPU cards (2 Gaudi2 nodes) takes around 9 minutes with a batch size of 32 (2 per device). It reaches a perplexity of 10.469.
 
+**Note:** For the GPT-NeoX-20B model, switch to jemalloc if you encounter host out-of-memory (OOM) issues: `export LD_PRELOAD=/libjemalloc.so.2`
+
 > Please refer to [this page](https://github.com/huggingface/optimum-habana/tree/main/examples/multi-node-training) for performing multi-node training properly.
 
 ```bash

From e9b83305a3a442125d4d3d831ac5f610051bb959 Mon Sep 17 00:00:00 2001
From: Harish Subramony <81822986+hsubramony@users.noreply.github.com>
Date: Wed, 25 Sep 2024 08:19:09 -0700
Subject: [PATCH 2/2] review changes

---
 examples/language-modeling/README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/examples/language-modeling/README.md b/examples/language-modeling/README.md
index 5e1b9e7fb4..776c89c913 100644
--- a/examples/language-modeling/README.md
+++ b/examples/language-modeling/README.md
@@ -137,7 +137,8 @@ The following command triggers the fine-tuning of [GPT-NeoX-20B](https://hugging
 
 Fine-tuning on 16 HPU cards (2 Gaudi2 nodes) takes around 9 minutes with a batch size of 32 (2 per device). It reaches a perplexity of 10.469.
 
-**Note:** For the GPT-NeoX-20B model, switch to jemalloc if you encounter host out-of-memory (OOM) issues: `export LD_PRELOAD=/libjemalloc.so.2`
+> [!NOTE]
+> For the GPT-NeoX-20B model, switch to jemalloc if you encounter host out-of-memory (OOM) issues: `export LD_PRELOAD=/libjemalloc.so.2`
 
 > Please refer to [this page](https://github.com/huggingface/optimum-habana/tree/main/examples/multi-node-training) for performing multi-node training properly.
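As a usage sketch of the note added by this patch (the jemalloc path shown below is an illustrative assumption; the install location varies by distribution, so use whatever the `find` step reports on your machine):

```bash
# Locate jemalloc on the system; the path varies by distribution
# (e.g. /usr/lib/x86_64-linux-gnu/ on Ubuntu, /usr/lib64/ on RHEL).
find / -name "libjemalloc.so.2" 2>/dev/null

# Preload jemalloc so it replaces the default allocator for the
# training process and everything it spawns (assumed Ubuntu path;
# substitute the one found above).
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

# Sanity check: with LD_PRELOAD set, jemalloc should appear in the
# memory map of any newly launched process before you start fine-tuning.
python -c "print(open('/proc/self/maps').read())" | grep jemalloc
```

Because `LD_PRELOAD` is inherited by child processes, exporting it once in the shell is enough to cover the launcher and all worker processes of the fine-tuning run.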