From 112214db0d07090bc177877348fb116e4175efa7 Mon Sep 17 00:00:00 2001
From: alabulei1
Date: Wed, 1 Nov 2023 19:59:22 -0700
Subject: [PATCH] fix format

Signed-off-by: alabulei1
---
 docs/develop/rust/wasinn/llm-inference.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/docs/develop/rust/wasinn/llm-inference.md b/docs/develop/rust/wasinn/llm-inference.md
index b7cf3739..6871f89e 100644
--- a/docs/develop/rust/wasinn/llm-inference.md
+++ b/docs/develop/rust/wasinn/llm-inference.md
@@ -96,9 +96,8 @@ You can use environment variables to configure the model execution.
 
 | Option |Default |Function |
 | -------|-----------|----- |
-| |
-LLAMA_LOG| 0 |The backend will print diagnostic information when this value is set to 1|
-|LLAMA_N_CTX |512| The context length is the max number of tokens in the entire conversation|
+| LLAMA_LOG | 0 |The backend will print diagnostic information when this value is set to 1|
+|LLAMA_N_CTX |512| The context length is the max number of tokens in the entire conversation|
 |LLAMA_N_PREDICT |512|The number of tokens to generate in each response from the model|
 
 For example, the following command specifies a context length of 4k tokens, which is standard for llama2, and the max number of tokens in each response to be 1k. It also tells WasmEdge to print out logs and statistics of the model at runtime.
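
The command referenced by the last context line is outside this hunk. As a minimal sketch of how those environment variables can be passed with the WasmEdge CLI's --env flag (the model file llama-2-7b-chat.Q5_K_M.gguf, the llama-chat.wasm module, and the "default" graph alias are illustrative placeholders, not taken from this patch):

  wasmedge --dir .:. \
    --env LLAMA_LOG=1 \
    --env LLAMA_N_CTX=4096 \
    --env LLAMA_N_PREDICT=1024 \
    --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
    llama-chat.wasm default

Here LLAMA_N_CTX=4096 gives the 4k context window, LLAMA_N_PREDICT=1024 caps each response at 1k tokens, and LLAMA_LOG=1 turns on the backend's diagnostic output.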