[DOCS] Port LLM output (#25564)
### Details:
 - *item1*
 - *...*

### Tickets:
 - *ticket-id*

tsavina authored Jul 15, 2024
1 parent cd4afd6 commit 09c1c02
Showing 1 changed file with 8 additions and 7 deletions.
@@ -30,7 +30,8 @@ will not work with these instructions, make sure to

.. code-block:: python
-   optimum-cli export openvino --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --weight-format int4 --trust-remote-code
+   optimum-cli export openvino --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --weight-format int4 --trust-remote-code "TinyLlama-1.1B-Chat-v1.0"
2. Perform generation using the new GenAI API:
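A minimal sketch of this generation step, assuming the model was exported to a local ``TinyLlama-1.1B-Chat-v1.0`` directory by the command above (device and prompt are illustrative):

.. code-block:: python

    import openvino_genai as ov_genai

    # Directory produced by the optimum-cli export step above.
    model_path = "TinyLlama-1.1B-Chat-v1.0"

    # Create the pipeline on CPU and run a single generation request.
    pipe = ov_genai.LLMPipeline(model_path, "CPU")
    print(pipe.generate("The Sun is yellow because", max_new_tokens=100))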

@@ -82,7 +83,7 @@ below, where a lambda function outputs words to the console immediately upon gen
import openvino_genai as ov_genai
pipe = ov_genai.LLMPipeline(model_path, "CPU")
streamer = lambda x: print(x, end='', flush=True)
pipe.generate("The Sun is yellow because", streamer=streamer)
@@ -97,8 +98,8 @@ below, where a lambda function outputs words to the console immediately upon gen
std::string model_path = argv[1];
ov::genai::LLMPipeline pipe(model_path, "CPU");
auto streamer = [](std::string word) {
std::cout << word << std::flush;
// Return flag indicating whether generation should be stopped.
// false means continue generation.
return false;
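For comparison, a Python sketch of the same stop-flag idea, assuming the callable streamer may return a boolean (``True`` requests an early stop); the stop condition and model path are illustrative:

.. code-block:: python

    import openvino_genai as ov_genai

    model_path = "TinyLlama-1.1B-Chat-v1.0"  # illustrative path
    pipe = ov_genai.LLMPipeline(model_path, "CPU")

    def streamer(subword: str) -> bool:
        print(subword, end='', flush=True)
        # Returning True asks the pipeline to stop generating early;
        # False (or None) means continue.
        return '\n' in subword

    pipe.generate("The Sun is yellow because", streamer=streamer, max_new_tokens=100)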
@@ -143,8 +144,8 @@ You can also create your custom streamer for more sophisticated processing:
class CustomStreamer: public StreamerBase {
public:
bool put(int64_t token) {
bool stop_flag = false;
/*
custom decoding/tokens processing code
tokens_cache.push_back(token);
std::string text = m_tokenizer.decode(tokens_cache);
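A rough Python counterpart of the custom streamer, assuming ``StreamerBase`` with ``put()``/``end()`` is also exposed through ``openvino_genai``; the buffering and printing logic here is only a sketch:

.. code-block:: python

    import openvino_genai as ov_genai

    class CustomStreamer(ov_genai.StreamerBase):
        def __init__(self, tokenizer):
            super().__init__()
            self.tokenizer = tokenizer
            self.tokens_cache = []

        def put(self, token: int) -> bool:
            # Custom decoding/token processing: cache the token and re-decode.
            self.tokens_cache.append(token)
            text = self.tokenizer.decode(self.tokens_cache)
            print(text, end='\r', flush=True)
            # False means generation should continue.
            return False

        def end(self):
            # Called once generation is finished; flush the last line.
            print()

    model_path = "TinyLlama-1.1B-Chat-v1.0"  # illustrative path
    pipe = ov_genai.LLMPipeline(model_path, "CPU")
    pipe.generate("The Sun is yellow because",
                  streamer=CustomStreamer(pipe.get_tokenizer()),
                  max_new_tokens=100)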
@@ -205,7 +206,7 @@ mark a conversation session, as you can see in these simple examples:
std::string model_path = argv[1];
ov::genai::LLMPipeline pipe(model_path, "CPU");
ov::genai::GenerationConfig config = pipe.get_generation_config();
config.max_new_tokens = 100;
pipe.set_generation_config(config);
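The same configuration round-trip in Python, assuming ``get_generation_config``/``set_generation_config`` mirror the C++ API shown above (model path is illustrative):

.. code-block:: python

    import openvino_genai as ov_genai

    model_path = "TinyLlama-1.1B-Chat-v1.0"  # illustrative path
    pipe = ov_genai.LLMPipeline(model_path, "CPU")

    # Read the current config, adjust it, and write it back to the pipeline.
    config = pipe.get_generation_config()
    config.max_new_tokens = 100
    pipe.set_generation_config(config)

    print(pipe.generate("The Sun is yellow because"))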
