llama2-to-pairs

  • domain(s): pairs, pretrain
  • accepts: ldc.api.pretrain.PretrainData
  • generates: ldc.api.supervised.pairs.PairData

Converts llama2 pretrain records to prompt/response records. The 'instruction' (i.e., the prompt) is extracted from [INST]...[/INST], and the 'output' (i.e., the response) is the string that follows the [/INST]. Splits on <s> to generate multiple prompt/response records.
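
The conversion described above can be illustrated with a minimal Python sketch. This is not the plugin's actual implementation: the function name `llama2_to_pairs` and the regular expression are hypothetical, and stripping a trailing `</s>` from the response is an assumption made only for this illustration. System prompts (`<<SYS>>` blocks) are not handled.

```python
import re
from typing import List, Tuple

# Hypothetical pattern for illustration: captures the text between
# [INST] and [/INST], and everything that follows [/INST].
INST_PATTERN = re.compile(r"\[INST\](.*?)\[/INST\](.*)", re.DOTALL)


def llama2_to_pairs(text: str) -> List[Tuple[str, str]]:
    """Split a llama2-formatted string into (instruction, output) tuples."""
    pairs = []
    # Each <s> marks the start of a new prompt/response segment.
    for segment in text.split("<s>"):
        match = INST_PATTERN.search(segment)
        if match is None:
            # Skip empty segments or ones without an [INST]...[/INST] block.
            continue
        instruction = match.group(1).strip()
        # Assumption: a closing </s> token is not part of the response.
        output = match.group(2).replace("</s>", "").strip()
        pairs.append((instruction, output))
    return pairs


if __name__ == "__main__":
    record = (
        "<s>[INST] What is 2+2? [/INST] 4 </s>"
        "<s>[INST] Name a color. [/INST] Blue </s>"
    )
    for instruction, output in llama2_to_pairs(record):
        print(repr(instruction), "->", repr(output))
```

A single pretrain record containing two `<s>`-delimited segments therefore yields two prompt/response records.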

usage: llama2-to-pairs [-h] [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                       [-N LOGGER_NAME]

Converts llama2 pretrain records to prompts/response ones. The 'instruction'
(ie prompt) is extracted from [INST]...[/INST] and the 'output' (ie response)
is the string that follows the [/INST]. Splits on <s> to generate multiple
prompt/response records.

optional arguments:
  -h, --help            show this help message and exit
  -l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --logging_level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        The logging level to use. (default: WARN)
  -N LOGGER_NAME, --logger_name LOGGER_NAME
                        The custom name to use for the logger, uses the plugin
                        name by default (default: None)