Note
This part is not for beginners; please learn about transformers, scaling laws and SFT first. Basic computer-networking and Python programming knowledge is also required.
Here is the directory description:

| script | desc |
| --- | --- |
| reconstruct_wechat_group.py | reconstruct WeChat group messages |
| reconstruct_filter_annotate.py | filter data with puyu+kimi, manually annotate, manually review |
| reconstruct_check_llm.py | recheck 14B & 32B results |
| convert_to_alpaca.py | convert raw group data to alpaca format |
- Get all WeChat group chats, then run `python3 reconstruct_wechat_group.py` to split them into sessions and filter with an LLM (a session-splitting sketch follows this list).
- Run `python3 reconstruct_filter_annotate.py` to filter, annotate, and manually review. Now you can get `gt.jsonl`.
- Convert `gt.jsonl` to alpaca format with `convert_to_alpaca.py` (a conversion sketch also follows). Finally we have `alpaca.json` for SFT.
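The splitting step cuts a raw message stream into sessions. A common heuristic, shown as a minimal sketch below, is to start a new session whenever the gap between consecutive messages exceeds a threshold; the `timestamp`/`text` fields and the 30-minute gap are illustrative assumptions, not the actual logic of `reconstruct_wechat_group.py`.

```python
# Minimal sketch: split a chronological message list into sessions by time gap.
# ASSUMPTION: each message is {"timestamp": <epoch seconds>, "text": <str>};
# reconstruct_wechat_group.py may use different fields and rules.
from typing import Dict, List


def split_sessions(messages: List[Dict], max_gap_s: int = 1800) -> List[List[Dict]]:
    sessions: List[List[Dict]] = []
    current: List[Dict] = []
    last_ts = None
    for msg in sorted(messages, key=lambda m: m["timestamp"]):
        if last_ts is not None and msg["timestamp"] - last_ts > max_gap_s:
            sessions.append(current)  # gap too long: close the current session
            current = []
        current.append(msg)
        last_ts = msg["timestamp"]
    if current:
        sessions.append(current)
    return sessions
```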
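For the conversion step, the alpaca format is the standard `instruction`/`input`/`output` schema. Below is a minimal sketch of the idea, assuming each `gt.jsonl` line carries hypothetical `question` and `answer` fields; check `convert_to_alpaca.py` for the real field names.

```python
# Minimal sketch: convert gt.jsonl records into alpaca-format SFT data.
# ASSUMPTION: each line of gt.jsonl has "question" and "answer" fields;
# the actual keys used by convert_to_alpaca.py may differ.
import json

samples = []
with open("gt.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        samples.append({
            "instruction": record["question"],  # user query
            "input": "",                        # alpaca allows an empty input
            "output": record["answer"],         # target response
        })

with open("alpaca.json", "w", encoding="utf-8") as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)
```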
Install axolotl, then update your model path and data path in `axolotl_configs`. Let's take `qwen2-lora-0.5B.yaml` as an example:
```yaml
# config paths
base_model: /workspace/models/Qwen1.5-0.5B-Chat
..
datasets:
  - path: /workspace/axolotl/alpaca.json
    type: alpaca
..
output_dir: ./out-qwen0.5
```
Train the model:

```bash
accelerate launch -m axolotl.cli.train examples/qwen/qwen2-lora-0.5B.yaml
```
Fine-tuned LoRA weights can be found on Hugging Face.
Merge the LoRA weights:

```bash
python3 -m axolotl.cli.merge_lora examples/qwen/qwen2-lora-0.5B.yaml
```
Serve the merged Qwen model as an OpenAI-compatible API with vLLM:

```bash
python -m vllm.entrypoints.openai.api_server --served-model-name LoRA-Qwen1.5-0.5B-Chat --model /workspace/axolotl/out-qwen0.5/merged/ --port 29999 --max-model-len 8192
```
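Once the server is up, you can query it with the `openai` Python client. A minimal sketch, assuming the server runs on localhost (substitute your own host and port):

```python
# Minimal sketch: query the vLLM OpenAI-compatible endpoint started above.
# ASSUMPTION: the server is reachable at localhost:29999; adjust as needed.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:29999/v1", api_key="EMPTY")  # vLLM ignores the key
resp = client.chat.completions.create(
    model="LoRA-Qwen1.5-0.5B-Chat",  # must match --served-model-name
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```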
Evaluate precision and F1 score; use your own IP and port in the Python code:

```bash
python3 reconstruct_filter_annotate.py --action metric --llm-type Qwen1.5-0.5B-Chat
```
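For reference, precision and F1 over binary keep/drop predictions can be computed as in the sketch below; this assumes two aligned 0/1 label lists and is not necessarily the exact logic inside `reconstruct_filter_annotate.py`.

```python
# Minimal sketch: precision and F1 for binary predictions vs. ground truth.
# ASSUMPTION: gt and pred are aligned 0/1 lists; the script's own metric
# code may parse LLM outputs differently.
def precision_f1(gt, pred):
    tp = sum(1 for g, p in zip(gt, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gt, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gt, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, f1


print(precision_f1([1, 0, 1, 1], [1, 0, 0, 1]))  # -> (1.0, 0.8)
```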