pdlite-llm

Environment setup

cmake
gcc
g++
conda

conda create -n llm python=3.8
conda activate llm
pip install -r requirements.txt

Model preparation

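You can convert the original model to ONNX format yourself as needed. This project follows https://github.com/wangzhaode/llm-export and downloads its ONNX model directly; testing was done with the chatglm2-6b model.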

Model conversion

Use x2paddle to convert the ONNX model to Paddle format. The x2paddle source needs a small modification first: in anaconda3/envs/llm/lib/python3.8/site-packages/x2paddle/op_mapper/onnx2paddle/opset_legacy.py, add the following at line 1133:

if axes == [-1]:
    axes = [0]

The root cause is not yet known; for now this simply forces the axis to 0.
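To make the scope of the workaround explicit, here is a minimal standalone sketch of the same check. The function name `remap_axes` is hypothetical and is not part of x2paddle; it only mirrors the two patched lines so the behaviour can be checked in isolation.

```python
def remap_axes(axes):
    """Mirror the workaround: a lone last-axis index [-1] becomes [0].

    Hypothetical helper for illustration only; x2paddle itself has no
    function with this name.
    """
    if axes == [-1]:
        return [0]
    # Every other axes list is passed through unchanged.
    return axes
```

Only the exact list `[-1]` is rewritten; lists such as `[1]` or `[0, -1]` are left untouched, so the patch affects just the operators that reduce or squeeze over the last axis alone.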

Run the script:

bash convert.sh
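The contents of convert.sh are not shown here. As a rough sketch of what an ONNX-to-Paddle conversion step can look like, the snippet below builds the x2paddle command line for an ONNX model. The file names `model.onnx` and `pd_model` are placeholders, and the helper names are hypothetical; the actual script in this repository may work differently.

```python
import subprocess


def build_x2paddle_command(onnx_path, save_dir):
    """Build the x2paddle CLI invocation for an ONNX model (illustrative helper)."""
    return [
        "x2paddle",
        "--framework=onnx",
        f"--model={onnx_path}",
        f"--save_dir={save_dir}",
    ]


def convert(onnx_path, save_dir):
    """Run the conversion; raises CalledProcessError if x2paddle fails."""
    subprocess.run(build_x2paddle_command(onnx_path, save_dir), check=True)
```

Calling `convert("model.onnx", "pd_model")` would require x2paddle to be installed in the active conda environment.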

Model quantization

For now, PaddleSlim dynamic post-training (offline) quantization is used; it has not produced a significant improvement.
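For readers unfamiliar with the technique, dynamic post-training quantization roughly means storing weights as low-bit integers plus a scale, without a calibration dataset. The sketch below shows plain abs-max int8 quantization of a list of floats. It is an illustration of the idea only, not PaddleSlim's implementation, and the function names are assumptions.

```python
def quantize_abs_max(weights, bits=8):
    """Quantize floats to signed integers using a single abs-max scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero when all weights are zero.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]
```

Each recovered weight differs from the original by at most about half a scale step, which is why the accuracy loss is usually small while the stored weights shrink to one byte each.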

Building from source

git clone https://github.com/PaddlePaddle/Paddle-Lite.git

Change line 284 of Paddle-Lite/cmake/generic.cmake to:

target_link_libraries(${TARGET_NAME} ${MKLML_LIB_DIR}/libiomp5.so)

Then run:

bash linux_build.sh

Usage

The path to the converted model is already hard-coded in the source, so just run the binary:

./pdlite-llm

Known issues

After run() finishes, the model's result cannot be returned to the main program as a return value.
Log:

other has not been implemented transform with dtype3 X, dtype0 Out
*** Check failure stack trace: ***

In Paddle-Lite/lite/kernels/host/cast_compute.cc, the data type codes are listed as:

// BOOL = 0;INT16 = 1;INT32 = 2;INT64 = 3;FP16 = 4;FP32 = 5;FP64 = 6;
// SIZE_T = 19;UINT8 = 20;INT8 = 21;

In the log, dtype3 is INT64 and dtype0 is BOOL, so the cause is the cast operator: an int64 input cannot be converted to a bool output.
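The lookup from dtype codes to names can be done mechanically. The following is a small sketch that decodes such a log line using the codes from the comment in cast_compute.cc quoted above; the helper name `decode_cast_error` and the regular expression are assumptions made for illustration, based only on the single log line shown.

```python
import re

# Codes copied from the comment in cast_compute.cc quoted above.
DTYPE_NAMES = {
    0: "BOOL", 1: "INT16", 2: "INT32", 3: "INT64",
    4: "FP16", 5: "FP32", 6: "FP64",
    19: "SIZE_T", 20: "UINT8", 21: "INT8",
}

# Matches the "dtype<N> X, dtype<M> Out" part of the failure message.
_PATTERN = re.compile(r"dtype(\d+) X, dtype(\d+) Out")


def decode_cast_error(line):
    """Return (input_type, output_type) names, or None if the line does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    src, dst = (int(g) for g in match.groups())
    return (
        DTYPE_NAMES.get(src, f"UNKNOWN({src})"),
        DTYPE_NAMES.get(dst, f"UNKNOWN({dst})"),
    )
```

Applied to the log line above, this yields the pair INT64 and BOOL, matching the diagnosis.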
