Extracting Prompts from Customized Large Language Models

This repository contains the source code for the paper Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models (arXiv:2408.02416).

Source code explanations

PEAD dataset: extractingPrompt/instructions/benchmark_collections/OVERALL_DATA_BENCHMARK.json
Source code of all experiments: extractingPrompt/
- Generalized evaluation
  - Vanilla: extractingPrompt/1.run_prompt_extraction.py
  - Function calling comparison: extractingPrompt/5.funcall_comparison.py
- Scaling laws of prompt extraction
  - Model size: extractingPrompt/2.model_size_prompt_extraction_experiments.py
  - Sequence length: extractingPrompt/4.varying_sequence_length.py
- Empirical analysis
  - Convincing Premise: extractingPrompt/6.ppl_comparison.py
  - Parallel-translation: extractingPrompt/7.attention_visualize.py
  - Parallel-translation: extractingPrompt/attention_visualize.py
- Defense strategies
  - Defense methods: extractingPrompt/defending/ppl_high2_confusingBeginnings.py
  - Performance drop experiments for the defenses: extractingPrompt/defending/2.drops_of_defending.py
  - Visualization: extractingPrompt/defending/defense_visualization.py
- Close-AI experiments
  - Vanilla prompt extraction: extractingPrompt/api_related_experiments/1.run_prompt_extraction.py
  - Soft extraction: extractingPrompt/api_related_experiments/2.soft_extraction_experiments.py
  - Performance drops under defense: extractingPrompt/api_related_experiments/3.1.drops_of_defense.py
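
As a starting point for working with the PEAD dataset file listed above, the following is a minimal sketch for inspecting it. The internal schema of OVERALL_DATA_BENCHMARK.json is not documented here, so the helper (summarize_json, a hypothetical name) assumes nothing beyond the file being valid JSON and only reports the top-level type, the number of entries, and a small preview.

```python
import json
from pathlib import Path

# Path as listed in this README, relative to the repository root.
PEAD_PATH = Path(
    "extractingPrompt/instructions/benchmark_collections/OVERALL_DATA_BENCHMARK.json"
)


def summarize_json(path):
    """Return (top-level type name, entry count, small preview) for a JSON file.

    Hypothetical helper: it does not assume any particular schema.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    if isinstance(data, dict):
        # For a mapping, preview the first few keys.
        return "dict", len(data), list(data.keys())[:5]
    if isinstance(data, list):
        # For a sequence, preview the first two items.
        return "list", len(data), data[:2]
    # Scalars (string, number, bool, null) count as a single entry.
    return type(data).__name__, 1, [data]


if __name__ == "__main__":
    if PEAD_PATH.exists():
        kind, size, preview = summarize_json(PEAD_PATH)
        print(f"top-level type: {kind}, entries: {size}")
        print("preview:", preview)
    else:
        print(f"{PEAD_PATH} not found; run this from the repository root.")
```

Running it from the repository root prints the shape of the file, which is usually enough to decide how to iterate over the prompts.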

Experimental environments

Run

pip install -r re.txt

or install the following key packages manually:

datasets
numpy
pandas
peft
safetensors
scipy
tensorboard
tensorboardX
tiktoken
tokenizers
torch
tqdm
transformers
matplotlib
scikit-learn
thefuzz
einops
sentencepiece
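
After installing, it can help to confirm that the key packages listed above are importable in the active environment. The sketch below is not part of the repository; missing_packages is a hypothetical helper, and the mapping from pip names to import names is an assumption based on each package's usual module name (for example, scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec

# pip package name -> import (module) name. The module names are assumptions
# based on common usage; most match the pip name, scikit-learn does not.
KEY_PACKAGES = {
    "datasets": "datasets",
    "numpy": "numpy",
    "pandas": "pandas",
    "peft": "peft",
    "safetensors": "safetensors",
    "scipy": "scipy",
    "tensorboard": "tensorboard",
    "tensorboardX": "tensorboardX",
    "tiktoken": "tiktoken",
    "tokenizers": "tokenizers",
    "torch": "torch",
    "tqdm": "tqdm",
    "transformers": "transformers",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "thefuzz": "thefuzz",
    "einops": "einops",
    "sentencepiece": "sentencepiece",
}


def missing_packages(packages):
    """Return the sorted pip names whose module cannot be found."""
    return sorted(
        pip_name
        for pip_name, module_name in packages.items()
        if find_spec(module_name) is None
    )


if __name__ == "__main__":
    missing = missing_packages(KEY_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All key packages are importable.")
```

This only checks that each module can be located, not that its version matches what the experiments were run with.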

Contact the authors

Feel free to open an issue or send an email to zi1415926.liang@connect.polyu.hk if you run into any problems.

Citation:

@misc{liang2024promptsleakedunravelingprompt,
      title={Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models}, 
      author={Zi Liang and Haibo Hu and Qingqing Ye and Yaxin Xiao and Haoyang Li},
      year={2024},
      eprint={2408.02416},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.02416}, 
}

Repository: liangzid/PromptExtractionEval
