Skip to content

Prompt Engineering

Nhan Nguyen edited this page Dec 10, 2023 · 7 revisions

Introduction

Six strategies

  1. Write clear instructions

These models can’t read your mind. If outputs are too long, ask for brief replies. If outputs are too simple, ask for expert-level writing. If you dislike the format, demonstrate the format you’d like to see. The less the model has to guess at what you want, the more likely you’ll get it.

  1. Provide reference text

Language models can confidently invent fake answers, especially when asked about esoteric topics or for citations and URLs. In the same way that a sheet of notes can help a student do better on a test, providing reference text to these models can help in answering with fewer fabrications.

  1. Split complex tasks into simpler subtasks

Just as it is good practice in software engineering to decompose a complex system into a set of modular components, the same is true of tasks submitted to a language model. Complex tasks tend to have higher error rates than simpler tasks. Furthermore, complex tasks can often be re-defined as a workflow of simpler tasks in which the outputs of earlier tasks are used to construct the inputs to later tasks.

  1. Give the model time to "think"

If asked to multiply 17 by 28, you might not know it instantly, but can still work it out with time. Similarly, models make more reasoning errors when trying to answer right away, rather than taking time to work out an answer. Asking for a "chain of thought" before an answer can help the model reason its way toward correct answers more reliably.

  1. Use external tools

Compensate for the weaknesses of the model by feeding it the outputs of other tools. For example, a text retrieval system (sometimes called RAG or retrieval augmented generation) can tell the model about relevant documents. A code execution engine like OpenAI's Code Interpreter can help the model do math and run code. If a task can be done more reliably or efficiently by a tool rather than by a language model, offload it to get the best of both.

  1. Test changes systematically

Improving performance is easier if you can measure it. In some cases a modification to a prompt will achieve better performance on a few isolated examples but lead to worse overall performance on a more representative set of examples. Therefore to be sure that a change is net positive to performance it may be necessary to define a comprehensive test suite (also known an as an "eval").

Tactics

Each of the strategies listed above can be instantiated with specific tactics. These tactics are meant to provide ideas for things to try. They are by no means fully comprehensive, and you should feel free to try creative ideas not represented here.

Write clear instructions

Include details in your query to get more relevant answers

Worse Better
Who’s president? Who was the president of Mexico in 2021?

Ask the model to adopt a persona

Role Content
SYSTEM When I ask something, you can answer me angrily.
USER Who was the President of Vietnam in 2019?

Use delimiters to clearly indicate distinct parts of the input

  • Delimiters like triple quotation marks, XML tags, section titles, etc. can help demarcate sections of text to be treated differently.
Role Content
USER Please translate the text delimited by triple quotes into Vietnamese
""" hello """

For straightforward tasks such as these, using delimiters might not make a difference in the output quality. However, the more complex a task is the more important it is to disambiguate task details. Don’t make the model work to understand exactly what you are asking of them.

Provide examples

  • Zero-Shot Prompting
Role Content
USER Generate 10 possible names for my new dog.
  • One-Shot Prompting
Role Content
USER Generate 10 possible names for my new dog.
A dog name that I like is Banana.
  • Few-Shot Prompting
Role Content
USER Generate 10 possible names for my new dog.
Dog names that I like include:
– Banana
– Kiwi
– Pineapple
– Coconut

Providing general instructions that apply to all examples is generally more efficient than demonstrating all permutations of a task by example, but in some cases providing examples may be easier.

For example: Generate 10 possible names for my new dog. I want its name to be a type of fruit.

Specify the steps required to complete a task

Specify the desired length of the output

  • Example for 3 tactics: Specify the steps required to complete a task, Provide examples, Specify the desired length of the output
Role Content
SYSTEM Hãy sử dụng các bước sau để phản hồi lại đầu vào của user.

Bước 1 : bạn sẽ được cung cấp 1 câu hỏi, đầu tiên bạn hãy trả lời câu hỏi của user

Bước 2: bạn hãy dịch câu trả lời ở Bước 1 sang tiếng Anh, và câu trả lời phải bao gồm cả tiếng Việt và tiếng Anh.

Định dạng của câu trả lời ở bước 2 sẽ như thế này:
- Tiếng việt (Vietnamese)
- Hello (Hello)
- Xin chào (Hello)
USER Chat GPT là gì? hãy tóm tắt câu trả lời khoảng 5 từ.

Provide reference text

Instruct the model to answer using a reference text

Instruct the model to answer with citations from a reference text

  • Example:
Role Content
SYSTEM You will be provided with a document delimited by triple quotes and a question. Your task is to answer the question using only the provided document and to cite the passage(s) of the document used to answer the question. If the document does not contain the information needed to answer this question then simply write: "Insufficient information." If an answer to the question is provided, it must be annotated with a citation. Use the following format for to cite relevant passages ({"citation": …}).
USER <insert articles, each delimited by triple quotes>
Question: <insert question here>

Embeddings can be used to implement efficient knowledge retrieval. See the tactic "Use embeddings-based search to implement efficient knowledge retrieval" for more details on how to implement this.

Split complex tasks into simpler subtasks

Use intent classification to identify the most relevant instructions for a user query

For dialogue applications that require very long conversations, summarize or filter previous dialogue

Summarize long documents piecewise and construct a full summary recursively

Give models time to "think"

Instruct the model to work out its own solution before rushing to a conclusion

Use inner monologue or a sequence of queries to hide the model's reasoning process

Ask the model if it missed anything on previous passes

Use external tools

Use embeddings-based search to implement efficient knowledge retrieval

Use code execution to perform more accurate calculations or call external APIs

Give the model access to specific functions

  • function_call - Deprecated. Deprecated in favor of tool_choice.

Test changes systematically

Evaluate model outputs with reference to gold-standard answers

Clone this wiki locally