Skip to content

Commit

Permalink
docs: document system prompt design changes (#41)
Browse files Browse the repository at this point in the history
This commit adds documentation for changes to the system prompt design. It includes an Architecture Decision Record (ADR) detailing our proposed changes to the way we store the system prompt as well as instructions for how to properly modify the system prompt with this new design.
  • Loading branch information
MichaelRoytman authored Dec 14, 2023
1 parent 4537003 commit 5b11adb
Show file tree
Hide file tree
Showing 2 changed files with 144 additions and 0 deletions.
96 changes: 96 additions & 0 deletions docs/decisions/0002-system-prompt-design-changes.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,96 @@
2. System Prompt Design Changes
###############################

Status
******

**Accepted** *2023-12-13*

Context
*******

A system prompt in the Learning Assistant context refers to the text-based instructions provided to the large language
model (LLM) that describe the objective and proper behavior of the Learning Assistant. This prompt is provided to the
LLM via 2U's Xpert Platform generic chat completion endpoint as the first set of elements in the ``message_list``
payload key.

Currently, the system prompt is stored in the `CoursePrompt model`_ on a per-course basis. For each course in which the
Learning Assistant is enabled, a system prompt must be stored in the associated database table.

The original intention behind storing the system prompt in this way was to enable an expedited release, a greater degree
of per-course customization, and a flexibility to modify the system prompt quickly in the early stages of the project.

The process of releasing the Learning Assistant to a new course involves either the manual creation of the model
instance via the Django admin or the use of the `set_course_prompts management command`_. The latter requires that a
member of 2U's Site Reliability Engineering (SRE) team runs this command in the proper environment.

The next iteration of the Learning Assistant will enable the integration of unit content into the system prompt to
provide the LLM with more information about the context in which the learner is asking a question. This will require a
change to the system prompt to accommodate the unit content, and will, thus, require manual work on the part of an
engineer to update the existing system prompts and to create new system prompts as we approach a full roll out. This
presents an opportunity to reconsider the way that we store and process the system prompt.

Decision
********

* We will store a system prompt template in a Django setting in the form of a Jinja template.
* We will use Jinja constructs, such as variables and control structures, to implement a single system prompt template
for all courses.
* The system prompt template will be rendered in two steps.

* The first render step will be performed by the `learning-assistant`_ code. This step will interpolate any variables
for which this code has a value (e.g. unit content).
* The second render step will be performed by the 2U Xpert Platform generic chat completion endpoint. This step will
interpolate any variable for which the platform has a value (e.g. course title and skill names).

* After the first render step, the resulting value will be a Jinja template that has been partially rendered. This
template will be sent to 2U's Xpert Platform generic chat completion endpoint to be completely rendered.
* We will remove the `CoursePrompt model`_ following the instructions documented in
`Everything About Database Migrations`_.

Consequences
************

* Changes to the system prompt will require pull requests to the appropriate repository in which the prompt is stored.
* Anyone with access to the appropriate repository in which the prompt is stored can change the system prompt. Members
of the team will no longer require assistance from the SRE team.
* The transition to a system prompt template required the 2U Xpert Platform team to provide support for accepting and
rendering a system prompt template and the integration of Discovery ``skill_names`` into their index.
* The use of a system prompt template will require cross-team collaboration to ensure that the same variable names are
use in the system prompt template, in the `learning-assistant`_ code, and the 2U Xpert Platform generic chat
completion endpoint.
* Because the system prompt template will be stored in a Django setting in a private repository, and the code that
renders the template is stored here, changes to the template will require careful coordination to ensure that the
template is rendered properly.

* For example, if a new variable is added to the template, the code must be modified and deployed in advanced of
changes to the template. Otherwise, if changes to template are deployed before the related code changes, then the
rendered template will contain uninterpolated variables.

* It will become more difficult to enable per-course customizations because all courses will be served by a single
system prompt template.

Rejected Alternatives
*********************

* Status Quo

* The main alternative to this change is to continue to use manual entry and the `set_course_prompts management command`_
to manage system prompts.
* To enable unit content integration, we would need to modify the JSON string stored in the `CoursePrompt model`_ to
store a JSON string with format string variable for the content, which would be interpolated at runtime.
* The main advantage of this alternative is that it will require less engineering work.
* The main disadvantage of this alternative is that it will become challenging to manage which prompt a course should
use. For example, to do a stage released of unit content integration, we would be required to manage different
templates manually. Later, a full release would require additional changes. This would make management of the
templates even more tedious.
* Managing system prompts in a full release would become intractable.
* Additionally, in order to better operationalize the use of the management command and to reduce our reliance on the
SRE team, we would likely want to invest time in setting up a Jenkins job to run the management command ad-hoc. If
we will be investing engineering resources in this area anyway, we felt it was a more future-proof approach to
pursue the solution described above.

.. _set_course_prompts management command: https://github.com/edx/learning-assistant/blob/main/learning_assistant/management/commands/set_course_prompts.py
.. _CoursePrompt model: https://github.com/edx/learning-assistant/blob/34604a0775f7bd79adb465e0ca51c7759197bfa9/learning_assistant/models.py
.. _Everything About Database Migrations: https://openedx.atlassian.net/wiki/spaces/AC/pages/23003228/Everything+About+Database+Migrations#EverythingAboutDatabaseMigrations-Howtodropatable
.. _learning-assistant: https://github.com/edx/learning-assistant
48 changes: 48 additions & 0 deletions docs/modifying-system-prompt-template.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
Modifying the System Prompt Template
####################################

Context
*******

Because the system prompt template will be stored in a Django setting in a private repository, and the code that
renders the template is stored here, changes to the template will require careful coordination to ensure that the
template is rendered properly.

For example, if a new variable is added to the template, the code must be modified and deployed in advanced of
changes to the template. Otherwise, if changes to template are deployed before the related code changes, then the
rendered template will contain uninterpolated variables.

This document describes how to properly modify the system prompt template and the code that renders it. These steps
are only required when your change introduces a new dependency between the template and the code. For example, the
introductin of a new variable introduces a new dependency, because the code must provide the value for this variable
when the template is rendered for the variable to be interpolated properly. Additionally, renaming a variable or
removing a variable would also require you to follow these steps. On the other hand, using an existing variable in a new
way or changing static text in the template would not require you to follow these steps, because these changes would not
require a related change to the code.

Adding to the Template
**********************

If you are adding to the template, then you must follow these steps.

#. Modify the code to supply the correct values to the function that renders the template.
#. Merge and deploy the code changes.
#. Modify the template.
#. Merge and deploy the template changes.

Removing From the Template
**************************

If you are removing from the template, then you must follow these steps.

#. Modify the template.
#. Merge and deploy the template changes.
#. Modify the code to supply the correct values to the function that renders the template.
#. Merge and deploy the code changes.

Adding to and Removing From the Template
****************************************

Combination changes will require that the changes are divided into additions and removals. Divide your changes into
additions and removals and follow the above steps for adding to the template and removing from the template,
respectively.

0 comments on commit 5b11adb

Please sign in to comment.