
Prompt Interpolation

ljleb edited this page Dec 10, 2023 · 37 revisions

Original wiki text from AUTOMATIC webui wiki

Prompt interpolation makes it possible to interpolate between the embeddings of multiple prompts over time. The syntax is a list of prompts followed by a list of step numbers. Each number indicates the specific step at which the corresponding prompt should be used:

[prompt1:prompt2:prompt3:...:when1,when2,when3,...]

prompt1, prompt2, prompt3 and so on are separate prompts, and when1, when2, when3 and so on are the step numbers at which the corresponding prompt is used. If the current sampling step falls between when1 and when2, for example, then an embedding that is a weighted average of the embeddings of prompt1 and prompt2 is used instead.

A smaller number for whenN has a bigger impact on the structure and placement of all objects in the final image, because it will be sampled earlier. A greater number will have a more fine-grained impact on the whole picture, affecting mainly the specific details and features of the image.

The further away the interpolated embeddings are from each other, the more sampling steps are required to get good results.
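
The weighted average described above can be sketched as a piecewise-linear blend. This is only an illustration, not the extension's actual code: the function name `interpolate_linear` is hypothetical, the embeddings are toy 2-dimensional vectors, and the `whens` are assumed to be already resolved to increasing step positions.

```python
from bisect import bisect_right

def interpolate_linear(embeddings, whens, step):
    """Piecewise-linear blend of embedding vectors at a given step.

    Hypothetical helper: `whens` must be sorted in increasing order and
    have one entry per embedding.
    """
    if step <= whens[0]:
        return list(embeddings[0])
    if step >= whens[-1]:
        return list(embeddings[-1])
    # Index of the last control point located at or before `step`.
    i = bisect_right(whens, step) - 1
    # Fraction of the way from control point i to control point i + 1.
    t = (step - whens[i]) / (whens[i + 1] - whens[i])
    return [(1 - t) * a + t * b for a, b in zip(embeddings[i], embeddings[i + 1])]

# Toy vectors standing in for real prompt embeddings.
e1, e2, e3 = [0.0, 0.0], [1.0, 2.0], [3.0, 2.0]
print(interpolate_linear([e1, e2, e3], [0, 4, 8], 2))  # halfway between e1 and e2
print(interpolate_linear([e1, e2, e3], [0, 4, 8], 6))  # halfway between e2 and e3
```

Before the first control point and after the last one, the sketch simply returns the first or last embedding unchanged.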

Interpolation Functions

When at least 3 prompts are used in the interpolation, one of the following interpolation curves can be used to interpolate between the embeddings of the prompts:

  • linear : interpolate linearly between the embeddings corresponding to each prompt
  • catmull : an artificial embedding is created at each step that follows a catmull-rom curve that passes through all prompts' embeddings
  • bezier : an artificial embedding is created at each step that follows a bezier curve with every prompt embedding used as a control point
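
To make the difference between the curve types concrete, here is a minimal sketch on a single coordinate of an embedding. It assumes a uniform Catmull-Rom segment and a bezier curve evaluated with de Casteljau's algorithm; the parameterization and endpoint handling actually used by the extension may differ, and the function names are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom segment between p1 and p2, for t in [0, 1]."""
    return 0.5 * (
        2 * p1
        + (-p0 + p2) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
        + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3
    )

def bezier(points, t):
    """Bezier curve over all control points (de Casteljau's algorithm)."""
    pts = list(points)
    while len(pts) > 1:
        # Replace each adjacent pair by its linear interpolation at t.
        pts = [(1 - t) * a + t * b for a, b in zip(pts, pts[1:])]
    return pts[0]

# The Catmull-Rom segment starts at p1 and ends at p2 exactly.
print(catmull_rom(0.0, 1.0, 0.0, 2.0, 0.0), catmull_rom(0.0, 1.0, 0.0, 2.0, 1.0))
# The bezier curve is only pulled toward the middle control point.
print(bezier([0.0, 1.0, 0.0], 0.5))
```

In this sketch the catmull curve reaches each control point, while the bezier curve only reaches the first and last ones, which matches the descriptions in the list above.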

By default, the interpolation curve is linear. If you want to use another function, write its name as an additional argument to the interpolation expression.
Here is an example with prompts p1, p2, p3 at steps 0, 1, 9 using catmull as the curve type:

[p1 : p2 : p3 : -1, 0, 8 : catmull]

Please note that step numbers are off by one (a prompt meant for step n is written as n-1) to stay consistent with the existing prompt features of the AUTOMATIC webui.

It is also possible to use step values between 0 and 1 to indicate a ratio of the total number of steps. For example, if the total number of steps is 20, [a : b : .25, .75] will start interpolating at step 5 (inclusive) and stop at step 15 (exclusive).
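
The arithmetic behind the ratio form can be sketched as follows. The helper `resolve_when` is hypothetical, and both the rule for deciding what counts as a ratio (strictly between 0 and 1) and the rounding are assumptions, since the text does not specify them.

```python
def resolve_when(when, total_steps):
    """Turn a `when` value into an absolute step number.

    Assumption: values strictly between 0 and 1 are ratios of the total
    number of steps; anything else is already a step number.
    """
    if 0 < when < 1:
        return int(when * total_steps)
    return int(when)

# [a : b : .25, .75] with 20 sampling steps
print(resolve_when(0.25, 20), resolve_when(0.75, 20))
```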

Using Multiple Interpolations

Using multiple interpolations in the same prompt is possible. There are two ways to achieve this:

Concurrent Interpolations

You can put multiple interpolations side by side:

a supersonic [ rocket : starship : 2, 7 ] leaving behind [ fire : gas : 1, 5 ]

This will create an "interpolatable embedding" that can represent all possible prompts. In this case, the possible prompts are:

  • a supersonic rocket leaving behind fire
  • a supersonic rocket leaving behind gas
  • a supersonic starship leaving behind fire
  • a supersonic starship leaving behind gas

The interpolatable embedding is then sampled once for each step of the sampling process to generate the actual embeddings that will be used to produce the output image. In this case, if the sampler has 10 steps (steps 0 to 9):

[Animation: DesModder_Video_Creator(2)]

You can use as many side-by-side interpolations as you want. Note that as of right now, n concurrent interpolations will generate an exponential number of possible prompts (at least O(2^n) in time and space complexity). This means that using more than 8 or 9 concurrent interpolations will probably hang the sampling process before it even has a chance to start (and so stopping it with the "Interrupt" button will not work). On my i5-12600KF, it took ~20 seconds with 10 concurrent interpolations.
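
The exponential growth can be illustrated by enumerating every combination of alternatives. This is a sketch of the counting argument only; the `expand` helper and the way the prompt is split into parts are hypothetical and do not reflect how the extension parses prompts.

```python
from itertools import product

def expand(parts):
    """List every prompt obtainable from fixed text and alternatives.

    `parts` mixes plain strings (fixed text) and tuples (the prompts of
    one interpolation).
    """
    choices = [p if isinstance(p, tuple) else (p,) for p in parts]
    return [" ".join(combo) for combo in product(*choices)]

parts = ["a supersonic", ("rocket", "starship"), "leaving behind", ("fire", "gas")]
for prompt in expand(parts):
    print(prompt)

# With n two-prompt interpolations side by side, the count is 2 ** n.
ten = [("x", "y")] * 10
print(len(expand(ten)))
```

Each additional two-prompt interpolation doubles the number of prompts to embed, which is why around 10 of them already takes noticeable time.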

Nested Interpolations

You can also nest multiple interpolations:

a supersonic [ rocket : [ starship : massive structure : 0.1, 0.6 ] : 0.1, 0.6 ]

One way to understand this construction is as a linear interpolation where one of the control points is fixed (rocket) and the other is moving ([ starship : massive structure : 0.1, 0.6 ]).

This feature can also be understood as a generalization of a bezier curve, where each control point is arbitrarily chosen. If the points are chosen carefully, then nested interpolations can effectively be made to create a bezier curve.

For example, these prompts are equivalent (except for differences caused by precision errors):

[ a : b : c : , , : bezier]
[ [ a : b : , ]
: [ b : c : , ]
: , ]
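
The equivalence can be checked numerically on a single coordinate. The sketch below compares the nested form (a linear interpolation of two linear interpolations, all driven by the same parameter t) with the standard quadratic bezier formula; the function names are hypothetical and scalars stand in for embeddings.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return (1 - t) * a + t * b

def nested(a, b, c, t):
    """[[a : b : ,] : [b : c : ,] : ,] on one coordinate."""
    return lerp(lerp(a, b, t), lerp(b, c, t), t)

def quadratic_bezier(a, b, c, t):
    """[a : b : c : , , : bezier] on one coordinate (Bernstein form)."""
    return (1 - t) ** 2 * a + 2 * (1 - t) * t * b + t ** 2 * c

a, b, c = 1.0, 5.0, -2.0
for i in range(11):
    t = i / 10
    # Equal up to floating-point rounding, as the text notes.
    assert math.isclose(nested(a, b, c, t), quadratic_bezier(a, b, c, t),
                        rel_tol=1e-9, abs_tol=1e-12)
print("nested and bezier forms agree")
```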

Corner Cases

  • [to:when] - adds "to" to the prompt after a fixed number of steps (when)
  • [from::when] - removes "from" from the prompt after a fixed number of steps (when)
  • [from:to:when] - starts with "from" and switches to "to" after a fixed number of steps (when)
  • [a:b:from,to] - interpolate from a to b starting at step from and stopping at step to
  • [a:b:,when] - same as [a:b:-1,when] ; interpolate from a to b from the beginning of the sampling process to when
  • [a:b:when,] - same as [a:b:when,total] where total is the total number of steps of the sampling process
  • [a:b:,] - same as [a:b:-1,total] ; interpolate during the whole sampling process from a to b
  • [a:b:c:,,] - conceptually equivalent to [a:b:c:-1,(total + 1)/2 - 1,total] ; interpolate during the whole sampling process from a to b to c, where b appears at the middle sampling step
  • [a:b:c:d:from,,,to] - same as [a:b:c:d:from,when_b,when_c,to] where from, when_b, when_c and to are evenly spaced across the sampling schedule
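
The defaults for omitted step values listed above can be sketched like this. The helpers `parse_whens` and `fill_whens` are hypothetical; the sketch assumes that a missing first value becomes -1, a missing last value becomes the total number of steps, and missing values in between are spaced evenly between their nearest specified neighbours.

```python
def parse_whens(text):
    """Split 'from,,,to' into numbers, with None for omitted entries."""
    return [float(s) if s.strip() else None for s in text.split(",")]

def fill_whens(raw, total_steps):
    """Fill omitted entries (None) of a list of step values."""
    whens = list(raw)
    if whens[0] is None:
        whens[0] = -1
    if whens[-1] is None:
        whens[-1] = total_steps
    i = 0
    while i < len(whens):
        if whens[i] is None:
            # Find the end of this run of omitted values.
            j = i
            while whens[j] is None:
                j += 1
            lo, hi = whens[i - 1], whens[j]
            gap = j - (i - 1)
            for k in range(i, j):
                whens[k] = lo + (hi - lo) * (k - (i - 1)) / gap
            i = j
        i += 1
    return whens

print(fill_whens(parse_whens(","), 20))        # [a:b:,]
print(fill_whens(parse_whens(",,"), 20))       # [a:b:c:,,]
print(fill_whens(parse_whens("2,,,14"), 20))   # [a:b:c:d:from,,,to]
```

For [a:b:c:,,] with 20 steps, the middle value comes out as 9.5, which agrees with the (total + 1)/2 - 1 expression given in the list.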

Examples

1. A simple example

Prompt: a [fantasy:cyberpunk:3,7] landscape

  • At start, the model will be drawing a fantasy landscape.
  • After step 3 (so at step 4), the prompt still only includes fantasy.
  • Between step 4 (exclusive) and step 8 (exclusive), the embeddings corresponding to fantasy and cyberpunk are interpolated linearly:
    • At step 5, it will be .75*embeddings('a fantasy landscape') + .25*embeddings('a cyberpunk landscape').
    • At step 6, it will be .50*embeddings('a fantasy landscape') + .50*embeddings('a cyberpunk landscape').
    • At step 7, it will be .25*embeddings('a fantasy landscape') + .75*embeddings('a cyberpunk landscape').
  • After step 7 (so at step 8), it will switch to drawing a cyberpunk landscape, continuing from where it stopped at step 7.
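
The weights in this example can be reproduced with a few lines of arithmetic. This is a sketch that applies the off-by-one convention stated earlier ("after step N" means "at step N + 1"); the function name `cyberpunk_weight` is hypothetical.

```python
def cyberpunk_weight(step, start_when=3, end_when=7):
    """Weight of the cyberpunk embedding for [fantasy:cyberpunk:3,7]."""
    # Off-by-one convention: "after step N" means "at step N + 1".
    start, end = start_when + 1, end_when + 1
    t = (step - start) / (end - start)
    # Clamp so the prompt is pure fantasy before and pure cyberpunk after.
    return min(1.0, max(0.0, t))

for step in range(1, 11):
    w = cyberpunk_weight(step)
    print(f"step {step}: {1 - w:.2f} fantasy + {w:.2f} cyberpunk")
```

Steps 5, 6 and 7 give cyberpunk weights of 0.25, 0.50 and 0.75, matching the bullets above.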

2. Multiple interpolation control points

Prompt:

masterpiece, best quality,
[exploding huge bright fire ball in the night sky
:a magical landscape filled with fairies and dragons
:a mysterious and menacing medieval landscape
:,0,18
:catmull]

(sampler has 20 steps)

  • at start (after step -1 (since it is omitted), so step 0):
    • masterpiece, best quality, exploding huge bright fire ball in the night sky
  • after step 0 (so at step 1):
    • masterpiece, best quality, a magical landscape filled with fairies and dragons
  • between step 2 and step 18 (both included), sample the catmull curve passing through all prompt embeddings from t=2/19 to t=18/19
  • after step 18 (so at step 19):
    • masterpiece, best quality, a mysterious and menacing medieval landscape