The de facto tuning for almost all contemporary Western music is 12-tone equal temperament. It is a good compromise given that retuning an instrument in real time during a performance, or even between pieces in a single sitting, is tedious and time-consuming. The most famous example is probably the piano, whose tuning is a profession in its own right.
We believe that 12-tone equal temperament is not necessarily the best tuning system for every piece when retuning is easy, as it is in electronic music. InTune aims to compute a better tuning tailored to a given musical piece.
We made two main assumptions. First, an interval is unimportant to the listener if its notes are sufficiently far apart in time. Second, the total cost is a weighted average of squared differences between the desired and realized sizes of each interval. Both assumptions serve to reduce the computational burden.
Here is the algorithm:
- Assign a unique variable to each note instance in the score.
- Write down the total cost and construct the linear system from its partial derivatives.
- Use SciPy's function for solving $Ax=b$ for a band matrix $A$.
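The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not InTune's actual implementation: the names `JUST_RATIOS`, `desired_cents` and `tune` are hypothetical, the neighborhood is taken to be a fixed number of following notes in onset order, the weight `1 / (j - i)` is an arbitrary choice, and a small anchor term toward 12-tone equal temperament is added so that the system is not singular. `scipy.linalg.solveh_banded` is used here because the resulting matrix is symmetric positive definite; the source only says that a SciPy band solver is used.

```python
import numpy as np
from scipy.linalg import solveh_banded

# Hypothetical table (an assumption, not InTune's): semitone class -> just ratio.
JUST_RATIOS = {0: 1 / 1, 3: 6 / 5, 4: 5 / 4, 5: 4 / 3, 7: 3 / 2, 8: 8 / 5, 9: 5 / 3}


def desired_cents(semitones):
    """Desired size in cents of an interval given in 12-TET semitones.

    Returns None for interval classes that are not in JUST_RATIOS.
    """
    sign = 1 if semitones >= 0 else -1
    octaves, cls = divmod(abs(semitones), 12)
    if cls not in JUST_RATIOS:
        return None
    return sign * (1200.0 * octaves + 1200.0 * np.log2(JUST_RATIOS[cls]))


def tune(midi_pitches, neighborhood=2, anchor_weight=1e-3):
    """Return one pitch in cents per note instance (notes sorted by onset).

    Minimizes sum w_ij * (x_j - x_i - d_ij)^2 over neighboring pairs plus
    anchor_weight * (x_i - 12-TET pitch of note i)^2 for every note.
    """
    n = len(midi_pitches)
    u = neighborhood                      # number of upper diagonals of A
    tet = 100.0 * np.asarray(midi_pitches, dtype=float)

    # Upper banded storage: ab[u + i - j, j] holds A[i, j] for i <= j.
    ab = np.zeros((u + 1, n))
    b = np.zeros(n)

    # Anchor term (assumption): keeps A positive definite.
    ab[u, :] = anchor_weight
    b += anchor_weight * tet

    for i in range(n):
        for j in range(i + 1, min(i + u, n - 1) + 1):
            d = desired_cents(midi_pitches[j] - midi_pitches[i])
            if d is None:
                continue                  # no preference for this interval
            w = 1.0 / (j - i)             # hypothetical weight, decays with distance
            ab[u, i] += w                 # A[i, i]
            ab[u, j] += w                 # A[j, j]
            ab[u + i - j, j] -= w         # A[i, j]
            b[j] += w * d
            b[i] -= w * d

    return solveh_banded(ab, b)


if __name__ == "__main__":
    # C, E, G played in sequence: the result approaches a pure 4:5:6 triad.
    x = tune([60, 64, 67])
    print(x - x[0])
```

Because only `neighborhood + 1` diagonals are stored, memory and solve time grow linearly with the number of note instances for a fixed neighborhood size.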
Let us sort all notes in ascending order of onset time. Let
Differentiating the loss function we get
We get
Since neighborhood is a symmetric relation we have
Note that
Now we can write our problem in the form of a matrix equation
One would typically choose a neighborhood size below 100, so given that a whole score typically consists of thousands of note instances, the matrix $A$ is a narrow band matrix.
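The derivation outlined above can be written out as follows. This is a sketch under assumed notation, which may differ from the symbols InTune uses: $x_i$ is the pitch (for example in cents) assigned to note $i$, $N(i)$ is the set of notes in the neighborhood of note $i$, $w_{ij} = w_{ji} \ge 0$ is the weight of the interval between notes $i$ and $j$, and $d_{ij} = -d_{ji}$ is the desired size of the interval from note $i$ to note $j$. The normalization that turns the weighted sum into a weighted average is omitted because it does not change the minimizer.

$$
L(x) = \sum_{i} \sum_{\substack{j \in N(i) \\ j > i}} w_{ij} \left( x_j - x_i - d_{ij} \right)^2
$$

Setting the partial derivative with respect to each $x_k$ to zero, and using the symmetry of the neighborhood relation so that every pair containing $k$ appears exactly once:

$$
\frac{\partial L}{\partial x_k} = 2 \sum_{j \in N(k)} w_{kj} \left( x_k - x_j - d_{jk} \right) = 0
$$

$$
\Big( \sum_{j \in N(k)} w_{kj} \Big) x_k - \sum_{j \in N(k)} w_{kj} \, x_j = \sum_{j \in N(k)} w_{kj} \, d_{jk}
$$

This is row $k$ of $Ax=b$ with

$$
A_{kk} = \sum_{j \in N(k)} w_{kj}, \qquad
A_{kj} = \begin{cases} -w_{kj} & j \in N(k) \\ 0 & \text{otherwise} \end{cases} \quad (j \neq k), \qquad
b_k = \sum_{j \in N(k)} w_{kj} \, d_{jk}
$$

If the notes are sorted by onset and every neighbor of note $k$ lies within $m$ index positions of it, then $A_{kj} = 0$ whenever $|k - j| > m$, which is the band structure. Under these assumptions each row of $A$ sums to zero, so the solution is only determined up to a common shift of all pitches; some additional term, such as pinning one note, is needed to make the system uniquely solvable.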
Our experiments suggest that the key pitch should not change throughout a piece. For now, we try to keep it fixed by keeping the neighborhood size large and assigning a very high cost to unison deviation when both notes are the tonic. This, however, increases the computational cost dramatically. The key is estimated with the Krumhansl-Schmuckler algorithm. In the future, we plan to address this issue in a better way.
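For readers unfamiliar with it, the Krumhansl-Schmuckler key estimation mentioned above can be sketched as follows. This is a minimal illustration, not InTune's code: the function name `estimate_key` and its input format (MIDI pitch numbers plus durations) are assumptions. The idea is to correlate a duration-weighted pitch-class histogram with the Krumhansl-Kessler major and minor profiles rotated to each of the 12 possible tonics, and to pick the best match.

```python
import numpy as np

# Krumhansl-Kessler probe-tone profiles, indexed by semitones above the tonic.
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                  2.54, 4.75, 3.98, 2.69, 3.34, 3.17])


def estimate_key(midi_pitches, durations):
    """Return (tonic pitch class 0..11, "major" or "minor").

    Hypothetical interface: one MIDI pitch and one duration per note instance.
    """
    # Total sounding duration of each pitch class (0 = C, 1 = C#, ...).
    hist = np.zeros(12)
    for pitch, duration in zip(midi_pitches, durations):
        hist[pitch % 12] += duration

    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # np.roll moves the tonic weight profile[0] to index `tonic`.
            r = np.corrcoef(hist, np.roll(profile, tonic))[0, 1]
            if best is None or r > best[0]:
                best = (r, tonic, mode)
    return best[1], best[2]


if __name__ == "__main__":
    # An ascending C major scale with a long final C.
    pitches = [60, 62, 64, 65, 67, 69, 71, 72]
    durations = [1, 1, 1, 1, 1, 1, 1, 4]
    print(estimate_key(pitches, durations))
```

The returned tonic pitch class could then be used to decide which note instances count as the tonic when assigning the high unison cost described above.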
The first assumption seems to be roughly correct, except that when both notes are the tonic, their pitch should not change, or should change only very slowly, over time.
The second assumption is more problematic: it uses a squared loss to model a listener's preference. Our experiments suggest that an ideal loss curve should grow faster than a quadratic function of the absolute error. An extreme case would be to impose hard inequality constraints, which would reduce to a linear programming problem. We tried that approach too, but it did not yield any significant improvement (of major thirds in particular). Addressing this problem would require a serious rewrite of the program, since a more advanced optimization algorithm is needed.
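To illustrate what a loss that grows faster than quadratic would change, the sketch below minimizes a weighted sum of $|error|^p$ with a general-purpose optimizer. It is not part of InTune and makes several assumptions: the names `fit` and `max_residual` are hypothetical, the first note is pinned to remove the free overall shift, and `scipy.optimize.minimize` with BFGS stands in for whatever more advanced algorithm a real rewrite would use. The example is an augmented triad (C, E, G sharp), where the two pure major thirds and the pure minor sixth cannot all be satisfied at once.

```python
import numpy as np
from scipy.optimize import minimize


def fit(pairs, x0, p):
    """Minimize sum of w * |x[j] - x[i] - d|**p over pairs (i, j, d, w).

    x0 is the starting tuning in cents; note 0 stays pinned at x0[0].
    """
    x0 = np.asarray(x0, dtype=float)

    def unpack(free):
        # Rebuild the full pitch vector from the free (non-pinned) variables.
        return np.concatenate(([x0[0]], free))

    def loss_and_grad(free):
        x = unpack(free)
        grad = np.zeros_like(x)
        total = 0.0
        for i, j, d, w in pairs:
            r = x[j] - x[i] - d
            total += w * abs(r) ** p
            dr = w * p * abs(r) ** (p - 1) * np.sign(r)
            grad[j] += dr
            grad[i] -= dr
        return total, grad[1:]            # drop the pinned note's gradient

    result = minimize(loss_and_grad, x0[1:], jac=True, method="BFGS")
    return unpack(result.x)


def max_residual(pairs, x):
    """Largest absolute deviation from a desired interval, in cents."""
    return max(abs(x[j] - x[i] - d) for i, j, d, _ in pairs)


if __name__ == "__main__":
    # (i, j, desired cents, weight): two major thirds and a minor sixth.
    pairs = [(0, 1, 386.31, 1.0), (1, 2, 386.31, 1.0), (0, 2, 813.69, 4.0)]
    start = [0.0, 400.0, 800.0]           # 12-tone equal temperament
    for p in (2, 4):
        x = fit(pairs, start, p)
        print(p, np.round(x, 2), round(max_residual(pairs, x), 2))
```

In this toy case the fourth-power loss spreads the unavoidable error more evenly, so its worst interval is closer to pure than with the squared loss. The price is that the problem is no longer a single banded linear solve.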
We felt that certain classical pieces that emphasize the sweetness of major thirds (such as the theme of the first movement of Mozart's Piano Sonata No. 11 in A major) sound better with major thirds closer to the pure 5/4 ratio. The second plot shows histograms of all simultaneously sounding fifth, fourth and major third intervals.
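For reference, the distance between the 12-tone equal temperament intervals and their pure counterparts can be computed directly. The short script below is an illustration only (the helper names `cents` and `deviation_from_tet` are hypothetical); it uses the standard definition of the cent, 1200 times the base-2 logarithm of the frequency ratio.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


def deviation_from_tet(ratio, semitones):
    """How many cents the 12-TET interval is wider than the pure ratio."""
    return 100.0 * semitones - cents(ratio)


if __name__ == "__main__":
    intervals = [
        ("major third", 5 / 4, 4),
        ("perfect fourth", 4 / 3, 5),
        ("perfect fifth", 3 / 2, 7),
    ]
    for name, ratio, semitones in intervals:
        print(f"{name}: pure {cents(ratio):.2f} cents, "
              f"12-TET deviation {deviation_from_tet(ratio, semitones):+.2f} cents")
```

The equal-tempered major third is roughly 14 cents wider than 5/4, while fourths and fifths are off by only about 2 cents, which is why major thirds have the most to gain from retuning.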