Finds how similar 2 lists of rating are using the Divide and Conquer approach. Extension of MergeSort that actually displays the specific inversions as well as just counting the total number.
Compare ratings of n items between 2 lists. How similar are they (how many inversions)?
1 list is ranked; 1, 2, 3, ... , n
The other list has arbitrary order: a1, a2, a3, ... , an (values in the array at an index)
Inversion: index i
is less then index j
, but the value at i
is greater than the value at j
(i
< j
, but ai > aj)
This example has 6 inversions: (4,1), (4,2), (4,3), (3,1), (3,2), (2,1)
Use the array index as the reference ranking and the array entry as the compared ranking
To find the number of inversions, we can just write out the 2nd list below the 1st lists which is in sorted order & draw lines connecting identical values
Now count the number of times these lines cross (actual inversions are in colored parentheses)
Again, this example has 6 inversions: (4,1), (4,2), (4,3), (3,1), (3,2), (2,1)
If a list has no inversions, it is already sorted and no crossed lines
This reduces down to just comparing the 2nd list to its sorted order
Only requires a slight modification to MergeSort
When merging the 2 sublists, if a value is copied from the right half, then it is greater than all the remaining elements in the left half so the inversion could should increase by the number of elements remaining in the left half
This is a Split Inversion
MergeSort stages:
In the actual algorithm, the left half happens 1st, but I write both divide steps here for simplicity
- In Merge 1, the item from the right half (
3
) is copied 1st so we have an inversion (4,3) - In Merge 2, item from the right half (
1
) is copied 1st so we have the inversion (2,1) - In Merge 3,
1
is < both items in the left half so 2 inversions: (3,1) & 4,1) - Also in Merge 3,
2
is < both items in the left half so 2 more inversions: (3,2) & 4,2)
Divide step is the same, but now count the number of inversions in the left half, right half & split inversions
The Divide stage counts the inversion in the left & right halves, split inversions are counted in the Merge stage
T(n) ≤ T(⌊n/2⌋)+ T(| n/2 |)+ O(n)
⇒ T(n) = O(n log n)
(by solving the same recurrence relation as MergeSort)
- Return types of the classic mergeSort have been changed to
int
since they now return the number of inversions as well as sorting - This Algorithm is instance-based instead of a
static
sort method - The magic happens in the
else
of themergeAndCount()
Main Loop (this main loop copies the smallest item for a sub-array until 1 sub-array runs out)- The
else
is only entered if the item in the left half is NOT smaller than the item in the right half inversionCount += center-leftPos;
incrementsinversionCount
by the number of items remaining in the left half- Then a for loop goes from the current
leftPos
until the end of the left half & adds the inversions to a bookkeeping 2DArrayList
inversions.add(new ArrayList<Integer>(Arrays.asList(array[i], array[rightPos-1])));
array[rightPos-1]
is the item from the right sub-array because a few lines above is:array[ rightPos++ ];
which incrementsrightPos
, so the actual value is 1 less (it's previous value)
- The
- Counting Inversions - Kevin Wayne
- MergeSort Code - Data Structures and Algorithm Analysis in Java (Third Edition) by Mark Allen Weiss
- Part 1: O(n log n) Algorithm for Counting Inversions - Free Engineering Lectures
- Part 2: O(n log n) Algorithm for Counting Inversions - Free Engineering Lectures
- Count Inversions - GeeksForGeeks
Changed specific implementation ofinversionCount
- Counting inversions - toto2 (Stack Overflow)
Helped figure out Left Boundary & test cases