explanation.txt

Column store indexing improves performance because of the nature of the query in question 2.  It is an analytical query that uses the aggregate function ‘COUNT’, which reads only one attribute of each tuple (the integer stars in this case).  When the tuples are stored row-wise, the stars values are separated from one another by the other attributes of each row, so reading them becomes a costly, scattered access that lacks the spatial locality a sequential access offers.

Assuming that SQL Server stores tuples in a row store scheme by default, this is good for inserting and updating tuples but poor for aggregates.  The column store index optimises the query by linearising the review stars attribute: its values are stored contiguously (decomposed storage), so the aggregate can read them with sequential access and benefit from spatial locality.  This explains the performance improvement in the post-indexing run, because consecutive memory accesses are faster than random accesses.
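
The difference between the two layouts can be illustrated with a minimal Python sketch.  It is not how SQL Server implements either storage scheme; it only contrasts visiting whole tuples with scanning one contiguous array of values.  The sample rows, the field order, and the function names are assumptions made up for the illustration.

```python
from array import array
from collections import Counter

# Hypothetical review tuples: (review_id, user_id, stars, text).
# The values and field order are illustrative, not taken from the real data.
rows = [
    (1, "u1", 5, "great"),
    (2, "u2", 3, "okay"),
    (3, "u3", 5, "loved it"),
    (4, "u1", 1, "poor"),
]


def count_stars_row_store(rows):
    """Row layout: every whole tuple is visited just to read one field."""
    counts = Counter()
    for row in rows:
        counts[row[2]] += 1  # stars is the third field of each tuple
    return counts


# Decomposed (column) layout: the stars values sit next to each other
# in a single compact integer array.
stars_column = array("i", (row[2] for row in rows))


def count_stars_column_store(stars_column):
    """Column layout: one sequential pass over the stars values only."""
    return Counter(stars_column)


row_counts = count_stars_row_store(rows)
col_counts = count_stars_column_store(stars_column)
```

Both functions return the same counts per star rating; the column version simply never touches the other attributes, which is the property the aggregate query benefits from.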