Describe at a high level the solution you have in mind
For the recommendation algorithm, I would:
Step 1: Determine the user's favourite genres based on the music the user has listened to, and possibly also take into account the genres of the music listened to by their first-degree followees, but with less weight.
Step 2: To generate the list of recommendations, traverse the list of songs and assign a score to each song based on which genres/tags it has and what "weight" each of those genres was assigned in step 1.
Step 3: If I do not include the followees' genre preferences in step 1, I would instead try to include the music the followees listened to when assigning scores in step 2.
Step 4: While traversing the list, maintain a sorted list of recommended music on the side. A song is inserted only if the user has not heard it before and either the list holds fewer than 5 songs or the song's score is higher than that of the lowest-scoring song in the list. If the insertion grows the list beyond 5 songs, remove the last (lowest-scoring) element.
Step 5: Return the list of recommended music in the correct format.
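The steps above can be sketched in JavaScript roughly as follows. This is a minimal illustration, not the submitted implementation: the data shapes (`songs` as a map from song id to genre list, `listened` and `followeeListened` as arrays of song ids), the function names, and the followee weight of 0.5 are all assumptions made for the example.

```javascript
// Assumed data shapes (hypothetical, for illustration only):
//   songs:            { songId: ["genre", ...] }
//   listened:         array of song ids the user has heard
//   followeeListened: array of song ids heard by first-degree followees

const FOLLOWEE_WEIGHT = 0.5; // assumption: followees count half as much as the user
const MAX_RECOMMENDATIONS = 5;

// Step 1: build a genre -> weight map from the listening history.
function genreWeights(songs, listened, followeeListened) {
  const weights = new Map();
  const add = (songId, amount) => {
    for (const genre of songs[songId] || []) {
      weights.set(genre, (weights.get(genre) || 0) + amount);
    }
  };
  listened.forEach((id) => add(id, 1));
  followeeListened.forEach((id) => add(id, FOLLOWEE_WEIGHT));
  return weights;
}

// Steps 2, 4 and 5: score every unheard song, keep a sorted top-5 list,
// and return the song ids in descending score order.
function recommend(songs, listened, followeeListened) {
  const weights = genreWeights(songs, listened, followeeListened);
  const heard = new Set(listened);
  const top = []; // kept sorted by descending score

  for (const [songId, genres] of Object.entries(songs)) {
    if (heard.has(songId)) continue; // never recommend something already heard

    const score = genres.reduce((sum, g) => sum + (weights.get(g) || 0), 0);
    const isFull = top.length >= MAX_RECOMMENDATIONS;
    if (isFull && score <= top[top.length - 1].score) continue;

    // Insert at the position that keeps the list sorted.
    let i = top.findIndex((entry) => entry.score < score);
    if (i === -1) i = top.length;
    top.splice(i, 0, { songId, score });

    // Drop the lowest-scoring entry if the list grew past the limit.
    if (top.length > MAX_RECOMMENDATIONS) top.pop();
  }
  return top.map((entry) => entry.songId);
}
```

Because the side list never holds more than 5 entries, each insertion is cheap; the cost is dominated by scoring every candidate song.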
What other data could you use to improve recommendations?
Additional data that would be helpful are:
- When was the last time the user listened to a particular piece of music
- How many times did the user listen to that piece of music
- Is there a favourite system, and if there is, which pieces of music have been favourited by the user
- The artist of the music (and if a favourite system exists, which artists the user favourited)
- Do users listen all the way through a song, or skip it almost every time they hear it?
- Is there a playlist feature, and if so, which songs are on those playlists?
Assume a more real-world situation where you have the additional data you described above and more time to implement. Could you think of a possibly more efficient way to recommend?
Improvements possible if the data above is available include:
- Only take into account the music the user listened to most recently. This reduces the size of the list in case it is large and cuts out old/stale information.
- Make more accurate predictions by using the number of times a user listened to a particular piece of music, and whether it is favourited, to adjust the score that the song and its genres receive.
- Optionally, only look at songs that the user has heard more than some threshold number of times (more than once?).
- When recommending, give more weight to songs by artists that the user really likes (ranked by the same system as the genres).
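As a rough sketch of how these improvements might combine, the example below weights each play by play count, recency and favourite status, then accumulates the result into genre and artist preferences. The record shapes (`{ songId, playCount, lastPlayed, favourited }` and `{ artist, genres }`), the 90-day window, the minimum of 2 plays, the favourite multiplier, and the logarithmic damping are all assumptions chosen for illustration, not values from the assignment.

```javascript
// Assumed play record (hypothetical):
//   { songId, playCount, lastPlayed /* ms since epoch */, favourited }
// Assumed song record (hypothetical): { artist, genres: [...] }

const RECENCY_WINDOW_MS = 90 * 24 * 60 * 60 * 1000; // assumption: ignore plays older than 90 days
const MIN_PLAYS = 2;       // assumption: "heard more than once"
const FAVOURITE_BONUS = 2; // assumption: favourited songs count double

// Weight contributed by one play record; 0 means "ignore this record".
function playWeight(play, now) {
  if (now - play.lastPlayed > RECENCY_WINDOW_MS) return 0; // stale
  if (play.playCount < MIN_PLAYS) return 0;                // heard too rarely
  const base = Math.log2(1 + play.playCount); // dampen very high play counts
  return play.favourited ? base * FAVOURITE_BONUS : base;
}

// Accumulate per-genre and per-artist weights from the user's play records.
function weightedPreferences(plays, songs, now) {
  const genres = new Map();
  const artists = new Map();
  for (const play of plays) {
    const w = playWeight(play, now);
    const song = songs[play.songId];
    if (w === 0 || !song) continue;
    artists.set(song.artist, (artists.get(song.artist) || 0) + w);
    for (const g of song.genres) {
      genres.set(g, (genres.get(g) || 0) + w);
    }
  }
  return { genres, artists };
}
```

A candidate song's score could then be the sum of its genre weights plus the weight of its artist, with the same top-5 selection as before.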
Assume you have more than one implementation of recommendations, how could you test which one is more effective using data generated by user actions?
Effectiveness can be gauged by:
- If the user has to manually choose whether to listen to the music they were recommended, then I would closely monitor how many of the recommended songs were selected by the user and, of the selected songs, how many were listened to all the way to the end.
- If all 5 recommendations are added to a "playlist" type feature for the user, I would try to see which songs were skipped early on and which songs were listened to all the way through.
- There may also be other actions that the user can perform on this list, such as favouriting, repeating, and adding the song to a playlist, if those functions are available.
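One way to compare implementations on these signals is to tag each logged user action with the variant that produced the recommendation and compute per-variant rates. The sketch below assumes a hypothetical event shape `{ variant, action }` where `action` is one of `"recommended"`, `"selected"` or `"completed"`; the names are illustrative, and the sketch does not include a statistical significance test, which a real comparison would need.

```javascript
// Assumed event shape (hypothetical): { variant: "A", action: "recommended" }
// where action is "recommended", "selected" or "completed".

function summarize(events) {
  const counts = {};
  for (const { variant, action } of events) {
    if (!counts[variant]) {
      counts[variant] = { recommended: 0, selected: 0, completed: 0 };
    }
    // Ignore any action type we are not tracking.
    if (Object.prototype.hasOwnProperty.call(counts[variant], action)) {
      counts[variant][action] += 1;
    }
  }

  const summary = {};
  for (const [variant, c] of Object.entries(counts)) {
    summary[variant] = {
      // Share of recommended songs the user chose to play.
      selectionRate: c.recommended ? c.selected / c.recommended : 0,
      // Share of played songs that were listened to until the end.
      completionRate: c.selected ? c.completed / c.selected : 0,
    };
  }
  return summary;
}
```

A variant with a higher selection rate but a much lower completion rate may be producing attractive-looking but disappointing recommendations, so the two rates are best read together.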
--
How long did this assignment take? Please be honest, it's relatively new.
Approximately 3 days, with about 4 hours per day.
- Day 1 was mostly research on Node.js, Express.js and MongoDB, as well as setup; it may have been more than 4 hours.
- Day 2 was writing the server logic and implementing algorithms, approx. 5 hours.
- Day 3 was writing the script and polishing up code, 3-4 hours.
- About another 12 hours for fixing issues opened after review.
Where would be the bottlenecks of this solution you have implemented?
Picking out all songs that match the user's "tags" and then ranking them. If the user has listened to many songs and has many followees, there will be a lot of songs in this category, even though only 5 need to be recommended. Each song may also have many tags, which would cause more operations.
What was the hardest part?
Efficiently implementing my algorithm in a new environment. While I had a good idea of the algorithm I wanted to implement, I did not want to settle for a crude version just because of my unfamiliarity with the tech stack, so I had to take extra care with my implementation.
Did you learn something new?
Other than getting to experience Node.js, Express.js and MongoDB for the first time, and learning more about JavaScript and NoSQL databases, I improved my ability to manage asynchronous callbacks thanks to the way callbacks are used in Node.js.
Do you feel your skills were under tested?
I do not feel like this assignment was overly challenging. I believe I would have been able to finish the assignment very quickly if I were more familiar with the tech stack.