Question of the day: https://leetcode.com/problems/linked-list-random-node/#/description
Two solutions come to mind immediately, with a tradeoff between space and time (that's always the tradeoff).
-
Store all the elements of the incoming linked list in an array, then use a random number generator to select an index. Storing the elements in an array would take up
O(n)
space, but then accessing them with theO(1)
randomly generated index would still beO(1)
since it's just a matter of indexing into the array. -
Reverse the first idea. Generate the index from the random number generator first and then move a pointer down the nodes of the linked list to find the value to return. Since we're not storing anything in this implementation, it'd be
O(1)
space, but every time we callgetRandom
it would cost usO(N)
time.
While pondering this question, I became really curious about random number generation. Like.. how does it work?! Is anything truly random in this world? If you think about things from a physics point of view, every atom has its role in changing the world from one moment to the next. So for the purposes of generating a random index for my linked list, to keep things practical, I need to know how random I need the random index to be. I'll just go ahead and assume I don't need something that's random to the point of cryptographic security.
[..2 hours of wikipedia surfing later..]
Something that'll suit my needs just fine is the random python library that's implemented using a Mersenne Twister.
I noticed this Leetcode user's (and many others') solution that uses the idea of
(Reservoir Sampling)[https://en.wikipedia.org/wiki/Reservoir_sampling]
to randomly choose an element from the linked list. It's an optimization
on my solution because in my code I first count how many elements n
there are, and then use that n
to generate a random number and then
iterate through the linked list to find the element, so my total
runtime is O(n)
setup and O(0.5n)
on average
for each getRandom
call, which is still asymptotically O(n)
,
but technically slower. Reservoir Sampling is just a flat O(n)
for each call, since you don't know n
ahead of time and have to
iterate through each of the elements, swapping out which one you
keep to return in the end.
Another thing I'd like to jot down here is that I spent a lot more time today than yesterday on structuring my tests. Not sure how valuable that is while doing this 100 day challenge, but it was an interesting experience. I think with data structures, it's pretty difficult to "quickly write tests" and make sure they're comprehensive, but concise.