164. Maximum Gap


Given an unsorted array, find the maximum difference between the successive elements in its sorted form.

Try to solve it in linear time/space.

Return 0 if the array contains less than 2 elements.

You may assume all elements in the array are non-negative integers and fit in the 32-bit signed integer range.

Credits:
Special thanks to @porker2008 for adding this problem and creating all test cases.


Solution


Approach #1 Comparison Sorting [Accepted]

Intuition

Do what the question says.

Algorithm

Sort the entire array. Then iterate over it to find the maximum gap between two successive elements.

Complexity Analysis

  • Time complexity: .

    Time taken to sort the array is (average case). Time taken for linear iteration through the array is of complexity. Hence overall time complexity is .

  • Space complexity: No extra space needed, other than the input array (since sorting can usually be done in-place).


Approach #2 Radix Sort [Accepted]

Algorithm

This approach is similar to Approach #1, except we use Radix Sort instead of a traditional comparison sort.

Complexity Analysis

  • Time complexity: .

    Since a linear iteration over the array (once it is sorted) is of linear (i.e. ) complexity, the performance of this approach is limited by the performance of Radix sort.

    Radix sort uses Counting sort as a subroutine.

    • Counting sort runs in time (where is the radix or base of the digits comprising the elements in the array). If , Counting sort would run in linear time. In our case, the radix is fixed (i.e. ). Hence our Counting sort subroutine runs in linear time.

    • Radix sort works by running passes of the Counting sort subroutine (where the elements are composed of, maximally, digits). Hence effective runtime of Radix sort would be . However, in our case an element can, maximally, be the maximum 32-bit signed integer 2,147,483,647. Hence is a constant.

    Thus Radix sort has a runtime performance complexity of about for reasonably large input.

  • Space complexity: extra space.

    Counting sort requires extra space. Radix sort requires an auxiliary array of the same size as input array. However given that is a small fixed constant, the space required by Counting sort can be ignored for reasonably large input.


Approach #3 Buckets and The Pigeonhole Principle [Accepted]

Intuition

Sorting an entire array can be costly. At worst, it requires comparing each element with every other element. What if we didn't need to compare all pairs of elements? That would be possible if we could somehow divide the elements into representative groups, or rather, buckets. Then we would only need to compare these buckets.

Digression: The Pigeonhole Principle The Pigeonhole Principle states that if items are put into containers, with , then at least one container must contain more than one item.

Suppose for each of the elements in our array, there was a bucket. Then each element would occupy one bucket. Now what if we reduced, the number of buckets? Some buckets would have to accommodate more than one element.

Now let's talk about the gaps between the elements. Let's take the best case, where all elements of the array are sorted and have a uniform gap between them. This means every adjacent pair of elements differ by the same constant value. So for elements of the array, there are gaps, each of width, say, . It is trivial to deduce that (where and are the minimum and maximum elements of the array). This width is the maximal width/gap between two adjacent elements in the array; precisely the quantity we are looking for!

One can safely argue that this value of , is in fact, the smallest value that can ever accomplish of any array with the same number of elements (i.e. ) and the same range (i.e. ). To test this fact, you can start with a uniform width array (as described above) and try to reduce the gap between any two adjacent elements. If you reduce the gap between and to some value , then you will notice that the gap between and would have increased to . Hence the maximum attainable gap would have become from . Thus the value of the maximum gap can only increase.

Buckets!

Coming back to our problem, we have already established by application of the Pigeonhole Principle, that if we used buckets instead of individual elements as our base for comparison, the number of comparisons would reduce if we could accommodate more than one element in a single bucket. That does not immediately solve the problem though. What if we had to compare elements within a bucket? We would end up no better.

So the current motivation remains: somehow, if we only had to compare among the buckets, and not the elements within the buckets, we would be good. It would also solve our sorting problem: we would just distribute the elements to the right buckets. Since the buckets can be already ordered, and we only compare among buckets, we wouldn't have to compare all elements to sort them!

But if we only had buckets to compare, we would have to ensure, that the gap between the buckets itself represent the maximal gap in the input array. How do we go about doing that?

We could do that just by setting the buckets to be smaller than (as described above). Since the gaps (between elements) within the same bucket would only be , we could deduce that the maximal gap would indeed occur only between two adjacent buckets.

Hence by setting bucket size to be , we can ensure that at least one of the gaps between adjacent buckets would serve as the maximal gap.

Clarifications

A few clarifications are in order:

  • Would the buckets be of uniform size? Yes. Each of them would be of the same size .

  • But, then wouldn't the gap between them be uniform/constant as well? Yes it would be. The gap between them would be integer unit wide. That means a two adjacent buckets of size could hold integers with values, say, and . We avoid overlapping buckets.

  • Then what are you talking about when you say the gap between two adjacent buckets could be the maximal gap? When we are talking about the size of a bucket, we are talking about its holding capacity. That is the range of values the bucket can represent (or hold). However the actual extent of the bucket are determined by the values of the maximum and minimum element a bucket holds. For example a bucket of size could have a capacity to hold values between . However, if it only holds the elements and , then its actual extent is only which is not the same as the capacity of the bucket.

  • Then how do you compare adjacent buckets? We do that by comparing their extents. Thus we compare the minimum element of the next bucket to the maximum element of the current bucket. For example: if we have two buckets of size each, holding elements and respectively, then the gap between the buckets would essentially refer to the value (which is larger than the size of either bucket).

  • But then aren't we comparing elements again?! We are, yes! But only compare about twice the elements as the number of buckets (i.e. the minimum and maximum elements of each bucket). If you followed the above, you would realize that this amount is certainly less than the actual number of elements in the array, given a suitable bucket size was chosen.

Algorithm

  • We choose a bucket size such that . Let's just choose .

  • Thus all the elements would be divided among buckets.

  • Hence the bucket would hold the range of values: (1-based indexing).

  • It is trivial to calculate the index of the bucket to which a particular element belongs. That is given by (0-based indexing) where is the element in question.

  • Once all elements have been distributed, we compare adjacent bucket pairs to find the maximum gap.

Complexity Analysis

  • Time complexity: .

    Distributing the elements of the array takes one linear pass (i.e. complexity). Finding the maximum gap among the buckets takes a linear pass over the bucket storage (i.e. complexity). Hence overall process takes linear runtime.

  • Space complexity: extra space.

    Each bucket stores a maximum and a minimum element. Hence extra space linear to the number of buckets is required.


Analysis written by @babhishek21. Shout-out to @zkfairytale, @teddyyyy and @lhearen!