719. Find K-th Smallest Pair Distance


Given an integer array, return the k-th smallest distance among all the pairs. The distance of a pair (A, B) is defined as the absolute difference between A and B.

Example 1:

Input:
nums = [1,3,1]
k = 1
Output: 0 
Explanation:
Here are all the pairs:
(1,3) -> 2
(1,1) -> 0
(3,1) -> 2
Then the 1st smallest distance pair is (1,1), and its distance is 0.

Note:

  1. 2 <= len(nums) <= 10000.
  2. 0 <= nums[i] < 1000000.
  3. 1 <= k <= len(nums) * (len(nums) - 1) / 2.


Approach #1: Heap [Time Limit Exceeded]

Intuition and Algorithm

Sort the points. For every point with index i, the pairs with indexes (i, j) [by order of distance] are (i, i+1), (i, i+2), ..., (i, N-1).

Let's keep a heap of pairs, initially heap = [(i, i+1) for all i], and ordered by distance (the distance of (i, j) is nums[j] - nums[i].) Whenever we use a pair (i, x) from our heap, we will add (i, x+1) to our heap when appropriate.

Complexity Analysis

  • Time Complexity: , where is the length of nums. As , this is in the worst case. The complexity added by our heap operations is either in the Java solution, or in the Python solution because the heapq.heapify operation is linear time. Additionally, we add complexity due to sorting.

  • Space Complexity: , the space used to store our heap of at most N-1 elements.


Approach #2: Binary Search + Prefix Sum [Accepted]

Intuition

Let's binary search for the answer. It's definitely in the range [0, W], where W = max(nums) - min(nums)].

Let possible(guess) be true if and only if there are k or more pairs with distance less than or equal to guess. We will focus on evaluating our possible function quickly.

Algorithm

Let prefix[v] be the number of points in nums less than or equal to v. Also, let multiplicity[j] be the number of points i with i < j and nums[i] == nums[j]. We can record both of these with a simple linear scan.

Now, for every point i, the number of points j with i < j and nums[j] - nums[i] <= guess is prefix[x+guess] - prefix[x] + (count[i] - multiplicity[i]), where count[i] is the number of ocurrences of nums[i] in nums. The sum of this over all i is the number of pairs with distance <= guess.

Finally, because the sum of count[i] - multiplicity[i] is the same as the sum of multiplicity[i], we could just replace that term with multiplicity[i] without affecting the answer. (Actually, the sum of multiplicities in total will be a constant used in the answer, so we could precalculate it if we wanted.)

In our Java solution, we computed possible = count >= k directly in the binary search instead of using a helper function.

Complexity Analysis

  • Time Complexity: , where is the length of nums, and is equal to nums[nums.length - 1] - nums[0]. We do work to calculate prefix initially. The factor comes from our binary search, and we do work inside our call to possible (or to calculate count in Java). The final factor comes from sorting.

  • Space Complexity: , the space used to store multiplicity and prefix.


Approach #3: Binary Search + Sliding Window [Accepted]

Intuition

As in Approach #2, let's binary search for the answer, and we will focus on evaluating our possible function quickly.

Algorithm

We will use a sliding window approach to count the number of pairs with distance <= guess.

For every possible right, we maintain the loop invariant: left is the smallest value such that nums[right] - nums[left] <= guess. Then, the number of pairs with right as it's right-most endpoint is right - left, and we add all of these up.

Complexity Analysis

  • Time Complexity: , where is the length of nums, and is equal to nums[nums.length - 1] - nums[0]. The factor comes from our binary search, and we do work inside our call to possible (or to calculate count in Java). The final factor comes from sorting.

  • Space Complexity: . No additional space is used except for integer variables.


Analysis written by: @awice.