4. Median of Two Sorted Arrays


There are two sorted arrays nums1 and nums2 of size m and n respectively.

Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

Example 1:

nums1 = [1, 3]
nums2 = [2]

The median is 2.0

Example 2:

nums1 = [1, 2]
nums2 = [3, 4]

The median is (2 + 3)/2 = 2.5


Solution


Approach #1 Recursive Approach [Accepted]

To solve this problem, we need to understand "What is the use of median". In statistics, the median is used for:

Dividing a set into two equal length subsets, that one subset is always greater than the other.

If we understand the use of median for dividing, we are very close to the answer.

First let's cut into two parts at a random position :

          left_A             |        right_A
    A[0], A[1], ..., A[i-1]  |  A[i], A[i+1], ..., A[m-1]

Since has elements, so there are kinds of cutting ().

And we know:

.

Note: when , is empty, and when , is empty.

With the same way, cut into two parts at a random position :

          left_B             |        right_B
    B[0], B[1], ..., B[j-1]  |  B[j], B[j+1], ..., B[n-1]

Put and into one set, and put and into another set. Let's name them and :

          left_part          |        right_part
    A[0], A[1], ..., A[i-1]  |  A[i], A[i+1], ..., A[m-1]
    B[0], B[1], ..., B[j-1]  |  B[j], B[j+1], ..., B[n-1]

If we can ensure:

then we divide all elements in into two parts with equal length, and one part is always greater than the other. Then

To ensure these two conditions, we just need to ensure:

  1. (or: )
    if , we just need to set:

  2. and

ps.1 For simplicity, I presume are always valid even if , , , or . I will talk about how to deal with these edge values at last.

ps.2 Why ? Because I have to make sure is non-negative since and . If , then may be negative, that will lead to wrong result.

So, all we need to do is:

Searching in , to find an object such that:

and where

And we can do a binary search following steps described below:

  1. Set , , then start searching in
  2. Set ,
  3. Now we have . And there are only 3 situations that we may encounter:

    • and
      Means we have found the object , so stop searching.


    • Means is too small. We must adjust to get .
      Can we increase ?
            Yes. Because when is increased, will be decreased.
            So is decreased and is increased, and may
            be satisfied.
      Can we decrease ?
            No! Because when is decreased, will be increased.
            So is increased and is decreased, and will
            be never satisfied.
      So we must increase . That is, we must adjust the searching range to .
      So, set , and goto 2.

    • :
      Means is too big. And we must decrease to get .
      That is, we must adjust the searching range to .
      So, set , and goto 2.

When the object is found, the median is:

when is odd

when is even

Now let's consider the edges values where may not exist. Actually this situation is easier than you think.

What we need to do is ensuring that . So, if and are not edges values (means all exist), then we must check both and . But if some of don't exist, then we don't need to check one (or both) of these two conditions. For example, if , then doesn't exist, then we don't need to check . So, what we need to do is:

Searching in , to find an object such that:

or or and
or or where

And in a searching loop, we will encounter only three situations:

  1. or or and
    or or
    Means is perfect, we can stop searching.
  2. and and
    Means is too small, we must increase it.
  3. and and
    Means is too big, we must decrease it.

Thanks to @Quentin.chen for pointing out that: and . Because:

So in situation 2. and 3. , we don't need to check whether and whether .

Complexity Analysis

  • Time complexity: .
    At first, the searching range is . And the length of this searching range will be reduced by half after each loop. So, we only need loops. Since we do constant operations in each loop, so the time complexity is . Since , so the time complexity is .

  • Space complexity: .
    We only need constant memory to store local variables, so the space complexity is .


Analysis written by: @MissMary