268. Missing Number


Given an array containing n distinct numbers taken from 0, 1, 2, ..., n, find the one that is missing from the array.

Example 1

Input: [3,0,1]
Output: 2

Example 2

Input: [9,6,4,2,3,5,7,0,1]
Output: 8


Note:
Your algorithm should run in linear runtime complexity. Could you implement it using only constant extra space complexity?

Credits:
Special thanks to @jianchao.li.fighter for adding this problem and creating all test cases.


Approach #1 Sorting [Accepted]

Intuition

If nums were in order, it would be easy to see which number is missing.

Algorithm

First, we sort nums. Then, we check the two special cases that can be handled in constant time - ensuring that 0 is at the beginning and that is at the end. Given that those assumptions hold, the missing number must somewhere between (but not including) 0 and . To find it, we ensure that the number we expect to be at each index is indeed there. Because we handled the edge cases, this is simply the previous number plus 1. Thus, as soon as we find an unexpected number, we can simply return the expected number.

Complexity Analysis

  • Time complexity :

    The only elements of the algorithm that have asymptotically nonconstant time complexity are the main for loop (which runs in time), and the sort invocation (which runs in time for Python and Java). Therefore, the runtime is dominated by sort, and the entire runtime is .

  • Space complexity : (or )

    In the sample code, we sorted nums in place, allowing us to avoid allocating additional space. If modifying nums is forbidden, we can allocate an size copy and sort that instead.


Approach #2 HashSet [Accepted]

Intuition

A brute force method for solving this problem would be to simply check for the presence of each number that we expect to be present. The naive implementation might use a linear scan of the array to check for containment, but we can use a HashSet to get constant time containment queries and overall linear runtime.

Algorithm

This algorithm is almost identical to the brute force approach, except we first insert each element of nums into a set, allowing us to later query for containment in time.

Complexity Analysis

  • Time complexity :

    Because the set allows for containment queries, the main loop runs in time. Creating num_set costs time, as each set insertion runs in amortized time, so the overall runtime is .

  • Space complexity :

    nums contains distinct elements, so it costs space to store a set containing all of them.


Approach #3 Bit Manipulation [Accepted]

Intuition

We can harness the fact that XOR is its own inverse to find the missing element in linear time.

Algorithm

Because we know that nums contains numbers and that it is missing exactly one number on the range , we know that definitely replaces the missing number in nums. Therefore, if we initialize an integer to and XOR it with every index and value, we will be left with the missing number. Consider the following example (the values have been sorted for intuitive convenience, but need not be):

Index 0 1 2 3
Value 0 1 3 4

Complexity Analysis

  • Time complexity :

    Assuming that XOR is a constant-time operation, this algorithm does constant work on iterations, so the runtime is overall linear.

  • Space complexity :

    This algorithm allocates only constant additional space.


Approach #4 Gauss' Formula [Accepted]

Intuition

One of the most well-known stories in mathematics is of a young Gauss, forced to find the sum of the first 100 natural numbers by a lazy teacher. Rather than add the numbers by hand, he deduced a closed-form expression for the sum, or so the story goes. You can see the formula below:

Algorithm

We can compute the sum of nums in linear time, and by Gauss' formula, we can compute the sum of the first natural numbers in constant time. Therefore, the number that is missing is simply the result of Gauss' formula minus the sum of nums, as nums consists of the first natural numbers minus some number.

Complexity Analysis

  • Time complexity :

    Although Gauss' formula can be computed in time, summing nums costs us time, so the algorithm is overall linear. Because we have no information about which number is missing, an adversary could always design an input for which any algorithm that examines fewer than numbers fails. Therefore, this solution is asymptotically optimal.

  • Space complexity :

    This approach only pushes a few integers around, so it has constant memory usage.


Analysis written by: @emptyset