Given an array containing n distinct numbers taken from 0, 1, 2, ..., n
, find the one that is missing from the array.
Example 1
Input: [3,0,1] Output: 2
Example 2
Input: [9,6,4,2,3,5,7,0,1] Output: 8
Note:
Your algorithm should run in linear runtime complexity. Could you implement it using only constant extra space complexity?
Credits:
Special thanks to @jianchao.li.fighter for adding this problem and creating all test cases.
Intuition
If nums
were in order, it would be easy to see which number is missing.
Algorithm
First, we sort nums
. Then, we check the two special cases that can be
handled in constant time - ensuring that 0 is at the beginning and that
is at the end. Given that those assumptions hold, the missing number must
somewhere between (but not including) 0 and . To find it, we ensure that
the number we expect to be at each index is indeed there. Because we handled
the edge cases, this is simply the previous number plus 1. Thus, as soon as
we find an unexpected number, we can simply return the expected number.
Complexity Analysis
Time complexity :
The only elements of the algorithm that have asymptotically nonconstant
time complexity are the main for
loop (which runs in time), and
the sort
invocation (which runs in time for Python and Java).
Therefore, the runtime is dominated by sort
, and the entire runtime is
.
Space complexity : (or )
In the sample code, we sorted nums
in place, allowing us to avoid
allocating additional space. If modifying nums
is forbidden, we can
allocate an size copy and sort that instead.
Intuition
A brute force method for solving this problem would be to simply check for
the presence of each number that we expect to be present. The naive
implementation might use a linear scan of the array to check for containment,
but we can use a HashSet
to get constant time containment queries and
overall linear runtime.
Algorithm
This algorithm is almost identical to the brute force approach, except we
first insert each element of nums
into a set, allowing us to later query
for containment in time.
Complexity Analysis
Time complexity :
Because the set allows for containment queries, the main loop
runs in time. Creating num_set
costs time, as each set insertion
runs in amortized time, so the overall runtime is .
Space complexity :
nums
contains distinct elements, so it costs space to
store a set containing all of them.
Intuition
We can harness the fact that XOR is its own inverse to find the missing element in linear time.
Algorithm
Because we know that nums
contains numbers and that it is missing
exactly one number on the range , we know that definitely
replaces the missing number in nums
. Therefore, if we initialize an integer
to and XOR it with every index and value, we will be left with the
missing number. Consider the following example (the values have been sorted
for intuitive convenience, but need not be):
Index | 0 | 1 | 2 | 3 |
---|---|---|---|---|
Value | 0 | 1 | 3 | 4 |
Complexity Analysis
Time complexity :
Assuming that XOR is a constant-time operation, this algorithm does constant work on iterations, so the runtime is overall linear.
Space complexity :
This algorithm allocates only constant additional space.
Intuition
One of the most well-known stories in mathematics is of a young Gauss, forced to find the sum of the first 100 natural numbers by a lazy teacher. Rather than add the numbers by hand, he deduced a closed-form expression for the sum, or so the story goes. You can see the formula below:
Algorithm
We can compute the sum of nums
in linear time, and by Gauss' formula, we
can compute the sum of the first natural numbers in constant time. Therefore,
the number that is missing is simply the result of Gauss' formula minus the sum of nums
,
as nums
consists of the first natural numbers minus some number.
Complexity Analysis
Time complexity :
Although Gauss' formula can be computed in time, summing nums
costs us time, so the algorithm is overall linear. Because we have
no information about which number is missing, an adversary could always
design an input for which any algorithm that examines fewer than
numbers fails. Therefore, this solution is asymptotically optimal.
Space complexity :
This approach only pushes a few integers around, so it has constant memory usage.
Analysis written by: @emptyset