Given an array of integers and an integer k, find out whether there are two distinct indices i and j in the array such that nums[i] = nums[j] and the absolute difference between i and j is at most k.
This article is for beginners. It introduces the following ideas: Linear Search, Binary Search Tree and Hash Table.
Intuition
Look for duplicate element in the previous elements.
Algorithm
This algorithm is the same as Approach #1 in Contains Duplicate solution, except that it looks at previous elements instead of all its previous elements.
Another perspective of this algorithm is to keep a virtual sliding window of the previous elements. We scan for the duplicate in this window.
Java
public boolean containsNearbyDuplicate(int[] nums, int k) { for (int i = 0; i < nums.length; ++i) { for (int j = Math.max(i - k, 0); j < i; ++j) { if (nums[i] == nums[j]) return true; } } return false; } // Time Limit Exceeded.
Complexity Analysis
Time complexity : . It costs time for each linear search. Apparently we do at most comparisons in one search even if can be larger than .
Space complexity : .
Intuition
Keep a sliding window of elements using self-balancing Binary Search Tree (BST).
Algorithm
The key to improve upon Approach #1 above is to reduce the search time of the previous elements. Can we use an auxiliary data structure to maintain a sliding window of elements with more efficient search
, delete
, and insert
operations? Since elements in the sliding window are strictly First-In-First-Out (FIFO), queue is a natural data structure. A queue using a linked list implementation supports constant time delete
and insert
operations, however the search
costs linear time, which is no better than Approach #1.
A better option is to use a self-balancing BST. A BST supports search
, delete
and insert
operations all in time, where is the number of elements in the BST. In most interviews you are not required to implement a self-balancing BST, so you may think of it as a black box. Most programming languages provide implementations of this useful data structure in its standard library. In Java, you may use a TreeSet
or a TreeMap
. In C++ STL, you may use a std::set
or a std::map
.
If you already have such a data structure available, the pseudocode is:
true
if foundfalse
Java
public boolean containsNearbyDuplicate(int[] nums, int k) { Set<Integer> set = new TreeSet<>(); for (int i = 0; i < nums.length; ++i) { if (set.contains(nums[i])) return true; set.add(nums[i]); if (set.size() > k) { set.remove(nums[i - k]); } } return false; } // Time Limit Exceeded.
Complexity Analysis
Time complexity : . We do operations of search
, delete
and insert
. Each operation costs logarithmic time complexity in the sliding window which size is . Note that even if can be greater than , the window size can never exceed .
Space complexity : . Space is the size of the sliding window which should not exceed or .
Note
The algorithm still gets Time Limit Exceeded for large and .
Intuition
Keep a sliding window of elements using Hash Table.
Algorithm
From the previous approaches, we know that even logarithmic performance in search
is not enough.
In this case, we need a data structure supporting constant time search
, delete
and insert
operations.
Hash Table is the answer. The algorithm and implementation are almost identical to Approach #2.
true
if foundfalse
Java
public boolean containsNearbyDuplicate(int[] nums, int k) { Set<Integer> set = new HashSet<>(); for (int i = 0; i < nums.length; ++i) { if (set.contains(nums[i])) return true; set.add(nums[i]); if (set.size() > k) { set.remove(nums[i - k]); } } return false; }
Complexity Analysis
Time complexity : .
We do operations of search
, delete
and insert
, each with constant time complexity.
Space complexity : . The extra space required depends on the number of items stored in the hash table, which is the size of the sliding window, .