493. Reverse Pairs


Given an array nums, we call (i, j) an important reverse pair if i < j and nums[i] > 2*nums[j].

You need to return the number of important reverse pairs in the given array.

Example1:

Input: [1,3,2,3,1]
Output: 2

Example2:

Input: [2,4,3,5,1]
Output: 3

Note:

  1. The length of the given array will not exceed 50,000.
  2. All the numbers in the input array are in the range of 32-bit integer.


Solution


Approach #1 Brute force [Time Limit Exceeded]

Intuition

Do as directed in the question. We can simply check all the pairs if they are important reverse pairs or not.

Algorithm

  • Iterate over from to
    • Iterate over from to
      • If , increment

Complexity Analysis

  • Time complexity:
    • We iterate over all the possible pairs wherein () in the array which is
  • Space complexity: only constant extra space is required for , etc.

Trivia

The above code can be expressed as one-liner in Python:

Python

Herein, we iterate over all the pairs and store the boolean values for . Finally, we count the number of in the array and return the result.


Approach #2 Binary Search Tree [Accepted]

Intuition

In Approach #1, for each element , we searched the subarray for elements such that their value is greater than . In the previous approach, the search is linear. However, we need to make the process efficient. Maybe, memoization can help, but since, we need to compare the elements, we cannot find a linear DP solution.

Observe that the indices of the elements in subarray don't matter as we only require the count. So, we can sort the elements and perform binary search on the subarray. But, since the subarray keeps growing as we iterate to the next element, we need a data structure to store the previous result as well as to allow efficient searching(preferably ) - Binary Search Tree(BST) could be a good bet.

Refreshing BST

BST is a rooted binary tree, wherein each node is associated with a value and has 2 distinguishable sub-trees namely and subtree. The left subtree contains only the nodes with lower values than the parent's value, while the right subtree conatins only the nodes with greater values than the parent's value.

Voila!

This is exactly what is required. So, if we store our elements in BST, then we can search the larger elements thus eliminating the search on smaller elements altogether.

Algorithm

Define the of BST that stores the and pointers to the and . We also need a count of elements(say ) in the subtree rooted at the current node that are greater than or equal to the current node's . is initialized to 1 for each node and , pointers are set to .

We define the routine that recursively adds the given as an appropriate leaf node based on comparisons with the . Each time, the given is smaller than , we increment the and move the to the right subtree. While, if the is equal to the current , we simply increment the and exit. While, we move to the left subtree in case .

We also require the routine that gives the count of number of elements greater than or equal to the . In the routine, if the is NULL, return 0. Otherwise, if , we know the count of values greater than or equal to the , hence simply return . In case, , the ans is calculated by adding and recursively calling the routine with . And if , ans is obtained by recursively calling the routine with .

Now, we can get to our main logic:

  • Iterate over from to of :
    • Search the existing BST for and add the result to
    • Insert to the BST, hence updating the of the previous nodes

The algorithm can be better understood using the example below: !?!../Documents/493_reverse_pairs.json:1000,662!?!

Complexity analysis

  • Time complexity:
    • The best case complexity for BST is for search as well as insertion, wherein, the tree formed is complete binary tree
    • Whereas, in case like [1,2,3,4,5,6,7,8,...], insertion as well as search for an element becomes in time, since, the tree is skewed in only one direction, and hence, is no better than the array
    • So, in worst case, for searching and insertion over n items, the complexity is
  • Space complexity: extra space for storing the BST in class.

Approach #3 BIT [Accepted]

Intuition

The problem with BST is that the tree can be skewed hence, making it in complexity. So, need a data structure that remains balanced. We could either use a Red-black or AVL tree to make a balanced BST, but the implementation would be an overkill for the solution. We can use BIT (Binary Indexed Tree, also called Fenwick Tree) to ensure that the complexity is with only 12-15 lines of code.

BIT Overview:

Fenwick Tree or BIT provides a way to represent an array of numbers in an array(can be visualized as tree), allowing prefix/suffix sums to be calculated efficiently(). BIT allows to update an element in time.

We recommend having a look at BIT from the following links before getting into details:

So, BIT is very useful to accumulate information from front/back and hence, we can use it in the same way we used BST to get the count of elements that are greater than or equal to in the existing tree and then adding the current element to the tree.

Algorithm

First, lets review the BIT and routines of BIT. According to the convention, routine goes from to , i.e., gives the sum for the range , and updates the values from current to the end of array. But, since, we require to find the numbers greater than the given index, as and when we update an index, we update all the ancestors of the node in the tree, and for , we go from the node to the end.

The modified algorithm is:

update(BIT,index, val):
  while(index<0):
    BIT[index]+=val
    index-=(index&(-index))

Herein, we find get the next index using: , which is essentially subtracting the rightmost 1 from the binary representation. We update the previous indices since, if an element is greater than the index

And the modified algorithm is:

query(BIT,index):
  sum=0
  while(index<BIT.size):
    sum+=BIT[index]
    index+=(index&(-index))

Herein, we find get the next index using: . This gives the suffix sum from to the end.

So, the main idea is to count the number of elements greater than in range as we iterate from to . The steps are as follows:

  • Create a copy of , say ans sort . This array is actually used for creating the Binary indexed tree
  • Initialize and array of size to store the BIT
  • Iterate over from to :
    • Search the index of element not less than in array. the obtained index+1 in the , and add the result to
    • Search for the index of element not less than in . We need to the BIT for this index by 1. This essentially means that 1 is added to this index(or number of elements greater than this index is incremented). The effect of adding to the index is passed to the ancestors as shown in algorithm

Complexity analysis

  • Time complexity:
    • In and operations, we see that the loop iterates at most the number of bits in which can be at most . Hence, the complexity of both the operations is (Number of bits in is )
    • The in-built operation is binary search hence
    • We perform the operations for elements, hence the total complexity is
  • Space complexity: . Additional space for array

Approach #4 Modified Merge Sort [Accepted]

Intuition

In BIT and BST, we iterate over the array, dividing the array into 3 sections: already visited and hence added to the tree, current node and section to be visited. Another approach could be divide the problem into smaller subproblems, solving them and combining these problems to get the final result - Divide and conquer. We see that the problem has a great resemblance to the merge sort routine. The question is to find the inversions such that and . So, we can easily modify the merge sort to count the inversions as required.

Mergesort

Mergesort is a divide-and-conquer based sorting technique that operates in time. The basic idea to divide the array into several sub-arrays until each sub-array is single element long and merging these sublists recursively that results in the final sorted array.

Algorithm

We define routine that takes parameters an array say and and indices:

  • If >= this implies that elements can no longer be broken further and hence we return 0
  • Otherwise, set
  • Store by recursively calling on range and and adding the results. This is the divide step on our routine, breaking it into the 2 ranges, and finding the results for each range separately
  • Now, we that we have separately calculated the results for ranges and , but we still have to count the elements in that are greater than 2 * elements in . Count all such elements and add the result to
  • Finally, the array from to
    • Make 2 array : from elements in range and from elements in range
    • Keep pointers and to and respectively both initialized to start to the arrays
    • Iterate over from to and set to the smaller of or and increment the respective index

Complexity analysis

  • Time complexity:
    • In each step we divide the array into 2 sub-arrays, and hence, the maximum times we need to divide is equal to
    • Additional work needs to be done to count the inversions and to merge the 2 sub-arrays after sorting. Hence total time complexity is
  • Space complexity: . Additional space for storing and arrays

Analysis written by @abhinavbansal0.

Shoutout to @FUN4LEETCODE for the brilliant post!