462. Minimum Moves to Equal Array Elements II


Given a non-empty integer array, find the minimum number of moves required to make all array elements equal, where a move is incrementing a selected element by 1 or decrementing a selected element by 1.

You may assume the array's length is at most 10,000.

Example:

Input:
[1,2,3]

Output:
2

Explanation:
Only two moves are needed (remember each move increments or decrements one element):

[1,2,3]  =>  [2,2,3]  =>  [2,2,2]


Solution


Approach #1 Brute Force [Time Limit Exceeded]

In the brute force approach, we consider every possible number to which all the array elements could be equalized, so as to minimize the number of moves required. One point is obvious: the number to which all the elements are equalized at the end must lie between the minimum and the maximum elements of the array. Thus, we first find the minimum and the maximum element in the array. Suppose $k$ is the number to which all the elements are equalized. Then, we iterate $k$ over the range between the minimum and the maximum values, find the number of moves required for each $k$, and keep track of the minimum over all of these, which is the final result.
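
As an illustration, a brute force implementation along these lines might look like the following Java sketch (assuming LeetCode's usual `int minMoves2(int[] nums)` signature; `long` is used for the running count only as a precaution against overflow):

```java
public class Solution {
    public int minMoves2(int[] nums) {
        // Find the range [min, max] in which the final value must lie.
        int min = Integer.MAX_VALUE, max = Integer.MIN_VALUE;
        for (int num : nums) {
            min = Math.min(min, num);
            max = Math.max(max, num);
        }
        long best = Long.MAX_VALUE;
        // Try every candidate target k between min and max.
        for (int k = min; k <= max; k++) {
            long moves = 0;
            for (int num : nums) {
                moves += Math.abs(num - k);
            }
            best = Math.min(best, moves);
        }
        return (int) best;
    }
}
```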

Complexity Analysis

  • Time complexity : $O(n \cdot z)$, where $n$ is the length of the array and $z$ is the difference between the maximum and the minimum elements of the array.
  • Space complexity : $O(1)$. No extra space is required.

Approach #2 Better Brute Force [Accepted]

Algorithm

In this approach, rather than choosing every possible number between the minimum and the maximum values in the array, we can simply consider, in turn, every element of the array as the candidate target. To understand why we need not iterate over the complete range but only over the elements of the array, consider the following argument.

Say the array, in sorted order, is $a_1 \le a_2 \le \dots \le a_n$. Now, suppose we try to equalize all the elements to $a_k$, one of the array's own elements, which, by the way, may or may not be the final number we settle down to.

The total number of moves for doing this is given by:

$$moves = \sum_{i=1}^{k} (a_k - a_i) + \sum_{i=k+1}^{n} (a_i - a_k)$$

Suppose, now, instead of $a_k$, we try to equalize all the elements to a number $num$ which is not present in the given array but is slightly larger than $a_k$ (and smaller than $a_{k+1}$), i.e. $num = a_k + t$, where $t$ is a positive integer. The total number of moves required now will be:

$$moves' = \sum_{i=1}^{k} (a_k + t - a_i) + \sum_{i=k+1}^{n} (a_i - a_k - t) = moves + kt - (n-k)t = moves + (2k - n)t \qquad \text{...using } moves \text{ from above}$$

From this equation, it is clear that $moves'$ changes linearly with $t$: if $2k \ge n$, then $moves' \ge moves$ and $a_k$ is at least as good a target as $num$; if $2k < n$, then $moves'$ keeps decreasing as $t$ grows, so the best choice in this interval is its other endpoint $a_{k+1}$, which is again an element of the array. In either case, the number of moves required to settle down to some number present in the array is never more than the number of moves required to settle down to an arbitrary number not present in it. This completes the proof.
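
A Java sketch of this better brute force, again assuming the `minMoves2` signature, could look like the following; the only change from the previous approach is that the candidate targets are the array elements themselves:

```java
public class Solution {
    public int minMoves2(int[] nums) {
        long best = Long.MAX_VALUE;
        // Only elements of the array need to be considered as targets.
        for (int target : nums) {
            long moves = 0;
            for (int num : nums) {
                moves += Math.abs(num - target);
            }
            best = Math.min(best, moves);
        }
        return (int) best;
    }
}
```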

Complexity Analysis

  • Time complexity : $O(n^2)$. Two nested loops over the $n$ array elements are used.

  • Space complexity : $O(1)$. No extra space is required.


Approach #3 Using Sorting [Accepted]

Algorithm

In the previous approach, we needed to find the number of moves required for every candidate $num$ chosen from the array by iterating over the whole array. We can optimize this approach to some extent by sorting the array and observing the following fact. The number of moves required to raise the elements smaller than $num$ so as to equalize them to $num$ is given by $num \cdot smaller\_count - smaller\_sum$ (the meanings of these terms are given below). Similarly, the number of moves required to decrement the elements larger than $num$ so as to equalize them to $num$ is $larger\_sum - num \cdot larger\_count$. The total number of moves required is the sum of these two parts. Hence, for a particular $num$ chosen, the total number of moves required is given by:

$$moves = (num \cdot smaller\_count - smaller\_sum) + (larger\_sum - num \cdot larger\_count)$$

where, $num$ = the number to which all the elements are equalized at the end,

$smaller\_count$ = the number of elements which are smaller than $num$,

$smaller\_sum$ = the sum of the elements which are smaller than $num$,

$larger\_count$ = the number of elements which are larger than $num$,

$larger\_sum$ = the sum of the elements which are larger than $num$,

$moves$ = the total number of moves required to equalize all the elements of the array to $num$.

Let's say that the index of the element chosen as $num$ in the sorted array is $i$. Instead of iterating over the array to compute $smaller\_sum$ and $larger\_sum$ for every $num$, we can keep computing them while traversing the array, since the array is sorted. We calculate the total sum of the given array once, given by $total$. We start by choosing $num = nums[0]$, with $smaller\_sum = 0$ and $larger\_sum = total - nums[0]$. When we move from index $i$ to index $i+1$, to obtain the new $smaller\_sum$ we just add $nums[i]$ to the previous $smaller\_sum$, and to obtain the new $larger\_sum$ we subtract $nums[i+1]$ from the previous $larger\_sum$.
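
The running-sum idea might be implemented along the following lines (a sketch; `smallerSum`, `largerSum` and `total` mirror the quantities defined above):

```java
import java.util.Arrays;

public class Solution {
    public int minMoves2(int[] nums) {
        Arrays.sort(nums);
        long total = 0;
        for (int num : nums) total += num;

        long best = Long.MAX_VALUE;
        long smallerSum = 0;                                 // sum of the elements left of index i
        for (int i = 0; i < nums.length; i++) {
            long num = nums[i];
            long largerSum = total - smallerSum - num;       // sum of the elements right of index i
            long smallerCount = i;
            long largerCount = nums.length - i - 1;
            long moves = (num * smallerCount - smallerSum) + (largerSum - num * largerCount);
            best = Math.min(best, moves);
            smallerSum += num;                               // extend the running sum for the next index
        }
        return (int) best;
    }
}
```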

Complexity Analysis

  • Time complexity : $O(n \log n)$. Sorting takes $O(n \log n)$ time; the subsequent traversal takes $O(n)$.

  • Space complexity : $O(1)$. No extra space is required.


Approach #4 Using Median and Sorting [Accepted]

Algorithm

The problem of finding the number to which all the other numbers eventually settle can also be viewed as follows: given a set of points in 1-D, find a point $x$ such that the cumulative sum of the distances between $x$ and the rest of the points is minimum. This is a very common mathematical problem whose answer is known: the point $x$ is the median of the given points. The reasoning behind choosing the median is given after the algorithm.

We can simply sort the given points and read off the $median$ as the element in the middle of the sorted array. Thus, the total number of moves required to equalize all the array elements is given by the sum of the differences of all the elements from the $median$. In mathematical terms, the solution is given by:

$$moves = \sum_{i=0}^{n-1} \big|nums[i] - median\big|$$

where $n$ is the size of the given array.

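A short Java sketch of this approach: sort, take the middle element as the median, and sum the absolute differences:

```java
import java.util.Arrays;

public class Solution {
    public int minMoves2(int[] nums) {
        Arrays.sort(nums);
        long median = nums[nums.length / 2];   // the (upper) median after sorting
        long moves = 0;
        for (int num : nums) {
            moves += Math.abs(num - median);
        }
        return (int) moves;
    }
}
```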

Now, we'll look at the mathematical reasoning behind choosing the median as the number to which we'll settle. As discussed in the previous approach, the total number of moves required is given by:

$$moves = (num \cdot smaller\_count - smaller\_sum) + (larger\_sum - num \cdot larger\_count)$$

where all the variables have the same definitions as before.

Now, in order to minimize this term with respect to changes in $num$, we take its derivative with respect to $num$:

$$\frac{d(moves)}{d(num)} = smaller\_count - larger\_count$$

Setting the derivative equal to $0$, we get:

$$smaller\_count - larger\_count = 0$$

or $smaller\_count = larger\_count$. This property is satisfied by the median only, which completes the proof.

Complexity Analysis

  • Time complexity : $O(n \log n)$. Sorting takes $O(n \log n)$ time.

  • Space complexity : $O(1)$. Only a single extra variable is used.


Approach #5 Without finding Median [Accepted]

Algorithm

In the previous approach, we found the median after sorting and then calculated the number of moves required. But, if we observe carefully, we'll find that once the array is sorted, we can do the same task without actually finding the median or the number to which we need to settle at the end. To see this, consider the maximum number ($max$) and the minimum number ($min$) in the sorted array, which lie at its two extreme positions. We know that, at the end, both of these numbers must be equalized to the same number $num$, with $min \le num \le max$. For the number $max$, the number of moves required is $max - num$. Similarly, for the number $min$, the number of moves is $num - min$. Thus, the total number of moves for $max$ and $min$ together is $(max - num) + (num - min) = max - min$, which is independent of $num$. Thus, we can continue in the same way with the next maximum and the next minimum numbers in the array, until the complete array is exhausted.

Therefore, the equation becomes:

$$moves = \sum_{i=0}^{\lfloor n/2 \rfloor - 1} \big(nums[n-1-i] - nums[i]\big)$$

where $n$ is the number of elements in the sorted array $nums$.
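
This pairing of extremes translates into a simple two-pointer pass over the sorted array, as in the following Java sketch:

```java
import java.util.Arrays;

public class Solution {
    public int minMoves2(int[] nums) {
        Arrays.sort(nums);
        long moves = 0;
        int lo = 0, hi = nums.length - 1;
        while (lo < hi) {
            moves += nums[hi--] - nums[lo++];   // each extreme pair costs (max - min) moves
        }
        return (int) moves;
    }
}
```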

Complexity Analysis

  • Time complexity : $O(n \log n)$. Sorting takes $O(n \log n)$ time.

  • Space complexity : $O(1)$. No extra space is required.


Approach #6 Using quick-select [Accepted]

Algorithm

In order to find the median, we need not necessarily sort the given array. Instead, we can find the median directly using the Quick-Select method, which doesn't require sorting the whole array.

The quick-select method is similar to the Quick-Sort method. In a single iteration, we choose a pivot and bring it to its correct position in the array. If that correct position happens to be the central position (corresponding to the median), we can return the median directly from there. Now, let's look at the implementation of quick-select.

Quick-Select makes use of two functions, $quickselect$ and $partition$. The $quickselect$ function takes the leftmost and the rightmost indices of the current subarray, along with the central index of the whole array. If the element that reaches its correct position in the current call happens to be the median (i.e. it reaches the central position), we return that element, since it is the median. The $partition$ function takes the leftmost and the rightmost indices of the subarray and returns the correct position of the current pivot, which is chosen as the rightmost element of the subarray. This function makes use of two pointers, $i$ and $j$, both of which initially point to the leftmost element of the subarray.

At every step, we compare the element at index $j$, namely $nums[j]$, with the pivot element $nums[right]$. If $nums[j] \le nums[right]$, we swap the elements $nums[i]$ and $nums[j]$ and increment both $i$ and $j$. Otherwise, only $j$ is incremented. When $j$ reaches the end of the subarray, we swap $nums[i]$ with $nums[right]$. In this way, all the elements not larger than the pivot now lie to the left of index $i$, and all the elements larger than the pivot lie to its right; thus, the pivot reaches its correct position in the array. If this position isn't the central index of the array, we again call $quickselect$ on the left or the right subarray relative to index $i$, depending on which side of $i$ the central index lies.

For more clarification, look at the animation below for the example [3, 8, 2, 5, 1, 4, 7, 6]:

(Animation: step-by-step quick-select partitioning of the example array.)

After finding the median, we can find the sum of the absolute differences of all the elements from the median to determine the number of moves required. Mathematically, we use:

$$moves = \sum_{i=0}^{n-1} \big|nums[i] - median\big|$$

where $n$ is the size of the given array.
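
A Java sketch of this approach, with `quickSelect` and `partition` written as described above (the pivot is the rightmost element of the current subarray; the helper names are illustrative):

```java
public class Solution {
    public int minMoves2(int[] nums) {
        int median = quickSelect(nums, 0, nums.length - 1, nums.length / 2);
        long moves = 0;
        for (int num : nums) {
            moves += Math.abs(num - median);
        }
        return (int) moves;
    }

    // Returns the element that would sit at index k if nums were fully sorted.
    private int quickSelect(int[] nums, int left, int right, int k) {
        while (left < right) {
            int pos = partition(nums, left, right);
            if (pos == k) return nums[pos];
            if (pos > k) right = pos - 1;
            else left = pos + 1;
        }
        return nums[left];
    }

    // Partitions nums[left..right] around the rightmost element; returns the pivot's final index.
    private int partition(int[] nums, int left, int right) {
        int pivot = nums[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (nums[j] <= pivot) {
                swap(nums, i, j);
                i++;
            }
        }
        swap(nums, i, right);   // the pivot reaches its correct sorted position
        return i;
    }

    private void swap(int[] nums, int a, int b) {
        int t = nums[a]; nums[a] = nums[b]; nums[b] = t;
    }
}
```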

Complexity Analysis

  • Time complexity : Average case: $O(n)$. Quick-Select has an average case time complexity of $O(n)$. Worst case: $O(n^2)$. In the worst case, quick-select can take up to $O(n^2)$ time.

  • Space complexity : $O(1)$. No extra space is required.


Approach #7 Using Median of Medians [Accepted]

Algorithm

It isn't hard to see that, in quick-select, if we naively choose the pivot element, the algorithm has a worst case performance of $O(n^2)$. To guarantee a linear running time for finding the median, however, we need a strategy for choosing the pivot element that guarantees that we partition the list into two sublists of relatively comparable size. Obviously, the median of the values in the list would be the optimal choice, but if we could find the median in linear time, we would already have a solution to our problem.

The median-of-medians algorithm chooses its pivot in the following clever way:

  1. Divide $nums$ into $\lceil n/5 \rceil$ groups, where each group consists of 5 elements, except possibly the last group, which may have fewer than 5 elements.

  2. Sort each of these groups and find the median of every group. Create an auxiliary array $median[]$ and store the medians of all the groups in it. Then, recursively apply this same method to find the median of $median[0 \dots \lceil n/5 \rceil - 1]$.

  3. $medOfMed$ = the median of the $median[]$ array obtained in step 2.

  4. Partition $nums$ around $medOfMed$ (i.e. use $medOfMed$ as the pivot element) and obtain its position $pos$.

  5. If $pos == k$ (where $k$ is the rank we are looking for, $n/2$ in the case of the median), return $nums[pos]$.

  6. If $pos > k$, recur on the left subarray $nums[left \dots pos-1]$.
  7. If $pos < k$, recur on the right subarray $nums[pos+1 \dots right]$.

Using the above method ensures that the chosen pivot, in the worst case, has at most about 70% of the elements larger (and, symmetrically, at most about 70% smaller) than it. The proof of this, as well as the reason behind choosing a group size of 5, is given in the explanation of the time complexity.
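
A Java sketch of how a median-of-medians pivot could be plugged into the selection routine (the helper names `select`, `medianOfMedians` and `partition` are illustrative, and the code follows the steps listed above):

```java
import java.util.Arrays;

public class Solution {
    public int minMoves2(int[] nums) {
        // Work on a copy so the caller's array order is preserved.
        int median = select(nums.clone(), 0, nums.length - 1, nums.length / 2);
        long moves = 0;
        for (int num : nums) moves += Math.abs(num - median);
        return (int) moves;
    }

    // Returns the element of rank k (0-based) in nums[left..right].
    private int select(int[] nums, int left, int right, int k) {
        while (true) {
            if (left == right) return nums[left];
            int pivot = medianOfMedians(nums, left, right);   // steps 1-3
            int pos = partition(nums, left, right, pivot);    // step 4
            if (pos == k) return nums[pos];                   // step 5
            if (pos > k) right = pos - 1;                     // step 6
            else left = pos + 1;                              // step 7
        }
    }

    // Groups of 5: sort each group, collect its median, then recursively select their median.
    private int medianOfMedians(int[] nums, int left, int right) {
        int n = right - left + 1;
        if (n <= 5) {
            Arrays.sort(nums, left, right + 1);
            return nums[left + n / 2];
        }
        int numMedians = (n + 4) / 5;
        int[] medians = new int[numMedians];
        for (int g = 0; g < numMedians; g++) {
            int gLeft = left + 5 * g;
            int gRight = Math.min(gLeft + 4, right);
            Arrays.sort(nums, gLeft, gRight + 1);             // sorting at most 5 elements is O(1)
            medians[g] = nums[gLeft + (gRight - gLeft) / 2];
        }
        return select(medians, 0, numMedians - 1, numMedians / 2);
    }

    // Partitions nums[left..right] around the given pivot value; returns its final index.
    private int partition(int[] nums, int left, int right, int pivot) {
        int pivotIndex = left;
        while (nums[pivotIndex] != pivot) pivotIndex++;       // the pivot value lies in this range
        swap(nums, pivotIndex, right);
        int i = left;
        for (int j = left; j < right; j++) {
            if (nums[j] <= pivot) swap(nums, i++, j);
        }
        swap(nums, i, right);
        return i;
    }

    private void swap(int[] nums, int a, int b) {
        int t = nums[a]; nums[a] = nums[b]; nums[b] = t;
    }
}
```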

Complexity Analysis

  • Time complexity : $O(n)$. The worst case time complexity is $O(n)$.

  • Space complexity : $O(1)$. No extra space is required, ignoring the auxiliary array of group medians and the recursion stack.

Proof: Time Complexity : $O(n)$

The worst case time complexity of the above algorithm is $O(n)$. Let us analyze all the steps.

Steps 1. and 2. take $O(n)$ time, as finding the median of an array of size 5 takes $O(1)$ time and there are $n/5$ such arrays. Step 3. takes $T(n/5)$ time, if the whole algorithm takes $T(n)$ time. Step 4. is a standard partition and takes $O(n)$ time. The interesting steps are 6. and 7.: at most one of them is executed, and they are recursive. What is the worst case size of these recursive calls? The answer is the maximum number of elements greater than $medOfMed$ (obtained in step 3) or the maximum number of elements smaller than $medOfMed$.

How many elements are greater than $medOfMed$, and how many are smaller?

Let's assume that the list of medians obtained from step 2., in sorted order, is $m_1, m_2, \dots, m_{n/5}$, where $m_{n/10}$ is the median chosen as the pivot ($medOfMed$). To find an upper bound on the number of elements in the given array smaller than our pivot, first consider the half of the medians from step 2. which are smaller than the pivot ($m_1, \dots, m_{n/10 - 1}$). It is possible for all five of the elements in the sublists corresponding to these medians to be smaller than the pivot, which leads to an upper bound of $5(n/10 - 1)$ such elements. Now consider the half of the medians from step 2. which are larger than the pivot ($m_{n/10 + 1}, \dots, m_{n/5}$). For each of these, only the two elements of the corresponding sublist that are smaller than its own median can possibly be smaller than the pivot, which leads to an upper bound of $2 \cdot n/10$ such elements. In addition, the sublist containing the pivot ($m_{n/10}$) contributes exactly two elements smaller than the pivot. In total, we may have at most

$$5\left(\frac{n}{10} - 1\right) + 2 \cdot \frac{n}{10} + 2 = \frac{7n}{10} - 3$$

elements smaller than the pivot, or approximately 70% of the list. The same upper bound applies to the number of elements in the list larger than the pivot. It is this guarantee, that the partitions cannot be too lopsided, that leads to the linear run time.

Thus, the minimum number of elements which are smaller (or larger) than the chosen pivot $medOfMed$ is $n - 1 - (7n/10 - 3) \approx 3n/10$, or nearly 30% of the elements.

In the worst case (accounting for the ceilings ignored in the rough count above), the function recurs on at most $7n/10 + 6$ elements.

Note that $7n/10 + 6 < n$ for $n > 20$, and that any input of 80 or fewer elements requires $O(1)$ time. We can therefore obtain the recurrence:

$$T(n) \le T(\lceil n/5 \rceil) + T(7n/10 + 6) + O(n)$$

We show that the running time is linear by substitution. Assume that $T(n) \le cn$ for some constant $c$ and all $n \le 80$, and write the $O(n)$ term of the recurrence as $an$. Substituting this inductive hypothesis into the right-hand side of the recurrence yields

$$T(n) \le c\lceil n/5 \rceil + c\left(\frac{7n}{10} + 6\right) + an \le \frac{cn}{5} + c + \frac{7cn}{10} + 6c + an = \frac{9cn}{10} + 7c + an = cn + \left(-\frac{cn}{10} + 7c + an\right) \le cn$$

since we can pick $c$ large enough so that $cn/10 - 7c$ is larger than the function described by the $O(n)$ term (i.e. $an$) for all $n > 80$. The worst-case running time of the algorithm is therefore linear.

Choosing a group size of 3 instead only guarantees that at least half of the $n/3$ blocks contribute at least 2 elements on each side of the pivot, which gives roughly an $n/3$ versus $2n/3$ split in the worst case.

This gives $T(n) = T(n/3) + T(2n/3) + O(n)$, which reduces to $O(n \log n)$ in the worst case, since the two recursive subproblems now sum to a full $n$ rather than a fraction strictly less than $n$.

There is no reason why you should not use a group size greater than five; for example, with seven the recurrence would be

$$T(n) \le T(n/7) + T(5n/7) + O(n)$$

which also works (since $n/7 + 5n/7 = 6n/7 < n$), but five is the smallest odd group size (odd being convenient for taking medians) which works.


Analysis written by: @vinod23