214. Shortest Palindrome


Given a string S, you are allowed to convert it to a palindrome by adding characters in front of it. Find and return the shortest palindrome you can find by performing this transformation.

For example:

Given "aacecaaa", return "aaacecaaa".

Given "abcd", return "dcbabcd".

Credits:
Special thanks to @ifanchu for adding this problem and creating all test cases. Thanks to @Freezen for additional test cases.


Solution


Approach #1 Brute force [Accepted]

Intuition

According to the question, we are allowed to insert the characters only at the beginning of the string. Hence, we can find the largest segment from the beginning that is a palindrome, and we can then easily reverse the remaining segment and append to the beginning. This must be the required answer as no shorter palindrome could be found than this by just appending at the beginning.

For example: Take the string . Here, the largest palindrome segment from beginning is , and the remaining segment is . Hence the required string is reverse of ( = ) + original string( = ) = .

Algorithm

  • Create the reverse of the original string , say . This is used for comparison to find the largest palindrome segment from the front.
  • Iterate over the variable from 0 to the :
    • If (i.e. substring of from to is equal to the substring of from to the end of string). This essentially means that that substring from to is a palindrome, as is the reverse of .
    • Since, we find the larger palindromes first, we can return reverse of largest palindrome + as soon as we get it.

Complexity Analysis

  • Time complexity: .

    • We iterate over the entire length of string .
    • In each iteration, we compare the substrings which is linear in size of substrings to be compared.
    • Hence, the total time complexity is .
  • Space complexity: extra space for the reverse string .


Approach #2 Two pointers and recursion [Accepted]

Intuition

In Approach #1, we found the largest palindrome substring from the string using substring matching which is in length of substring. We could make the process more efficient if we could reduce the size of string to search for the substring without checking the complete substring each time.

Lets take a string . Let us consider 2 pointers and . Initialize . Iterate over from to , incrementing each time . Now, we just need to search in range . This way, we have reduced the size of string to search for the largest palindrome substring from the beginning. The range must always contain the largest palindrome substring. The proof of correction is that: Say the string was a perfect palindrome, would be incremented times. Had there been other characters at the end, would still be incremented by the size of the palindrome. Hence, even though there is a chance that the range is not always tight, it is ensured that it will always contain the longest palindrome from the beginning.

The best case for the algorithm is when the entire string is palindrome and the worst case is string like , wherein first becomes (check by doing on paper), and we need to recheck in [0,12) corresponding to string . Again continuing in the same way, we get . In such a case, the string is reduced only by as few as 2 elements at each step. Hence, the number of steps in such cases is linear().

This reduction of length could be easily done with the help of a recursive routine, as shown in the algorithm section.

Algorithm

The routine is recursive and takes string as parameter:

  • Initialize
  • Iterate over from to :
    • If , increase by
  • If equals the size of , the entire string is palindrome, and hence return the entire string .
  • Else:
    • Return reverse of remaining substring after to the end of string + routine on substring from start to index + remaining substring after to the end of string.

Complexity analysis

  • Time complexity: .
    • Each iteration of is linear in size of substring and the maximum number of recursive calls can be times as shown in the Intuition section.
    • Let the time complexity of the algorithm be T(n). Since, at the each step for the worst case, the string can be divide into 2 parts and we require only one part for further computation. Hence, the time complexity for the worst case can be represented as : . So, which is .

Thanks @CONOVER for the time complexity analysis.

  • Space complexity: extra space for string.

Approach #3 KMP [Accepted]

Intuition

We have seen that the question boils down to finding the largest palindrome substring from the beginning.

The people familiar with KMP(Knuth–Morris–Pratt) algorithm may wonder that the task at hand can be easily be compared with the concept of the lookup table in KMP.

KMP Overview:

KMP is a string matching algorithm that runs in times, where and are sizes of the text and string to be searched respectively. The key component of KMP is the failure function lookup table,say . The purpose of the lookup table is to store the length of the proper prefix of the string that is also a suffix of . This table is important because if we are trying to match a text string for , and we have matched the first positions, but when we fail, then the value of lookup table for is the longest prefix of that could possibly match the text string upto the point we are at. Thus, we don't need to start all over again, and can resume searching from the matching prefix.

The algorithm to generate the lookup table is easy and inutitive, as given below:

f(0) = 0
for(i = 1; i < n; i++)
{
    t = f(i-1)
    while(t > 0 && b[i] != b[t])
        t = f(t-1)
    if(b[i] == b[t]){
        ++t
    f(i) = t
}
  • Here, we first set f(0)=0 since, no proper prefix is available.
  • Next, iterate over from to :
    • Set
    • While t>0 and char at doesn't match the char at position, set , which essentially means that we have problem matching and must consider a shorter prefix, which will be , until we find a match or t becomes 0.
    • If , add 1 to t
    • Set

The lookup table generation is as illustrated below:

KMP

Wait! I get it!!

In Approach #1, we reserved the original string and stored it as . We iterate over from to and check for . Pondering over this statement, had the been concatenated to , this statement is just finding the longest prefix that is equal to the suffix. Voila!

Algorithm

  • We use the KMP lookup table generation
  • Create as and use the string in the lookup-generation algorithm
    • The "#" in the middle is required, since without the #, the 2 strings could mix with each ther, producing wrong answer. For example, take the string . Had we not inserted "#" in the middle, the new string would be and the largest prefix size would be 7 corresponding to "aaaaaaa" which would be obviously wrong. Hence, a delimiter is required at the middle.
  • Return reversed string after the largest palindrome from beginning length(given by ) + original string

Complexity analysis

  • Time complexity: .

    • In every iteration of the inner while loop, decreases until it reaches 0 or until it matches. After that, it is incremented by one. Therefore, in the worst case, can only be decreased up to times and increased up to times.
    • Hence, the algorithm is linear with maximum iterations.
  • Space complexity: . Additional space for the reverse string and the concatenated string.


Analysis written by @abhinavbansal0.