Given an array of n distinct non-empty strings, you need to generate minimal possible abbreviations for every word following rules below.
Example:
Input: ["like", "god", "internal", "me", "internet", "interval", "intension", "face", "intrusion"] Output: ["l2e","god","internal","me","i6t","interval","inte4n","f2e","intr4n"]Note:
Intuition
Let's choose the shortest abbreviation for each word. Then, while we have duplicates, we'll increase the length of all duplicates.
Algorithm
For example, let's say we have "aabaaa", "aacaaa", "aacdaa"
, then we start with "a4a", "a4a", "a4a"
. Since these are duplicated, we lengthen them to "aa3a", "aa3a", "aa3a"
. They are still duplicated, so we lengthen them to "aab2a", "aac2a", "aac2a"
. The last two are still duplicated, so we lengthen them to "aacaaa", "aacdaa"
.
Throughout this process, we were tracking an index prefix[i]
which told us up to what index to take the prefix to. For example, prefix[i] = 2
means to take a prefix of word[0], word[1], word[2]
.
Complexity Analysis
Time Complexity: where is the number of characters across all words in the given array.
Space Complexity: .
Intuition and Algorithm
Two words are only eligible to have the same abbreviation if they have the same first letter, last letter, and length. Let's group each word based on these properties, and then sort out the conflicts.
In each group G
, if a word W
has a longest common prefix P
with any other word X
in G
, then our abbreviation must contain a prefix of more than |P|
characters. The longest common prefixes must occur with words adjacent to W
(in lexicographical order), so we can just sort G
and look at the adjacent words.
Complexity Analysis
Time Complexity: where is the number of characters across all words in the given array. The complexity is dominated by the sorting step.
Space Complexity: .
Intuition and Algorithm
As in Approach #1, let's group words based on length, first letter, and last letter, and discuss when words in a group do not share a longest common prefix.
Put the words of a group into a trie (prefix tree), and count at each node (representing some prefix P
) the number of words with prefix P
. If the count is 1, we know the prefix is unique.
Complexity Analysis
Time Complexity: where is the number of characters across all words in the given array.
Space Complexity: .
Analysis written by: @awice. Approach #1 inspired by @ckcz123.