320. Generalized Abbreviation


Write a function to generate the generalized abbreviations of a word.

Example:

Given word = "word", return the following list (order does not matter):

["word", "1ord", "w1rd", "wo1d", "wor1", "2rd", "w2d", "wo2", "1o1d", "1or1", "w1r1", "1o2", "2r1", "3d", "w3", "4"]


Summary

This article is for intermediate readers. It introduces the following ideas: Backtracking and Bit Manipulation

Solution


Approach #1 (Backtracking) [Accepted]

Intuition

How many abbreviations are there for a word of length ? The answer is because each character can either be abbreviated or not, resulting in different abbreviations.

Algorithm

The backtracking algorithm enumerates a set of partial candidates that, in principle, could be completed in several choices to give all the possible solutions to the problem. The completion is done incrementally, by extending the candidate in many steps. Abstractly, the partial candidates can be seen as nodes of a tree, the potential search tree. Each partial candidate is the parent of the candidates that derives from it by an extension step; the leaves of the tree are the partial candidates that cannot be extended any further.

In our problem, the partial candidates are incomplete abbreviations that can be extended by one of the two choices:

  1. keep the next character;
  2. abbreviate the next character.

We extend the potential candidate in a depth-first manner. We backtrack when we reach a leaf node in the search tree. All the leaves in the search tree are valid abbreviations and shall be put into a shared list which will be returned at the end.

Complexity Analysis

  • Time complexity : . For each call to backtrack, it either returns without branching, or it branches into two recursive calls. All these recursive calls form a complete binary recursion tree with leaves and inner nodes. For each leaf node, it needs time for converting builder to String; for each internal node, it needs only constant time. Thus, the total time complexity is dominated by the leaves. In total that is .

  • Space complexity : . If the return list doesn't count, we only need auxiliary space to store the characters in StringBuilder and the space used by system stack. In a recursive program, the space of system stack is linear to the maximum recursion depth which is in our problem.


Approach #2 (Bit Manipulation) [Accepted]

Intuition

If we use to represent a character that is not abbreviated and to represent one that is. Then each abbreviation is mapped to an bit binary number and vice versa.

Algorithm

To generate all the abbreviation with non-repetition and non-omission, we need to follow rules. In approach #1, the rules are coded in the backtracking process. Here we introduce another way.

From the intuition section, each abbreviation has a one to one relationship to a bit binary number . We can use these numbers as blueprints to build the corresponding abbreviations.

For example:

Given word = "word" and x = 0b0011

Which means 'w' and 'o' are kept, 'r' and 'd' are abbreviated. Therefore, the result is "wo2".

Thus, for a number , we just need to scan it bit by bit as if it is an array so that we know which character should be kept and which should be abbreviated.

To scan a number bit by bit, one could extract its last bit by b = x & 1 and shift one bit to the right, i.e. x >>= 1. Doing this repeatedly, one will get all the bits of from last bit to first bit.

Complexity Analysis

  • Time complexity : . Building one abbreviation from the number , we need scan all the bits. Besides the StringBuilder::toString function is also linear. Thus, to generate all the , it costs time.

  • Space complexity : . If the return list doesn't count, we only need auxiliary space to store the characters in StringBuilder.