Courtesy of SQL Server Pro. You can find the original article here.

My dad, Gabriel Ben-Gan, passed away recently. He loved numbers, logic and puzzles, and used to solve problems in his own unique way. This article is about a puzzle that incorporates the above ingredients. Dad, this one’s for you, and is in your memory.

I’d like to thank David Shipley, who originally sent me this beautiful puzzle. The problem is pretty simple, but the solution requires some creativity.

As part of the solution to the puzzle I use an auxiliary table called dbo.Nums, which contains a sequence of integers starting with 1 in a column called n. You can find this auxiliary table in the sample database TSQLV3. You can download the source code to create this sample database here.

**The Puzzle**

The puzzle is about coping with typographical errors (typos) when a keyword is entered for search purposes.

You get as inputs three strings:

- A keyword that the user entered for search (@keyword)
- A source word (@src)
- A synonym for the source word (@synonym)

Here’s an example for possible input values:

DECLARE @keyword AS NVARCHAR(1000) = N'wwAMxxAMyyAMzz', @src AS NVARCHAR(1000) = N'AM', @synonym AS NVARCHAR(1000) = N'AN';

Your task is to generate all possible permutations of the input @keyword where each occurrence of @src is either replaced with @synonym or not. For example, for the above inputs the desired output is:

wwAMxxAMyyAMzz wwANxxAMyyAMzz wwAMxxANyyAMzz wwANxxANyyAMzz wwAMxxAMyyANzz wwANxxAMyyANzz wwAMxxANyyANzz wwANxxANyyANzz

To work…

**Step 1: Generate permutations**

The solution to this puzzle can be broken into five main steps.

The first step is to return a sequence of integers in the range 0 through num_permutations – 1, where num_permutations represents the number of permutations of @keyword that need to be created. The num_permutations value can be computed as 2^num_ocurrences, where num_occurrences is the number of occurrences of @src in @keyword. With our sample input values, @src is AM and @keyword is wwAMxxAMyyAMzz, so num_ocurrences = 3, and therefore num_permutations = 8.

If you’re wondering, “What’s the logic behind computing the number of permutations?” think of the template for the permutations as having placeholders for all occurrences of @src in @keyword: ww?xx?yy?zz. Each placeholder is like a bit in an integer bitmap. When the bit is not set, you replace the placeholder with @src; when it’s set you replace it with @synonym. With three bits, the possible integers fall into the range 0 through 7 (2^3 – 1). The bitmaps of the integers in this range represent the permutations of the aforementioned template that you need to create.

The following expression computes num_ocurrences:

(LEN(@keyword) - LEN(REPLACE(@keyword, @src, N''))) / LEN(@src)

The following query generates the integers whose bitmaps represent the permutations that you need to create:

SELECT n-1 AS permutation FROM TSQLV3.dbo.Nums WHERE n <= POWER(2, (LEN(@keyword) - LEN(REPLACE(@keyword, @src, N''))) / LEN(@src));

Table 1 shows the output of this query with the bitmap representation of the integers.

**Table 1: Permutations**

permutation binperm ----------- ------- 0 000 1 001 2 010 3 011 4 100 5 101 6 110 7 111

**Step 2: Compute injection positions**

Call the result of the first step P. The second step creates a row for every combination of permutation in P and number n in the range 1 through num_ocurrences. Each occurrence of a @src element in @keyword may be preceded by some non-@src element and may be followed by some non-@src element. So, for every n in the range 1 through num_ocurrences we can think of the positions of the @src elements among the @src and non-@src elements as being 2 * n. For example, in our input @keyword we have the following elements:

1 - ww 2 - AM 3 - xx 4 - AM 5 - yy 6 - AM 7 - zz

As you can see, the three occurrences of @src are in element positions 2, 4 and 6. If a non-@src element doesn’t exist before or after a @src element, you can think of it as existing but being an empty string.

As mentioned, the second step of our solution generates a row for every permutation in P and occurrence n of @src in @keyword. You compute the element position of the replacement element (call it pos) as 2 * n. Then depending on the state of the nth bit in permutation, the replacement element is either @src (bit isn’t set) or @synonym (bit is set). The code in Listing 2 implements the second step of the solution.

**Listing 1: Step 2 – Compute injection positions**

WITH P AS ( SELECT n-1 AS permutation FROM TSQLV3.dbo.Nums WHERE n <= POWER(2, (LEN(@keyword) - LEN(REPLACE(@keyword, @src, N''))) / LEN(@src)) ) SELECT * FROM P CROSS APPLY ( SELECT N.n, N.n * 2 AS pos, -- * 2 to account for a preceeding element CASE permutation & POWER(2, N.n - 1) WHEN 0 THEN @src ELSE @synonym END AS element FROM TSQLV3.dbo.Nums AS N WHERE N.n <= (LEN(@keyword) - LEN(REPLACE(@keyword, @src, N''))) / LEN(@src) ) AS A ORDER BY permutation, pos, element;

The code in Listing 1 defines a CTE called P based on the query that implements the first step in the solution. The outer query against P uses the CROSS APPLY operator to create for every permutation and occurrence of @src in @keyword the replacement elements. The APPLY operator applies a query against Nums (call it N). As mentioned, for occurrence N.n, the target element position (pos) is N.n * 2. To know whether to use @src or @sysnonym as the target element, you need to check whether the nth bit is set. You do so with the following expression:

permutation & POWER(2, N.n - 1)

You then use a CASE expression to return either @src or @synonym based on the result of this expression.

The output of the code in Listing 2 with our sample input values is shown in Table 2 in abbreviate form.

**Table 2: Injection positions**

permutation binperm n pos element ----------- ------- --- ----------- ------- 0 000 1 2 AM 0 000 2 4 AM 0 000 3 6 AM 1 001 1 2 AN 1 001 2 4 AM 1 001 3 6 AM 2 010 1 2 AM 2 010 2 4 AN 2 010 3 6 AM ...

Again, I added a column called binperm that shows the bitmap representation of the permutation value.

**Step 3: Split original elements**

The third step creates the set of elements that precede or follow the @src elements in @keyword. It also computes corresponding element positions, leaving room for the replacement elements to be injected. To achieve this you use classic string-splitting logic based on an auxiliary table of numbers (you can find coverage of such string splitting solutions here and here). In this case, you split the string stored in @keyword, using @src as the separator. The following code implements the third step in the solution:

SELECT (ROW_NUMBER() OVER(ORDER BY n) - 1) * 2 + 1 AS pos, SUBSTRING(@keyword, n, CHARINDEX(@src, @keyword + @src, n) - n) AS element FROM TSQLV3.dbo.Nums WHERE n <= LEN(@keyword) + 1 AND SUBSTRING(@src + @keyword, n, LEN(@src)) = @src;

This code, applied to our sample input values, generates the output shown in Table 3.

**Table 3: Split string**

pos element -------------------- -------- 1 ww 3 xx 5 yy 7 zz

**Step 4: Unifies original elements and elements to inject**

The fourth step simply combines the result of step 2 (replacement elements to inject) with the result of step 3 (original elements). The code in Listing 4 implements this step.

**Listing 2: Step 4 – Combine steps 2 and 3**

WITH P AS -- Permutations ( SELECT n-1 AS permutation FROM TSQLV3.dbo.Nums WHERE n <= POWER(2, (LEN(@keyword) - LEN(REPLACE(@keyword, @src, N''))) / LEN(@src)) ), E AS -- Orignal split elements minus @src occurrences ( SELECT (ROW_NUMBER() OVER(ORDER BY n) - 1) * 2 + 1 AS pos, SUBSTRING(@keyword, n, CHARINDEX(@src, @keyword + @src, n) - n) AS element FROM TSQLV3.dbo.Nums WHERE n <= LEN(@keyword) + 1 AND SUBSTRING(@src + @keyword, n, LEN(@src)) = @src ) SELECT * FROM P CROSS APPLY ( -- Elements to inject SELECT N.n * 2 AS pos, CASE permutation & POWER(2, N.n - 1) WHEN 0 THEN @src ELSE @synonym END AS element FROM TSQLV3.dbo.Nums AS N WHERE N.n <= (LEN(@keyword) - LEN(REPLACE(@keyword, @src, N''))) / LEN(@src) UNION ALL -- Original elements SELECT pos, element FROM E) AS A ORDER BY permutation, pos, element;

Like before, the CTE P represents the bitmaps for the permutations of @keyword that need to be created. The CTE E represents the original split elements (the query implementing step 3). The outer query uses a CROSS APPLY operator that applies logic to generate the elements to inject unified with the original elements from E. The output of the code in Listing 2, applied to our sample input values is shown in Table 4.

** Table 4: Unified original elements and elements to inject**

permutation pos element ----------- -------------------- ------- 0 1 ww 0 2 AM 0 3 xx 0 4 AM 0 5 yy 0 6 AM 0 7 zz 1 1 ww 1 2 AN 1 3 xx 1 4 AM 1 5 yy 1 6 AM 1 7 zz 2 1 ww 2 2 AM 2 3 xx 2 4 AN 2 5 yy 2 6 AM 2 7 zz ...

**Step 5: Concatenate elements**

The fifth and final step concatenates the elements of each permutation to one string based on pos ordering. I use the classic FOR XML PATH technique to concatenate the elements. Listing 3 has the complete solution for the puzzle.

**Listing 3: Complete solution**

WITH P AS -- Permutations ( SELECT n-1 AS permutation FROM TSQLV3.dbo.Nums WHERE n <= POWER(2, (LEN(@keyword) - LEN(REPLACE(@keyword, @src, N''))) / LEN(@src)) ), E AS -- Orignal split elements minus @src occurrences ( SELECT (ROW_NUMBER() OVER(ORDER BY n) - 1) * 2 + 1 AS pos, SUBSTRING(@keyword, n, CHARINDEX(@src, @keyword + @src, n) - n) AS element FROM TSQLV3.dbo.Nums WHERE n <= LEN(@keyword) + 1 AND SUBSTRING(@src + @keyword, n, LEN(@src)) = @src ) SELECT result FROM P CROSS APPLY ( SELECT element AS [text()] FROM ( SELECT N.n * 2 AS pos, CASE permutation & POWER(2, N.n - 1) WHEN 0 THEN @src ELSE @synonym END AS element FROM TSQLV3.dbo.Nums AS N WHERE N.n <= (LEN(@keyword) - LEN(REPLACE(@keyword, @src, N''))) / LEN(@src) UNION ALL SELECT pos, element FROM E ) AS D ORDER BY pos FOR XML PATH('') ) AS A(result);

Table 5 shows the output of the solution for the sample input values.

**Table 5: Desired result**

wwAMxxAMyyAMzz wwANxxAMyyAMzz wwAMxxANyyAMzz wwANxxANyyAMzz wwAMxxAMyyANzz wwANxxAMyyANzz wwAMxxANyyANzz wwANxxANyyANzz

**Conclusion**

I find the string replacement puzzle that I presented in this article to be a beautiful puzzle. It requires you to apply your knowledge of some classic T-SQL techniques such as the ones used for string splitting and string concatenation, and also to incorporate some new ideas. It involves numbers, logic, bitwise manipulation and love. I can only hope to face many more puzzles like this in the future.