Monthly Archives: May 2015

Almost Equal Strings // My problem in Code.cpp contest

I created the most difficult problem for May edition of Code.cpp on HackerRank. The full problem is available here. The problem got great attention, because many contestants were surprised how it is possible to "compare" two strings that fast.

The statement is quite straightforward:

We define two strings S and W as almost equal if and only if they have the same length and they differ on at most one position.

You are given a string S of length N and an integer Q. You have to perform Q queries on S. Each query consists of four integers, i, j, k, and l. For each such query, your task is to decide if the substrings S[i, j] and S[k, l] are almost equal.

While the problem is formulated very simple, the solution is not obvious, especially because there are many queries to handle - up to 10^5 and the input string can have at most 10^6 letters. I will sketch two completely different approaches here.

LCP Method

First of all, we construct suffix array for string S. While O(n \cdot \log n) algorithm is fine here, we can use a simple and famous linear one by Kärkkäinen and Sanders known as Skew algorithm

After that, we compute longest common prefix array abbreviated as LCP array, where LCP[i] is the longest common prefix of suffixes i and i-1 in the suffix array.

Using our LCP array and some range queries, we are able to compute the length of the longest common prefix for arbitrary suffixes of S, the sketch of this method is available here.

Being able to do that, let's consider any query i, j, k, l and two substrings S[i, j] and S[k, l]. Let assume that they have equal length K, because the other case is trivial. In order to answer, if these substrings are almost equal, first we compute the length of LCP of suffixes of S starting in i and k. Let this length to be L. If L \geq K, then we know that our two substrings are exactly equal. In the other case, we know that they differ on position L + 1, so we compute the length of the LCP of suffixes starting just after this different character and the answer to the query is positive if these suffixes have equal at least K - L - 1 characters, otherwise it is negative.

Time Complexity

If we use Skew algorithm and the fastest method of computing LCP of arbitrary strings, we can achieve O(n + q), but any algorithm running in O(n \cdot \log n + q \cdot \log q) is also fine.

Hashing method

The idea is to provide a method for deciding if S[i, j] = S[k, l] in constant time for any i \leq j and k \leq l

For now, let's assume that we can do that.

Then in order to decide if S[i, j] is almost equal to S[k, l], first we check if they have equal length. If they have, then we check if S[i, j] = S[k, l] if it is, they are exact the same strings. If not, we find the first letter on which they are different. We can do this using binary search and our method for equality of substrings on prefixes of both strings. After we find this first letter, let say it has index x inside these strings, we just need to compare S[i + x + 1, j] with S[k + x + 1, l] and we assumed that we can do this in O(1) time.

In order to do our magic comparison in constant time, first we define

H[i] = s[0] \cdot P^{i - 1} + s[1] \cdot P^{i - 2} + ... + s[i - 1] \cdot P^1 + s[i]

which is called polynomial hash of a string and we can compute H[i] for all 0 \leq i \leq n - 1 in O(n) using Horner's method i.e:

H[0] = s[0]

H[i] = P \cdot H[i - 1] + s[i] for i > 0

All these computations can be done modulo some big prime, but it is faster to use unsigned long long type here, especially for c++ which is nice in this contest, and we can avoid modulo computation.

The next step is to be able to check if S[i, j] = S[k, l] and we can do this by multiplications and subtractions of some values from H table, which can be done in constant time.

Of course this method is a probabilistic one and it can return false results, but we can either assume that it works for one H function, or compute more than one hashing function to increase its probability. It is worth to mention that using two functions rather that one squares the probability of an error. An example of using two functions can be computing both as H function given above, but for each one computing it modulo a different big prime number.

Time Complexity

Computing H table takes O(n) and binary search for one query takes O(\log n), so the overall complexity is O(n + q \cdot \log n).

Screen Shot 2015-05-29 at 3.16.22 PM

My problems in HackerRank Code.cpp May 2015

I am the author of two challenges in upcoming Code.cpp May 2015 by HackerRank and also tested all problems. I encourage anyone to participate, especially those of you who are new to competitive programming, since the contest will be simpler that usual, but still one of my problems is quite challenging. As a hint, I can tell you that there will be a few string problems ;)


Knight and Queen // HackerRank Lambda Calculi

It's been a while since my last post here. I've been busy working with my team on a brand new game for Android and iOS.

Anyway, I'm back with a very interesting problem. I was asked to test some problem for HackerRank Lambda Calculi Contest. The one which I want to write about is Knight and Queen. The original problem is available here:

Problem statement

Two players are about to play a game on a chessboard of size n x n, where the bottom-left cell has coordinates (0, 0). There are only 2 figures on the board: the Knight and the Queen. Both figures moves like in a classic chess game, but with a restriction that they cannot move right or up, so every legal move is left, down or left and down. For example, the Knight standing on a field (5, 4) can move to (4, 2) or (3, 3) but cannot move to (6, 6). Also note that both figures can be a the same cell in the same time.

Players takes moves alternately. Player who is about to make a move can pick the Knight or the Queen and make any valid move with in. A player who is not able to make a move is declared a looser.

Your task is to decide who is a winner knowing the initial positions of both figures and the fact that both players plays optimally.

You will be given many testcases, so the best you can do is to be able to provide the answer for any given initial positions fast.


If n is quite small, you can consider a dynamic programming solution where we have n^4 positions: n^2 possible positions of the Knight multiplied by n^2 possible positions of the Queen. For each such position, you can compute the best possible move in O(n) time, so a total time will be O(n^5), but since we have n = 50, this is way to much.

We need a faster solution.

Game of Nim

Let's consider a simpler game called Nim first. In the Nim game, there are N distinct heaps. Each heap has a positive number of objects. Players takes moves alternately. Player who is about to make a move, can choose any non-empty heap and remove any positive number of objects from it. A player who is not able to make a move is declared a looser and a player is not able to make a move if and only if all heaps are empty.

This is a very classic game, a fundamental one in game theory. There is a famous theorem which can be used to determine who has a winning position in Nim at the beginning of the game if both players plays optimally. If you haven't heard about it yet, you should be amazed at least a little bit.


Let a_1, a_2, ..., a_N be current sizes of heaps i.e. a_i is the number of objects in the i-th heap before the next move. Let X = a_1 xor a_2 xor ... xor a_N, where xor is a binary xor operation.

A player who is about to make a move has a winning position if and only if X != 0.

In other words, if X != 0 and you are about to make a move, you can win no matter what, and if X == 0 and you are about to make a move, no matter what will you do in that move, your opponent can always win.

Sounds great! Let's make a sketch of a proof.

Let X == 0 before the next move. There are two possibilities:

  1. All heaps are empty, so the current player is a looser
  2. There is at least one non-empty heap

Since the first situation is clear, let's consider the second one. The current player has to remove at least one object from a non-empty heap, so no matter what will he do, the xor of sizes of all heaps after his move will be non-zero.

On the other hand, if X != 0 before the next move, the player who is about to make a move can always choose a valid move leading to a situation that the xor of sizes of all heaps after this move will be 0.

If you want to more detailed explanation, check it here or do it on your own considering binary representation of heaps sizes.

Our problem

Let's think of our problem as of a special game of Nim where we have only two heaps, the first one for the Knight and the second one for the Queen and let's consider these two heaps separately. For each of these two heaps, we will assign a single number to any position of the figure corresponding on the board. This number will correspond to the number of objects in a game of Nim on a single heap.

Knight's heap

Let's consider any position of the Knight such that there is no valid move with this figure. Any such position is equivalent to an empty heap, so we will assign 0 to any such position. Next let's consider any other position P, and let M be a set of numbers assigned to all positions reachable from P. We will assign the smallest non-negative integer which is not in M to the position P.

Queen's heap

We will do the same as above with respect to Queen's valid moves.

After making these two above computations, we have assigned numbers to any valid position of each figure on a board. We can think of these numbers as of the number of objects in the corresponding heap in a Nim game - just note that in Nim if you have k objects on a heap before your move, you can reduce the number of objects in this heap to any non-negative number less than k. The same is true in our game. If you decided to move the Queen in your turn, and it is currently in a cell with number k assigned to it, you can move it to a different cell with any assigned non-negative number less than k.

So in order to solve our problem, just assign the numbers as described above to both Knight's and Queen's positions. Let X be the xor of assigned numbers corresponding to the initial position of both figures. The player who starts the game has a winning strategy if and only if X != 0.

Time complexity

For a board of size n x n, you have to compute two tables of size n x n and computing a single position in any of these tables takes at most O(n) time, so the total complexity is O(n^3) if you use memoization or dynamic programming and here we have n = 50, so it will be very fast.

Related problems

One interesting thing is that many simple games can be reduced to Nim. I encourage you to read about that here.