My 3rd Problem for HackerRank // Towers

Problem statement

You're given infinitely many bricks of at most 15 different heights (other dimensions are not relevant here). You're also given a big integer n. The task is to count the time needed to build every possible tower of height N from the given bricks. Bricks are stacked vertically only and stand in an upright position. Two towers are different if their brick height sequences are different and building one tower takes exactly 2 time units regardless of its height.

Here is the full problem statement.


In the editorial I omit the fact that we have to count the time needed to build all towers - we will count here just the number of different towers and to get the result, you have to multiply this number by 2.

Simpler problem first

There is a restriction, that every bricks is of height at most 15 and N is at most 10^{18}. Lets first try to solve a simpler problem and assume that there are only bricks of height 1 or 2 and N \leq 10^6.

Let f_n be the number of different towers of height n which can be build of given bricks. In our simpler problem we have:

f_1 = 1, because there is only one tower of height 1 consisting of one bricks of height 1

f_2 = 2, because there are exactly two towers of height 2 - one consisting of two bricks of height 1 and the other consisting of one brick of height 2

We can now give a recursive definition for f_n when n > 2:

f_n = f_{n - 1} + f_{n - 2}

because we can get one unique tower of height n by placing a brick of height 1 on any tower of height n - 1 and one unique tower of height also n by placing a brick of height 2 on any tower of height n - 2.

If we compute f_n bottom-up (like in dynamic programming) rather than using recursion explicitly, we can do it in O(n) time which is fine if n is not as big as in the problem statement.

Removing heights restriction

Lets remove the first restriction, so we have to deal now with bricks of at most 15 different heights (from 1 to 15).

Let h_i indicates whether we have a brick of height i:

 h_i = \left\{ \begin{array}{l l} 1 & \quad \text{if we have bricks of height $i$}\\0 & \quad \text{if we don't have bricks of height $i$} \end{array} \right.

then the general recursion equation is:

f_n = 0 for n < 0

f_0 = 1

f_n = h_1 \cdot f_{n - 1} + h_2 \cdot f_{n - 2} + h_3 \cdot f_{n - 3} + \ldots + h_15 \cdot f_{n - 15} for n > 0

Solving the original problem

The last thing that we have do deal with is the really big n - up to 10^{18}. We will do it defining the recursive equation as a matrix multiplication problem and solve it using fast matrix exponentiation.

Lets assume that we've computed f_n for n \leq 15 using dynamic programming and let M be a 15 \times 15 matrix defined as follows:

M = \begin{bmatrix}0&1&0&0&0&0&0&0&0&0&0&0&0&0&0\\0&0&1&0&0&0&0&0&0&0&0&0&0&0&0\\0&0&0&1&0&0&0&0&0&0&0&0&0&0&0\\0&0&0&0&1&0&0&0&0&0&0&0&0&0&0\\0&0&0&0&0&1&0&0&0&0&0&0&0&0&0\\0&0&0&0&0&0&1&0&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&1&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&1&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&0&1&0&0&0&0&0\\0&0&0&0&0&0&0&0&0&0&1&0&0&0&0\\0&0&0&0&0&0&0&0&0&0&0&1&0&0&0\\0&0&0&0&0&0&0&0&0&0&0&0&1&0&0\\0&0&0&0&0&0&0&0&0&0&0&0&0&1&0\\0&0&0&0&0&0&0&0&0&0&0&0&0&0&1\\h_{15}&h_{14}&h_{13}&h_{12}&h_{11}&h_{10}&h_9&h_8&h_7&h_6&h_5&h_4&h_3&h_2&h_1\end{bmatrix}

Let V be the following vector:


if we multiply M \cdot V we get:


and we can return f_{16} from here, so in order to compute f_n for n > 15 we can compute:

R = \underbrace{(M \cdot (M \cdot \dots \cdot (M}_{n - 15} \cdot V)))

which can be written in the following form, because matrix multiplication is associative:

R = M^{(n-15)} \cdot V

and we can compute n-th power of a matrix of size m \times m it O(m^3 \cdot \log_2 {n}) using fast exponentiation by squaring and in our problem we have m = 15 and n \leq 10^{18} which is perfectly fine in terms of time limits. One speedup here can be to do exponentiation iteratively rather than recursively

  • Altaf Hussain

    nice explanation . can u share your code with us for more clarification and implementation . and i want to know how what i have to learnt to solve such types of problems , any dedicated algorithm or any other material or link if u suggest . i am an electronics engineer but i am interested in learning programming . so any help from your side would be appreciable . Thanks in advance

  • Azharuddin Ahmed

    nice one bro

    • pkacprzak

      Thanks :)

  • Ryan

    Funny I thought this was an integer composition problem. Oh well!

    • pkacprzak

      Actually you are right! You can think of this one as a problem of counting in how many ways you can represent an integer N using a sum of at most 15 different positive numbers where each number can be used as many times as needed and the order of numbers matters.

      • Ryan

        I have some gaps in my knowledge about vectors and matrix multiplication (last time using them was in college about 14 years ago) ... Questions:

        1. What does the T superscript mean with the vector V?
        2. Multiplication with h1 and fn-1 is regular multiplication and not dot product right?
        3. Is matrix multiplication a common approach for recursive solutions?

  • anonymousnotorious

    can you please elaborate on the h1*f(n-1) + h2*f(n-2) + .... + h15*f(n-15) part

    • pkacprzak

      Sure, h_i indicates if we have a brick of height i, so it has a value of 1 or 0. Then, as described in the editorial, in order to build a tower of height n, you can extend any tower of a height n - i to it, if and only if you have a brick of height i. Does it make it more clear?

      • anonymousnotorious

        Thank you! Sorry for the late reply. So, the runtime of this program is O(m^3logn) where m is 15 and n is the bound given in the question?

        • pkacprzak

          Yes, you're right. Please refer to the above comment for more detailed running time measure.

      • anonymousnotorious

        could you use strassens multiplication to speed it up further?

        • pkacprzak

          You can, but in most cases, it'd be only theoretical speed up. Actually, if f(m) is the running time of matrix multiplication algorithm you decide to use for matrices of size m x m, then the total running time of this approach is O(f(m) * log(n)).

          • anonymousnotorious

            why f(m) * log(n) why not just f(m) ? the matrix multiplication takes O(m^3logn) without strassens correct?

          • pkacprzak

            The solution is based on matrix exponentiation. We use fast exponentiation algorithm, which calls matrix multiplication O(log(n)) times when we want to raise the matrix to the n-th power. Since a single multiplication takes O(m^3) if we use the basic method, the total running time is O(m^3 * log(n)).

          • anonymousnotorious

            Okay. Thanks :)

  • R Prakash

    Solution is awesome.
    But, How did you get the idea for the matrix - Is it part of some class of problems ?
    I can see that DP and fast exponentiation is intuitive. But I can't really imagine where I'd want to use the matrix approach. Can you point me to some theoretical resource or principle which gave you this idea ?

    • pkacprzak

      Actually, this is a very standard method for solving linear recurrences. It is a nice speed up over the most intuitive O(n) method. For online resources, try searching for "matrix exponentiation recurrence" and you should get what you want.

      • R Prakash

        Thanks, that was really useful. Should try some more DP problem to get used to it.

  • Alex Totheroh

    Can you explain how you constructed the initial matrix M?

    • pkacprzak

      Sure, the idea is pretty simple.

      Let's assume that we have a smaller recursive relation like for example:

      f(n) = 3 * f(n - 1) + f(n - 2).

      The idea to construct the matrix is that the matrix multiplied by a vector of already known values [f(n-2), f(n-1)] will produce a vector of values [f(n-1), f(n)], so a vector "shifted" by one. To follow this idea, in the last row of the matrix we put coefficients of the recursive relation, in this case 1 and 3, in order to get the value f(n) when we multiply the last row of the matrix by the vector of already known values. In any other row, in most cases, we put just one 1 and we set the other cells to 0 in order to "copy" the value from the previous vector of known values to the new one. For example if we have a vector [f(n-2), f(n-1)], then we have to "shift" the value f(n-1) to the first place, and we want a new value f(n) to be placed as the second element, so in our case the matrix is:

      | 0 1 |
      | 1 3 |

      and if we multiply it by [f(n-2), f(n-1)] (as a column vector), we get:
      [0 * f(n - 2) + 1 * f(n - 1), 1 * f(n - 2) + 3 * f(n - 1)] = [f(n - 1), f(n)]

      Let me know if it's more clear now.