My problem for Hackerrank Weekly Challenge #12 // Favorite sequence

You can find and solve the problem here:

Problem statement

In this problem you are given N sequences of distinct numbers. The task is to construct the shortest sequence S, such that all N sequences are (not necessarily continuous) subsequences of S. If there are many such sequences, then you have to construct the lexicographically smallest one. It is guaranteed that such a sequence exists.

Sequence A[1\ldots n] is lexicographically less than sequence B[1\ldots n] if and only if there exists 1 \le i \le n such that for all 1 \le j < i, A[j] = B[j] \ \ and\ \ A[i] < B[i].


Let's consider the following example first, there are 3 sequences:

1 3
2 3 4
6 5

From the first one, you know that in the resulting sequence 1 has to be before 3, while from the second one you know that 2 has to be before 3 and 3 has to be before 4 and. From the third one you know that 6 has to be before 5.

Based on these facts we can construct a graph of precedence. Let's call this graph G. Numbers presented in sequences are vertices in G and there is a directed edge (u, v) if and only if there is a sequence where v is a direct successor o u.

Out example graph G looks like this:





It is easy to see that the shortest valid sequence will be a topological sorting of vertices of G. A topological order is a linear ordering of vertices such that for every directed edge (u, v), u comes before v in the ordering. The reason for this is that any topological order forms a valid sequence, because it preserves all precedence relations from all input sequences and it is the shortest one, because any number is presented only once. We reduced the problem to finding the lexicographically smallest one topological order of G. If you want to read more about topological order, you can check this article.

A graph can have many topological orders, for example any of below orders is a valid topological order of our graph G, but there are even more valid ones:

2 1 3 4 6 5
1 2 3 4 6 5
6 1 2 5 3 4

The problem statement says that we have to return the lexicographically smallest one and in our case it is: 1 2 3 4 6 5.

While there are several methods of finding a topological order of a graph, we will use the following, because it is easy to modify it to construct the lexicographically smallest one.

Since there exists a topological order of vertices, there exists a vertex with in-degree 0. It is easy to see that if any vertex has in-degree greater than 0, topological order does not exist because is a directed cycle in the graph. We construct the desire order in a main loop:

While there exists a vertex in the graph:
    select a vertex v with in-degree 0, add it to the resulting order and remove it with all outgoing edges from the graph

You can prove that the above method returns a topological order. If we want the lexicographically smallest one, inside the loop we have to pick a vertex represented by a smallest number from vertices with in-degree 0.

In order to implement the above method, we can use a priority queue where vertices are ordered by in-degree and if two vertices have the same in-degree, then a number on these vertices decides (a vertex with smallest number on it has higher priority). While we are in the main loop, we pick the first vertex u from the queue, delete it from it, put in into the resulting order and decrease in-degree of every vertex v such that there is an edge (u, v) in a graph.

Time complexity and implementation details

The time complexity of the above method depends on the queue implementation. We can achieve O((V + E) \cdot \log V) where v is the number of vertices in the graph and E is the number of edges in it, if we implement the queue as a heap with pointers or a balanced binary tree. This is because we V times select the minimum from the queue and E times we decrease in-degrees of vertices in the queue which has at most V elements in it and each of these operations takes logarithmic time on a heap with pointers (you have to have pointers to elements in order to achieve this performance for a decrease operation, because you have to be able to find any element quickly) or a balanced binary tree. We can notice that V and E are at most N \cdot K.

  • jporcelli

    Hey man first off I want to say I have read your blog before I even knew you were a setter for this weeks challenge on HR, when I was researching a TC problem. Very helpful, Thanks a lot, your very talented! Im just getting into competitive coding and love it and hope to get better. Anyway.. I was wondering if you could tell me where my solution differs from yours because I am not sure, I was getting TLE for additional cases. My solution is just like you describe I think, you can find it here on gist,

    • pkacprzak

      Hi, your solutions looks good for what I see. One improvement that I noticed is that you are using a linked list to store a list of edges for a vertex. It would be better to use a non-pointer data structure here like a dynamic array, in order to store edges as a one segment in memory. This allows a better performance duo to caching and it eliminates long jumps over the memory. For example, I used vectors in my c++ solution.

      • jporcelli

        So do you think I should put forward an argument to raise the TL for java for those cases? Thanks for your help, much appreciated!

        • pkacprzak

          Maybe, but I think that it can be coded in Java to pass all the testcases. The ability to code a list of edges as a continuous memory segment can help you with future problems :)

          • jporcelli

            thanks. Nice problem, enjoyed it.

  • Paramdeep Singh

    Your explanations are really crisp. For this problem, could you also upload your code?

    • pkacprzak
      • mauricepatel37

        In your solution, when will be the condition on line 89 executed:-
        " if(deg[n2] == -1) deg[n2] = 0;"

        If we consider 2 cases.
        1) where the size of each k is 1. Then we will never enter the second loop and deg[n2] can be some garbage value since n2 is not defined!

        2) If we are entering in the loop (i,k-1) then we are already checking if(deg[n2] == -1)... so the condition outside the loop can never occur.

        So why are we writing it on line 89? Should there be deg[n1] instead?

        • pkacprzak

          I've taken a look at the solution. You're right, this case never occurs. I wrote this code over a year ago, and I don't remember exactly why I put this line there, now it looks to me like some kind of a unnecessary safety check :D Definitely n1 shouldn't be there instead. You can assume that this line doesn't exist ;)

  • anair17

    Very cool

  • Sameer Agarwal

    if any vertex has in-degree greater than 0, topological order does not exist

    0 should be 1 or i am wrong?

    • pkacprzak

      For a connected graph, a topological order does not exist if and only if EVERY vertex has in-degree greater than 0

  • Saras_Arya

    Hey i was trying to solve the ques but i got stuck at this one particular test case
    68 7 18
    67 46 11 99 48 69 57 19 45 95

    output of this is given.67 46 11 68 7 18 99 48 69 57 19 45 95 How is this? Can you explain. ? i mean 68 7 and 18 could have been anywhere. in the start or at the end. Why at that particular location. Why after 11. How is it lexicographical correct?

    • pkacprzak

      Hi, let's try to make it clear.

      67 has to go first, because it is smaller than 68. After that you can choose 68 or 46 and the correct one is 46, because it is smaller. Next you can choose 68 and 11, 11 is smaller so it goes first. Then 68 or 99 and here 68 is smaller so it goes first. If you continue this process you will get the correct order which is also the smallest lexicographical one. Does it make it clear?

      • Saras_Arya

        Now that is one crystal clear explanation. Thanx @pkacprzak.

        • pkacprzak

          You're welcome.