My problem for Hackerrank Weekly Challenge #12 // Favorite sequence

You can find and solve the problem here: https://www.hackerrank.com/contests/w12/challenges/favourite-sequence

Problem statement

In this problem you are given $N$ sequences of distinct numbers. The task is to construct the shortest sequence S, such that all $N$ sequences are (not necessarily continuous) subsequences of $S$. If there are many such sequences, then you have to construct the lexicographically smallest one. It is guaranteed that such a sequence exists.

Sequence $A[1\ldots n]$ is lexicographically less than sequence $B[1\ldots n]$ if and only if there exists $1 \le i \le n$ such that for all $1 \le j < i, A[j] = B[j] \ \ and\ \ A[i] < B[i]$.

Solution

Let's consider the following example first, there are 3 sequences:

1 3
2 3 4
6 5

From the first one, you know that in the resulting sequence 1 has to be before 3, while from the second one you know that 2 has to be before 3 and 3 has to be before 4 and. From the third one you know that 6 has to be before 5.

Based on these facts we can construct a graph of precedence. Let's call this graph G. Numbers presented in sequences are vertices in G and there is a directed edge $(u, v)$ if and only if there is a sequence where $v$ is a direct successor o $u$.

Out example graph G looks like this:

1

34

2

65

It is easy to see that the shortest valid sequence will be a topological sorting of vertices of G. A topological order is a linear ordering of vertices such that for every directed edge $(u, v)$, $u$ comes before $v$ in the ordering. The reason for this is that any topological order forms a valid sequence, because it preserves all precedence relations from all input sequences and it is the shortest one, because any number is presented only once. We reduced the problem to finding the lexicographically smallest one topological order of G. If you want to read more about topological order, you can check this article.

A graph can have many topological orders, for example any of below orders is a valid topological order of our graph G, but there are even more valid ones:

2 1 3 4 6 5
1 2 3 4 6 5
6 1 2 5 3 4

The problem statement says that we have to return the lexicographically smallest one and in our case it is: 1 2 3 4 6 5.

While there are several methods of finding a topological order of a graph, we will use the following, because it is easy to modify it to construct the lexicographically smallest one.

Since there exists a topological order of vertices, there exists a vertex with in-degree 0. It is easy to see that if any vertex has in-degree greater than 0, topological order does not exist because is a directed cycle in the graph. We construct the desire order in a main loop:

While there exists a vertex in the graph:
select a vertex v with in-degree 0, add it to the resulting order and remove it with all outgoing edges from the graph


You can prove that the above method returns a topological order. If we want the lexicographically smallest one, inside the loop we have to pick a vertex represented by a smallest number from vertices with in-degree 0.

In order to implement the above method, we can use a priority queue where vertices are ordered by in-degree and if two vertices have the same in-degree, then a number on these vertices decides (a vertex with smallest number on it has higher priority). While we are in the main loop, we pick the first vertex $u$ from the queue, delete it from it, put in into the resulting order and decrease in-degree of every vertex $v$ such that there is an edge $(u, v)$ in a graph.

Time complexity and implementation details

The time complexity of the above method depends on the queue implementation. We can achieve $O((V + E) \cdot \log V)$ where $v$ is the number of vertices in the graph and $E$ is the number of edges in it, if we implement the queue as a heap with pointers or a balanced binary tree. This is because we $V$ times select the minimum from the queue and $E$ times we decrease in-degrees of vertices in the queue which has at most $V$ elements in it and each of these operations takes logarithmic time on a heap with pointers (you have to have pointers to elements in order to achieve this performance for a decrease operation, because you have to be able to find any element quickly) or a balanced binary tree. We can notice that $V$ and $E$ are at most $N \cdot K$.