Tag Archives: priority-queue

My problem for Hackerrank Weekly Challenge #12 // Favorite sequence

You can find and solve the problem here: https://www.hackerrank.com/contests/w12/challenges/favourite-sequence

Problem statement

In this problem you are given N sequences of distinct numbers. The task is to construct the shortest sequence S, such that all N sequences are (not necessarily continuous) subsequences of S. If there are many such sequences, then you have to construct the lexicographically smallest one. It is guaranteed that such a sequence exists.

Sequence A[1\ldots n] is lexicographically less than sequence B[1\ldots n] if and only if there exists 1 \le i \le n such that for all 1 \le j < i, A[j] = B[j] \ \ and\ \ A[i] < B[i].

Solution

Let's consider the following example first, there are 3 sequences:

1 3
2 3 4
6 5

From the first one, you know that in the resulting sequence 1 has to be before 3, while from the second one you know that 2 has to be before 3 and 3 has to be before 4 and. From the third one you know that 6 has to be before 5.

Based on these facts we can construct a graph of precedence. Let's call this graph G. Numbers presented in sequences are vertices in G and there is a directed edge (u, v) if and only if there is a sequence where v is a direct successor o u.

Out example graph G looks like this:

1

34

2

65

It is easy to see that the shortest valid sequence will be a topological sorting of vertices of G. A topological order is a linear ordering of vertices such that for every directed edge (u, v), u comes before v in the ordering. The reason for this is that any topological order forms a valid sequence, because it preserves all precedence relations from all input sequences and it is the shortest one, because any number is presented only once. We reduced the problem to finding the lexicographically smallest one topological order of G. If you want to read more about topological order, you can check this article.

A graph can have many topological orders, for example any of below orders is a valid topological order of our graph G, but there are even more valid ones:

2 1 3 4 6 5
1 2 3 4 6 5
6 1 2 5 3 4

The problem statement says that we have to return the lexicographically smallest one and in our case it is: 1 2 3 4 6 5.

While there are several methods of finding a topological order of a graph, we will use the following, because it is easy to modify it to construct the lexicographically smallest one.

Since there exists a topological order of vertices, there exists a vertex with in-degree 0. It is easy to see that if any vertex has in-degree greater than 0, topological order does not exist because is a directed cycle in the graph. We construct the desire order in a main loop:

While there exists a vertex in the graph:
    select a vertex v with in-degree 0, add it to the resulting order and remove it with all outgoing edges from the graph

You can prove that the above method returns a topological order. If we want the lexicographically smallest one, inside the loop we have to pick a vertex represented by a smallest number from vertices with in-degree 0.

In order to implement the above method, we can use a priority queue where vertices are ordered by in-degree and if two vertices have the same in-degree, then a number on these vertices decides (a vertex with smallest number on it has higher priority). While we are in the main loop, we pick the first vertex u from the queue, delete it from it, put in into the resulting order and decrease in-degree of every vertex v such that there is an edge (u, v) in a graph.

Time complexity and implementation details

The time complexity of the above method depends on the queue implementation. We can achieve O((V + E) \cdot \log V) where v is the number of vertices in the graph and E is the number of edges in it, if we implement the queue as a heap with pointers or a balanced binary tree. This is because we V times select the minimum from the queue and E times we decrease in-degrees of vertices in the queue which has at most V elements in it and each of these operations takes logarithmic time on a heap with pointers (you have to have pointers to elements in order to achieve this performance for a decrease operation, because you have to be able to find any element quickly) or a balanced binary tree. We can notice that V and E are at most N \cdot K.

Order exercises // My problem in Functional Programming Contest - October'14

You can find the problem here: https://www.hackerrank.com/contests/lambda-calculi-oct14/challenges/order-exercises

Problem statement

You are given an array of n integers, a = \{a_1, a_2,\ldots, a_n\} along with the following definitions:

- a subarray is a contiguous segment of an array. For example a[l, r] = \{a_l, a_{l+1},\ldots, a_r\} is a subarray, where 1 \le l \le r \le n

- We say that a sum of a subarray is a sum of elements in this subarray

- We say that subarray X = \{a_{i}, a_{i+1},\ldots,a_{j}\} is greater than subarray Y = \{a_{k}, a_{k+1},\ldots,a_{l}\} if and only if one of the following statements is true:
- X has a greater sum than Y
- X and Y has the same sum and X begins earlier
- X and Y has the same sum, they start in the same place and the length of X is smaller than the length of Y

It is guaranteed that there is no 0 in the array a.

You are given also an integer k. The task is to print as many as possible, but not more than k, subarrays with a positive sum in the following order.

- The first subarray is the greatest subarray in the array according to above definition.

- The i^{th} subarray is the greatest subarray disjoint to any of the j^{th} subarray, where j < i (disjoint means that they have no common elements).

Solution

If you are familiar with the Maximum subarray problem you may notice that our problem is a natural extension of it.

Simple, but not enough solution

Probably the first solution which comes to mind is to run Kadane algorithm or similar to get the value of the greatest subarray, then assign - \infty to every element of that subarray, find the second greatest subarray and so on. We iterate that process while there exists a subarray with a positive sum.

The time complexity of this method is O(n \cdot k) which gives you some points, but it is too slow to pass all testcases even if it is hardly optimized.

Faster solution

The general pattern of above solution is good, but we have to find the next subarray faster. A segment tree can help us here.

The first thing here is to implement a segment tree which supports the following query in O(\log n) time:

query(i, j) returns the greatest subarray in \{a_i, a_{i+1},\ldots, a_j\}

Let T_v be a subtree represented by a node v and a_v be an array corresponding to elements covered by T_v. We can implement that query maintaining in every node v several values:

- sum of the greatest subarray in a_v
- greatest prefix sum of a_v
- greatest suffix sum of a_v
- sum of values of elements in a_v

You can compute these values of any v based on values in its children. Values for leaves are obvious. After that, if you are familiar with segment trees, you can implement the query(i, j) easily.

In order to do the next step we need a priority queue, let's call it Q. We store in the queue triplets \langle R, V, P\rangle where:

- R is a range in which we search for subarrays
- V is the value of the greatest subarray in R
- P represents indices of the greatest subarray in R

Ordering of elements in Q is determined by values V and P.

First we compute the value and indices of the greatest subarray in the whole array. We do it using query(1, n). Let's assume that it is a[i, j] and has a sum V. We put into the queue a triplet \langle (1, n), V, (i, j)\rangle.

While Q is not empty and we have found less than k subarrays so far, we select the greatest triplet from the queue and remove it from there. Let's assume that the triplet is \langle (k, l), V, (i, j)\rangle. We print V as a result and if a range (k, i - 1) is not empty, we compute query(k, i - 1) and if it returns a subarray with a positive sum S with indices (c, d), we put into the queue a triplet \langle (k, i - 1), S, (c, d)\rangle. We do the same thing with a range (j + 1, l).

Time complexity

Time needed to construct the tree is O(n \cdot \log n) and we do at most O(k) queries, so the total complexity is O(n \cdot \log n), because k \leq n.