Notes for High Speed Internet at the Library problem
Notes for the High Speed Internet at the library problem.
Material Developed by: Atri Rudra
Part (a)
For sake of completeness, we repeat the problem statement:
The Problem
Citizens of SomePlaceInUSA
realize that the current problem as stated above could be monopolized by a citizen who (perhaps legitimately) can get a huge worth by using the computer terminal for the entire day. They come to you to help them solve the following updated version of the problem:
- The input to your algorithm is now the set of triples $\{(s_i,f_i,w_i)|i\in [n]\}$ along with an integer threshold $\tau\ge 0$. Your algorithm should
output the maximum worth among all
valid
subset $S\subseteq [n]$ with $|S|\ge \tau$. and output one with the maximum worth among all suchvalid
subsets (of size at least $\tau$). Note that the optimal worth can be $0$ (if there is novalid
subset of size at least $\tau$). -
Input:
$2;[(1,4,10),(5,10,20),(1,10,100)]$.
Output:
$30$ (for valid subset $\{1,2\}$). -
Input:
$3;[(1,4,10),(5,10,20),(1,10,100)]$.
Output:
$0$ (since there is no valid subset of size $3$).
Sample Input/Output
Here is a sample input/output pair (the input is stated as $\tau;[(s_1,f_1,w_1),\dots,(s_n,f_n,w_n)]$):
Algorithm Idea
We first make the observation that if we have $\tau=0$, then we have to find the maximum worth among all valid
subsets. However, note that in this case we exactly have the weighted interval scheduling problem, where the value of the $i$'th interval $[s_i,f_i]$ is $w_i$.
So the main idea is to extend the notion of sub-problems from the weighted interval scheduling problem to handle the extra constraint of the lower bound on the size of the subset. In particular let $OPT(j,t)$ denote the maximum worth among all valid
subsets of size at least $t$ for the sub-problem where we are only given $(s_1,f_1,w_1),\dots,(s_j,t_j,w_j)$ as the input (i.e. we only consider the first $j$ citizens). All we now need to do is to determine the recurrence relation and note that the algorithm needs to output $OPT(n,\tau)$.
We start off with the base case. Note that if $t> j$, then there is no possible valid
solution of size $t$ and hence we set $OPT(j,t)=-\infty$. Also trivially $OPT(0,0)=0$.
Let us now consider the general case with $t\le j$. We now have two choices:
- Case 1: $j$ is not in any optimal
valid
subset. In this case, we have that $OPT(j,t)=OPT(j-1,t)$. The argument is similar to what we had for similar case for the weighted interval scheduling: for sake of contradiction assume that $OPT(j,t) > OPT(j-1,t)$ and since any optimal solution achieving this worth does not have $j$ in it, such a solution would also be a valid solution for the input with the fist $j-1$ citizens (and same threshold of $t$), which would contradict the definition of $OPT(j-1,t)$. - Case 2: $j$ is in some optimal
valid
subset (let's call this optimal subset $\mathcal{O}_j$). Then the claim is that $w(\mathcal{O}_j\setminus \{j\})=OPT(p(j),t-1)$, where $p(j)$ is defined exactly as it is for the weighted interval scheduling problem. The argument again follows from a similar argument that we used in this case for the weighted interval scheduling problem. In particular, we claim that $\mathcal{O}_j\setminus \{j\}$ is also an optimal solution for the input on the first $p(j)$ citizens and a threshold of $t-1$ (if not, there exists a solution $\mathcal{O}'$ that has worth $> w(\mathcal{O}_j\setminus \{j\}$ is better for this smaller problem instance. In such a case, note that $\mathcal{O}' \cup \{j\}$ is a valid solution for the problem on the first $j$ citizens and threshold $t$ and has value $> w_j + w(\mathcal{O}_j\setminus \{j\} = w(\mathcal{O}_j)$, which contradicts the fact that $\mathcal{O}_j$ is optimal for the $(j,t)$ sub-problem.) Thus, we have $OPT(j,t)=w_j+OPT(p(j),t-1)$ in this case.
So overall the recurrence for the optimal worth is given by (for $0\le j,t\le n$): \[OPT(j,t)=\begin{cases} -\infty & \text{ if } j< t \\ 0 & \text{ if } j=t=0\\ \max\left\{ OPT(j-1,t), w_j+OPT(p(j),t-1)\right\} & \text{otherwise} \end{cases}.\] To finish off the algorithm idea note that we can compute all the $OPT(\cdot,\cdot)$ values if we compute them increasing order of $j$ (for each fixed $j$, the order of $t$ does not matter). Finally, we just output $\max(0,OPT(n,\tau))$ (we do the $\max$ to make sure if the value of $OPT(n,\tau)=-\infty$, then we make it $0$ as required by the problem statement).
Algorithm Details
Even this is not needed in your solution, we present the algorithm details below:
# Input: Integers t, n and triples (s1, f1, w1), ..., (sn, fn, wn)
# Also assume we know p(j) for every j = 0 .. n
if t > n
return 0
Allocate an (n+1) X (n+1) array M #M[j][k] will ultimately store OPT(j,k)
M[0][0]=0
for every k = 1 .. n
M[0][k] = -infty
for every j = 1 .. n
for every k = 0 .. n
if j < k
M[j][k] = -infty
else
M[j][k] = max ( M[j-1][k], wj + M[p(j)][k-1]
return max(M[n][t],0)
We are almost done with part (a) except for the following, which is left as an exercise for you:
Exercise 2
Argue that the above algorithm runs in $O(n^2)$ time.
Part (b)
The Problem
Citizens of SomePlaceInUSA
realize that the current problem as stated above might not be representative of their town. In particular, they would like the valid
subset to be representative of the economic diversity of their town. In particular, now they want your solution to meet some more requirements:
-
The input to your algorithm is now the set of $4$-tuples $\{(s_i,f_i,w_i,e_i)|i\in [n]\}$ (where $s_i,f_i,w_i$ are as before and $e_i\in \{1,2,3\}$ being an identifier for the range of their household income) along with there real numbers $0\le \rho_1,\rho_2,\rho_3\le 1$ such that $\rho_1+\rho_2+\rho_3\le 1$. Your algorithm should output the maximum worth among all
valid
subset $S\subseteq [n]$ with the following minimum requirement of the economic diversity-- for every $j\in [3]$, we must have \[\left|\{i\in S|e_i=j\}\right|\ge \rho_j\cdot |S|.\] Note that the optimal worth can be $0$ (if there is novalid
subset that can meet the economic diversity constraint). -
Input:
$\left(\frac 13, \frac 12, 0\right);[(1,4,10,1),(5,10,20,2),(1,10,100,3)]$.
Output:
$30$ (for the valid subset $\{1,2\}$). -
Input:
$\left(\frac 13, \frac 13, \frac13\right);[(1,4,10,1),(5,10,20,2),(1,10,100,3)]$.
Output:
$0$ ($\emptyset$ is the only valid subset that satisfies the economic diversity constraint).
Sample Input/Output
Here is a sample input/output pair (the input is stated as $(\rho_1,\rho_2,\rho_3);[(s_1,f_1,w_1,e_1),\dots,(s_n,f_n,w_n,e_n)]$):
We do not have much to say for part (b) except to say:
Hint
Try to extend the dynamic program of part (a) to work for part (b). It might be useful to think of each sub-problem being indexed by a $5$-tuple (where each entry in the tuple can have $O(n)$ possibilities).