## techInterview Discussion |
||

A part of techInterview.org: answers to technical interview questions.
Your host: Michael Pryor |
This was asked in one of the tech interview of a reputed company.
Given an array of 'n' random integers. Given an integer 1<=n. Find the k numbers such that the minimum difference of all the possible pairs of k numbers is maximum (maximum among other minimum differences for various possible selections of k numbers )
From what I have understood from the problem is we have to find any k nos -a subset of n nos. in an array such that for any pair of 2 nos in this k element subset the minimum difference is maximum among all such subset of K elements.
Consider Array A[]=<1,12,6,4,13,9,7> Here n=7 Let K=4 then a brute force implementation that I could think of is: 1. Take a K element subset of the array(combination). Example <1,12,6,4>=<(1,12),(1,6),(1,4),(12,6),(12,4),(6,4)> So for a k element we have k+k-1+k-2+..+1 pairs. Find the minimum difference among all the pairs formed. 2. Repeat the steps above for all such combination of k elements. The maximum among them is the answer. So, problem effectively reduces to getting all combination of k elements from n elements.(Can be done by taking a initial bit vector as "0001111" for k=4. For a bit set get the corresponding element at position in the subarray and repeat with next greater no with k bit set again...) assuming getting the next bit sequence of k bits set is constant. complexity=theta(n2) and finding the minimum difference among all pairs of elements in a k element sub array(can be done simply in O(n2) ) So Total complexity=theta(n2) Any better solution to this?? I hope mine is right though inefficient.
I think that for a sorted array it can be done with at most one loop of k iterations. If the array is not sorted, add O(n log n) .
If S=(p1,p2,...) the set of all possible pairs that can be made with all n elements, sorted so that the difference between the elements decrease (p1 contains the two most distant elements, pm contains the two closest elements, where m is the last pair - incidentally m=n*(n-1)/2). A valid solution is to pick pairs p1,p2,... in order until k distinct elements are accumulated. Obviously the first pair to pick is the one made of the first and last element - these two elements are the most distant of all. The indexes for the next candidate is one of (0,n-2), (1,n-1) or (1, n-2) - basically we are moving from the extremes toward the centre of the array. The algorithm continues this way, until k elements are picked. If the elements are represented on the real axis, at step m we have the following picture: |--|-----|-----|-----|--- ... ---|------|------|-----------| 0 lL cL cH lH n-1 So far all the elements 0,lL (last low) and lH (last high) and n-1 were picked. The question is whether we pick next both cL and cH (candidate low and high) or just one of them. All the elements between cL and cH will produce a smaller difference than a[cH]-a[cL], a[xH]-a[cL] or a[cH]-a[xL] where xH>cH and xL<cL are any of the already picked elements. Let dC = a[cH]-a[cL], dH = a[n-1]-a[cL], dL = a[cH]-a[0] . Lets demonstrate that these are the largest differences that can be made with the new elements and/or the existing elements. If this is the case, then, if dC is the biggest of the three, both cH and cL will be added at this step (unless only one element is required until reaching k elements, in which case we pick cL if dH is bigger than dL and cH if not). Otherwise, if dH is the largest, then only cL is added and if dL is the largest then only cH is added. Lets assume that thre is is a different x so that the difference between a[x] and one of a[cL] and a[cH] is bigger than dC, dH or dL. If x < cL, then dL is obviously bigger than both a[cH]-a[x] or a[cL-x]; if x > cH, then dH is obviously bigger. This demonstrates the previous assumption. This means that for a sorted array, the complexity is simply O(k) - you can pick all k elements in a single loop (for a uniform distribution most of the iterations will add two elements to the array - so the lower limit is k/2).
I guess only one (one iteration - whatever): check whether I pick index 1 or n-2. I'll double check (also simplify the test), but it can look like this (b is the result array with picked indexes):
<code> cnt = 0; b[cnt++] = 0; b[cnt++] = n-1; for(i = 1; cnt < k && i < n; ++i) { int cL = i; int cH = n-i-1; int dC = a[cH]-a[cL]; int dH = a[n-1]-a[cL]; int dL = a[cH]-a[0]; if(dC <= dL) { if(dL < dH) b[cnt++] = cL; else b[cnt++] = cH; } else { if(dC < dH) b[cnt++] = cL; else { if(dL < dH) { b[cnt++] = cL; if(cnt < k) b[cnt++] = cH; } else{ b[cnt++] = cH; if(cnt < k) b[cnt++] = cL; } } } } </code> Am I missing something? The alg. is simple: loop over the sorted array from outer indexes toward the middle and at every step add one or two indexes to the picked values. The proof looks fairly solid to me.
I have a greedy algorithm that is something like that of Qwas. It generates a desired combination.
I would try to explain with an example. Input: 1,12, 6, 4, 13, 9, 7 n=7, k=4 1. Sort the input : 1, 4, 6, 7, 9, 12, 13 => O(nlogn) 2. Choose first and last elements (1, 13). min(diff) = 12 3. Remove chosen elements from the list. 4. while('k' numbers are chosen) { Now, find out the min difference of the set if a number is chosen from the unchosen elements chosen = 1, 13 unchosen = 4, 6, 7, 9, 12 min_diff = 3, 5, 6, 4, 1 Now 6 is maximum among min_diffs above, that is produced by chosing number 7. So choose 7. add 7 to the chosen list, remove it from unchosen list. } Step 4, has to iterate for each of k-2 elements. To find the minimum diff it has to test with each of chosen element. So , the total complexity would be 2(n-2) + 3(n-3) + ...+ k(n-k) which approximates O(n^3) when k approaches n. So the total complexity is O(nlogn) + O(n^3) = O(n^3) Readers, let me know if I missed something.
A few thoughts:
Possible recursive solution: If k = n then the solution is the array. If k = n-1 then the solution is the array with the minimum difference pair excluded. If this pair is (6,7) then we must exclude 6 or 7. If we find the next minimum difference pair including 6 or 7 then we exclude it. Say (5,6) is the next minimum difference pair including 6 or 7 then we exclude 6 and we have the solution as any pair with a 7 in it will be of equal or greater difference. ... If k = 2 then the solution is the smallest and largest element. After sorting a single scan can give us the minimum difference pair: If we first sort the array then the minimum difference pair will be adjacent elements. The next minimum difference pair will be adjacent or separated by one element. The next minimum difference pair will be adjacent or separated by one or two elements. So a 3n scan can give us at a minimum the first three minimum difference pairs and with decent distribution many more.
I worked out the recursive technique for k=n-2 on paper so it should work for all n>= k>=2
The complexity of the recursive technique: nlogn to sort the array + (n-k)n to produce the minimum pairs + (n-k)nlog((n-k)n) ~ 2n^2logn to sort the pairs + ~(n-k)^2 to exclude all the numbers so n^2logn worst case k=2 For small k it would be better to work your way up rather than down obviously. |

Powered by FogBugz