techInterview Discussion

Answers to technical interview questions. A part of TechInterview.org

A part of techInterview.org: answers to technical interview questions.

Your host: Michael Pryor

Minimum difference which is maximum

This was asked in one of the tech interview of a reputed company.

Given an array of 'n' random integers. Given an integer 1<=n. Find the k numbers such that the minimum difference of all the possible pairs of k numbers is maximum (maximum among other minimum differences for various possible selections of k numbers )
Neophyte Send private email
Sunday, November 08, 2009
 
 
From what I have understood from the problem is we have to find any k nos -a subset of n nos. in an array such that for any pair of 2 nos in this k element subset the minimum difference is maximum among all such subset of K elements.

Consider
Array A[]=<1,12,6,4,13,9,7>
Here n=7
Let K=4
then a brute force implementation that I could think of is:
1. Take a K element subset of the array(combination).
 Example
<1,12,6,4>=<(1,12),(1,6),(1,4),(12,6),(12,4),(6,4)>
So for a k element we have k+k-1+k-2+..+1 pairs.

Find the minimum difference among all the pairs formed.

2. Repeat the steps above for all such combination of k elements.

The maximum among them is the answer.

So, problem effectively reduces to getting all combination of k elements from n elements.(Can be done by taking a initial bit vector as "0001111" for k=4. For a bit set get the corresponding element at position in the subarray and repeat with next greater no with k bit set again...)
assuming getting the next bit sequence of k bits set is constant.
complexity=theta(n2)

and

finding the minimum difference among all pairs of elements in a k element sub array(can be done simply in O(n2) )

So Total complexity=theta(n2)

Any better solution to this?? I hope mine is right though inefficient.
Neophyte Send private email
Sunday, November 08, 2009
 
 
Total complexity is a lot higher than Theta(n^2), because there are about n^k/k! subsets to consider.

I think there might be a dynamic programming solution that achieves O(n k), but I haven't worked out the details.
d Send private email
Monday, November 09, 2009
 
 
(Assuming a sorted array.)
d Send private email
Monday, November 09, 2009
 
 
u r right d,
I calculated wrongly the complexity
Neophyte Send private email
Tuesday, November 10, 2009
 
 
I think that for a sorted array it can be done with at most one loop of k iterations. If the array is not sorted, add O(n log n) .

If S=(p1,p2,...) the set of all possible pairs that can be made with all n elements, sorted so that the difference between the elements decrease (p1 contains the two most distant elements, pm contains the two closest elements, where m is the last pair - incidentally m=n*(n-1)/2).

A valid solution is to pick pairs p1,p2,... in order until k distinct elements are accumulated.

Obviously the first pair to pick is the one made of the first and last element - these two elements are the most distant of all.

The indexes for the next candidate is one of (0,n-2), (1,n-1) or (1, n-2) - basically we are moving from the extremes toward the centre of the array.

The algorithm continues this way, until k elements are picked. If the elements are represented on the real axis, at step m we have the following picture:

  |--|-----|-----|-----|--- ... ---|------|------|-----------|
  0              lL    cL          cH    lH                n-1

So far all the elements 0,lL (last low) and lH (last high) and n-1 were picked. The question is whether we pick next both cL and cH (candidate low and high) or just one of them. All the elements between cL and cH will produce a smaller difference than a[cH]-a[cL], a[xH]-a[cL] or a[cH]-a[xL] where xH>cH and xL<cL are any of the already picked elements.

Let dC = a[cH]-a[cL], dH = a[n-1]-a[cL], dL = a[cH]-a[0] .

Lets demonstrate that these are the largest differences that can be made with the new elements and/or the existing elements. If this is the case, then, if dC is the biggest of the three, both cH and cL will be added at this step (unless only one element is required until reaching k elements, in which case we pick cL if dH is bigger than dL and cH if not). Otherwise, if dH is the largest, then only cL is added and if dL is the largest then only cH is added.

Lets assume that thre is is a different x so that the difference between a[x] and one of a[cL] and a[cH] is bigger than dC, dH or dL. If x < cL, then dL is obviously bigger than both a[cH]-a[x] or a[cL-x]; if x > cH, then dH is obviously bigger. This demonstrates the previous assumption.

This means that for a sorted array, the complexity is simply O(k) - you can pick all k elements in a single loop (for a uniform distribution most of the iterations will add two elements to the array - so the lower limit is k/2).
Qwas Send private email
Wednesday, November 11, 2009
 
 
tl;dr

You can't solve the problem in time O(1) with a sorted array when k = 3.
d Send private email
Wednesday, November 11, 2009
 
 
Its O(k) - unless I didn't get the big O notation.

Its a single loop of max k iterations.
Qwas Send private email
Thursday, November 12, 2009
 
 
If you wrote a program to do this, how many instructions as a function of n are performed on a sorted array of length n when k = 3?
d Send private email
Thursday, November 12, 2009
 
 
I guess only one (one iteration - whatever): check whether I pick index 1 or n-2. I'll double check (also simplify the test), but it can look like this (b is the result array with picked indexes):
<code>
    cnt = 0;
    b[cnt++] = 0;
    b[cnt++] = n-1;
    for(i = 1; cnt < k && i < n; ++i) {
        int cL = i;
        int cH = n-i-1;
        int dC = a[cH]-a[cL];
        int dH = a[n-1]-a[cL];
        int dL = a[cH]-a[0];
        if(dC <= dL) {
            if(dL < dH) b[cnt++] = cL;
            else        b[cnt++] = cH;
        } else {
            if(dC < dH) b[cnt++] = cL;
            else {
                if(dL < dH) {
                    b[cnt++] = cL;
                    if(cnt < k) b[cnt++] = cH;
                } else{
                    b[cnt++] = cH;
                    if(cnt < k) b[cnt++] = cL;
                }
            }
        }
  }
</code>

Am I missing something? The alg. is simple: loop over the sorted array from outer indexes toward the middle and at every step add one or two indexes to the picked values. The proof looks fairly solid to me.
Qwas Send private email
Thursday, November 12, 2009
 
 
You actually need to continue until you get a clique on k elements, not just k pairs.
d Send private email
Thursday, November 12, 2009
 
 
cnt counts elements, not pairs.
Qwas Send private email
Thursday, November 12, 2009
 
 
If I give you an array that looks like

0000000000000001222222222222222222

with the 1 possibly not there, please tell me how the algorithm is going to decide whether to output 012 in O(1) time. (Formally: the array matches 0*1?2* .)
d Send private email
Thursday, November 12, 2009
 
 
I never said O(1) - the algorithm is O(k) - linear in k. Again, is a single loop of max k-2 iterations.
Qwas Send private email
Thursday, November 12, 2009
 
 
Ups ... sorry. Now I see the real problem.
I was wrong when I said that a solution can be obtain picking the pairs in order: once you pick two pairs, there are four other pairs that can be formed that may give a very small diff.

Thanks for the patience :)
Qwas Send private email
Thursday, November 12, 2009
 
 
I have a greedy algorithm that is something like that of Qwas. It generates a desired combination.
I would try to explain with an example.
Input: 1,12, 6, 4, 13, 9, 7
n=7,
k=4

1. Sort the input : 1, 4, 6, 7, 9, 12, 13  =>  O(nlogn)
2. Choose first and last elements (1, 13). min(diff) = 12
3. Remove chosen elements from the list.
4. while('k' numbers are chosen) {
Now, find out the min difference of the set if a number is chosen from the unchosen elements
      chosen = 1, 13
      unchosen = 4, 6, 7, 9, 12
      min_diff =  3, 5, 6, 4, 1
Now 6 is maximum among min_diffs above, that is produced by chosing number 7. So choose 7.
add 7 to the chosen list, remove it from unchosen list.
}

Step 4, has to iterate for each of k-2 elements. To find the minimum diff it has to test with each of chosen element. So , the total complexity would be
2(n-2) + 3(n-3) + ...+ k(n-k)  which approximates O(n^3) when k approaches n.

So the total complexity is O(nlogn) + O(n^3) = O(n^3)

Readers, let me know if I missed something.
Imran Shaikh Send private email
Sunday, November 15, 2009
 
 
You can't be greedy about this. If you have the input

1 3 4 5 7

then for k = 4, after you choose 1 4 7, you're going to have either 1 3 4 7 or 1 4 5 7 when you should have 1 3 5 7.
d Send private email
Sunday, November 15, 2009
 
 
d!
You are right. I should think of another way.
Thanks for pointing out.

-Imran
Imran Shaikh Send private email
Monday, November 16, 2009
 
 
A few thoughts:
Possible recursive solution:
If k = n then the solution is the array.
If k = n-1 then the solution is the array with the minimum difference pair excluded. If this pair is (6,7) then we must exclude 6 or 7. If we find the next minimum difference pair including 6 or 7 then we exclude it. Say (5,6) is the next minimum difference pair including 6 or 7 then we exclude 6 and we have the solution as any pair with a 7 in it will be of equal or greater difference.
...
If k = 2 then the solution is the smallest and largest element.

After sorting a single scan can give us the minimum difference pair:
If we first sort the array then the minimum difference pair will be adjacent elements. The next minimum difference pair will be adjacent or separated by one element. The next minimum difference pair will be adjacent or separated by one or two elements. So a 3n scan can give us at a minimum the first three minimum difference pairs and with decent distribution many more.
Erik Heikkinen Send private email
Friday, November 20, 2009
 
 
I worked out the recursive technique for k=n-2 on paper so it should work for all n>= k>=2
The complexity of the recursive technique:
nlogn to sort the array +
(n-k)n to produce the minimum pairs +
(n-k)nlog((n-k)n) ~ 2n^2logn to sort the pairs +
~(n-k)^2 to exclude all the numbers
so n^2logn worst case k=2

For small k it would be better to work your way up rather than down obviously.
Erik Heikkinen Send private email
Friday, November 20, 2009
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz