Find the missing element in a randomized range of numbers

I take n consecutive integers, toss one of them out and mix up the remaining n-1 integers as completely unsorted. Find the missing integer.


Problem Statement

I take n consecutive integers. I toss one of them out. I mix up the remaining n-1 integers so that they are completely unsorted. I give them to you as an integer array. Find the missing element.


Evaluation


  • What is the complexity of the algorithm you just offered? (25%)

  • Did you offer a linear algorithm without a hint or an explicit request? (25%)

  • Can you do it in linear time and constant space? (25%)

  • Can you do it in linear time to discover two integers I tossed out instead of one? (25%)


Solution


A one line solution

My one-liner Ruby solution to the single missing element:

For those of you unfamiliar with Ruby, reducing the array by the plus operator simply means summing the array. The extra zero parameter is the initial value of the implicit incremental summation. So above, I'm saying add the expected range sum, then subtract the actual sum.

Note that for the first summation, you can use the constant-time mathematical linear sum by averaging the min and max and multiplying it by the expected count. But I don't like it because it's silly and not elegantly readable. For min and max and actual array sum, you're already incurring linear complexity. There's no point in optimizing the expectation sum.

And some testing:


What the interviewer sees


  • Weak candidates sort the array and walk through it looking for a missing element. This takes O(n log n) time.

  • Intermediate candidates transform the array to a hash table and look up every element. This is linear, but also takes linear space.

  • Strong candidates sum up the array and compare it to the expected sum of the range. The difference is the missing element. This is linear time and constant space.


Harder follow-up questions to the first part

The interviewer will have a follow-up question making this problem harder. He will ask "what if you had two missing elements?" ... "what if you had three?"

The weak and intermediate candidates will answer these easily, because their sub-optimal solution can answer these. The strong candidate will have to come to the realization that they're going to need a second equation to solve for two variables.


  • A strong candidate will offer, as a second equation, perhaps to multiply all the numbers and compare them to the expected pi (as opposed to sum).

  • The test of whether the candidate is uber-strong is whether they realize that coming up with more equations is generalizable. In effect, the answer is a combination of polynomial series: the sum of all powers of P of the elements, for each missing element. (n equations for n missing elements)


The smart answer is NP-complete

The strong candidate's solution for the generalized problem will approach an NP-complete (hard) algorithm. In fact for degrees equal to or above 3, solving the problem through polynomial equations is the benchmark using which solutions to other problems are proven to be inefficient (this proof of problem equivalency is called Polynomial Time Reduction ).

The right strong candidate will need to back down to the sub-optimal solution of the weak candidate, when facing the generalized part of the problem.

Note: If you don't understand this part, don't worry. The candidates making it this far are one in a thousand. A candidate that detects NP-completeness during an interview is nearly guaranteed to get that (and any other) job. But if the interview process depended on a correct answer at this level of understanding, no employer would ever be able to fill the position in a competitive market.


References

Steve Heyman is the CTO of Adify. He asked me this question as the last round of the job interview.

Amin A.

Written by

Amin Ariana

A software entrepreneur from San Francisco