algorithim-house-robber.png

The House Robber Problem is a popular question in software engineering interviews, but not because of it’s practical applications.

We’ve seen a lot of competent engineers struggle with this problem, because, perhaps unsurprisingly, the optimal approach uses dynamic programming.

It’s a shame because a lot of otherwise qualified engineers are getting rejected from roles because they can’t solve what’s actually a relatively straightforward problem.

This article will help clarify the intuition behind the optimal approach to this problem and shed some light on the deeper problem-solving pattern at play here.

Let’s start by considering the problem setup: inputs, outputs, constraints, etc. With any problem, it’s always good to clarify what all of these are and what they mean.

Your input will be an array of integers, representing the amount of gold in each house.

Your output will be a single integer, representing the maximum amount of gold you can steal from the neighborhood.

So for example:

1_ARVbX5Gm6Vcrq2-u9bSDVg.png

When it comes to problem-solving, it’s always good to start with an example that is simple enough that we can eyeball the answer, but complex enough so as to not be trivial. This will be a sanity check against our approaches and will provide valuable data (observations) about the problem.

Right off the bat, we can notice that we’ve skipped over two numbers between the 3 and the 5. This means the optimal solution may not always consist of robbing the closest packing of houses we can find. We may want to skip over MORE than one house.

We can also start to make some predictions about time/space complexity. Often interviewers won’t give you a set constraint since that can be a big hint. They’ll just say something like “do the best you can.”

Given that it’s an array of numbers, and we’re doing some kind of aggregating of information by maximizing the sum of some subset of those numbers, a reasonable prediction might be that we can solve this in optimally in linear time. That by no means is a guarantee but given that we’ll likely have to scan every house to come up with our plan, it’s difficult to imagine a way of doing this MORE optimally than linear time.

But let’s focus first on the brute force approach, and let’s do this at a conceptual level. I’ll pose this as a question:

Given that we have to steal from a subset of houses to find the optimal amount of gold to steal, what subsets do we need to test?

The answer:

So what constitutes a valid combination of houses? Any combination that leaves at least one house between any two that get robbed.

Now that we’ve established we need to test different valid combinations of houses, this problem starts to look very similar to another problem that uses recursion. Can you guess which one?

Answer 1.png

It might seem surprising at first, but SO many recursive problems boil down to this pattern because what we’re essentially doing is trying all possible combinations of solutions.

Here’s what that looks like for our previous example:

Answer 2.png

Once we’ve created all the valid subsets of houses we can steal from, all we have to do is sum up all the values in each subset, and pick the one that produces the largest sum.

If you’re wondering “how the heck do I set up the recursion to solve this?”, you can use this high-level pseudocode. It’s basically the explanation in English:

Answer 3.png

Translating that into low-level pseudocode (mix of English and code):

Answer 4.png
Answer 5.png

Here’s what the code looks like in JavaScript, using helper method recursion:

Screen Shot 2019-04-04 at 8.10.40 PM.png

Now let’s analyze the time complexity of this approach. Thinking about this naively, testing out all possible combinations of houses should probably seem like a lot of work, especially when the neighborhood is large (many elements in the array).

If you’re at all familiar with the notion of what a power set is (set of all subsets), then you’ll probably deduce that it’s some kind of exponential runtime.

The way to work through this on your own is by manually comparing how much additional work you have to do BY ADDING 1 ADDITIONAL HOUSE TO THE NEIGHBORHOOD. For this example, that means figuring out how many additional combinations are created and need to be tested when you add 1 more house to the neighborhood.

Here’s how to do it on a simple example:

Answer 7.png

You can see that the larger input still tests all the combinations given in the smaller input, plus all the new ones that are created by adding the additional number. Just with this one example, we can see adding 1 additional house means 5 additional combinations that need testing.

Doing this exercise by hand is time-consuming, but it’s crucial to gain a deeper understanding of analyzing time/space complexities. My advice for if you’re doing this without any time constraints and just trying to learn is to try a few more inputs and see how things change.

You can even write code to see how many times the base case is hit for different input sizes:

Screen Shot 2019-04-04 at 8.13.38 PM.png

Try running this code. See what it prints.

Spoiler alert:

Answer 9.png

But in some ways, it shouldn’t come as a surprise if you really looked at both sets of code side-by-side.

1_selOPWHuUJFJVZKpW-7NYQ.png

If you actually plotted the numbers in the results table you’d get something like this on a log scale:

Answer 10.png

When considering the total space complexity we need to examine both how many calls are on the call stack at any point in time, as well as any auxiliary space we use outside of the recursion:

Answer 12.png

Great! We have a working solution to the House Robber Problem.

And this is exactly where the interviewer will ask you to come up with optimizations. So let’s go back to the drawing board and make some observations:

Answer 13.png
Answer 14.png

Intuitively, there are a lot of junk combinations we are testing out. The reason for this is that we’re testing a lot of smaller combinations that are contained within larger ones. This means there are combinations that are strictly betterthan others. This means we can start thinking about how we can get rid of those smaller combinations.

The other things to look for at this point are any key insights that might unlock the problem. This is just an observation or conclusion that is true about this problem. We arrive at it deductively after gathering data about the problem. If you think of the problem-solving process as an hourglass, this would be the narrowest point:

Answer 15.png

So what is the key insight for this problem?

Answer 16.png

It’s a simple conclusion but it has powerful implications. We derive it from our earlier observations that we are testing a lot of junk combinations. The proof for it being true rests in the fact that if we ever left a larger gap, there’d be some house with a positive amount of gold we’re overlooking.

Remember, that’s an important detail of the problem. It’s something that you’ll want to ask the interviewer about.

Answer 17.png

As far as how we use the insight, we’ll have to go back to working with some examples and change up our approach. Before, we were trying all possible combinations. What if instead, we broke an example down and treated it as a sequence of different inputs? Then we can track how the output for that sequence changed.

Answer 18.png
Answer 19.png

Even with this one example, we can draw an important conclusion:

Answer 20.png

It’s always a good idea to test out more examples, just as a sanity check. Does it make sense that the output should always be growing as we grow our input?

Let’s see:

Answer 21.png
Answer 22.png

Looks like the pattern holds. And notice the kinds of examples we tried. They are ones with obvious solutions we can check.

So what’s the next step?

Let’s rewrite all those outputs as a table. You’ll notice, each output corresponds to the addition of a house to the neighborhood. Here’s what that means:

Answer 23.png

At this point, it should be pretty clear what these tables mean, and you should have some intuition about why this approach works. The main question you should have is “how do we create those tables?”

Because if you can answer that question, you’ve effectively solved the problem. At that point, it’s just a matter of coding it out and debugging any minor syntax or off-by-one errors.

But just having this table as a goal is HUGE progress. Now you know concretely what it is you’re trying to build out. Initially, the problem started off sort of vague and broad, and now we have a very clear goal in mind. Notice this pattern in the future when you’re solving new problems. You want to make your tasks as concrete as possible.

So let’s build out one of these tables manually first and see if we can spot any patterns. Remember what each element in the table represents! It’s the maximum amount of gold you can have when you’ve reached that point in the neighborhood. This is important to remind yourself of during ANY dynamic programming problem.

For the first element, obviously, we’ll want to steal from that house. Let’s add that to our table.

Answer 24.png

For the second element, we’ll either keep the first element as our max if it’s greater. Otherwise, we’ll choose to steal from the second element. So far, so good.

Answer 25.png

Things get a little trickier at the third element. What are our options?

Well, we can either add to what we stole from the first house OR we can skip this house and keep what we stole from the second house. It depends on which of those two values is greater.

Answer 26.png
Answer 27.png

It’s always a good idea to list out all the options available to you in a particular scenario. And the best way to figure out the different possibilities is just by trying different examples as I did above.

Now let’s add a fourth house. What should we add to our table?

Intuitively we know the answer is 8, but how do we arrive at that number? We’re adding one of the fives, but which one?

Answer 28.png

To answer this, let’s look at adding another number.

Answer 29.png

If we add a 4, we can easily see that the output should be 9. It’s easy to eyeball the input and come to that conclusion. But we want to come up with a process for reliably finding that 9.

If we look at our table, we can see that we’re adding the value of the new house, to the max gold 2 indices back. Why? Because that’s the maximum amount of gold we’ve aggregated from an input two units smaller (aka, with two fewer houses).

Essentially what we’re asking here is “should we add this new house and skip the previous one or should we skip this new house and steal from the previous one?” And we’re doing this every time we add a new house.

One important detail to notice from this process is that we’re just finding the maximum amount of gold we can steal given the full set of houses, we’re not tracking which houses we stole from. We’d have to modify our solution if we wanted that information, but luckily we don’t care about where the gold came from. Always remember to answer the question you were asked.

With that said, let’s translate this into pseudocode:

“For every house in the list, either add its gold to the max gold 2 indices back or keep the max gold 1 index back.”

If we look at two more examples, we can see the pattern holds:

Answer 30.png
Answer 31.png

In the first example, we add the 5 and 8 together because that sum is greater than 9.

In the second example, we keep the 13, because that is greater than adding 1 and 9.

What we return at the end is hopefully pretty obvious: it’s just the last element in the max_gold array.

And that’s how you solve the House Robber Problem. Hopefully, coming up with the code solution to this is relatively straightforward at this point. I encourage you to try your hand at it before taking a look at the solution below.

It should be pretty easy to see that the time and space complexity are both linear since we need to iterate over each element once, and for each element, we are storing a corresponding maximum amount of gold possible at that point in the array.

One little piece of extra credit I encourage you to think about is how can we optimize this solution a little more? There’s one simple tweak that we can make that will save us some space, can you think of how we might do that?

Here are the optimized solutions:

Screen Shot 2019-04-04 at 9.18.57 PM.png

The main idea behind the second one: you only need to keep track of the previous two maximums. Everything else can be thrown out.

I hope you enjoyed the read, feel free to drop me a comment if you think I left anything out or if anything was unclear. This is a problem we cover at Outco to help engineers prep for their coding interviews so if you want to learn more about our 5–week program you check out our site. We have weekly visitor sessions and new cohorts starting every month.

Happy coding!

Comment