I can outline a solution. A good solution would need some theorems and proofs and at least TeX for the typesetting, and I can't post TeX on HN.
The problem is essentially one of 'facility location', a topic in optimization.
The problem statement is a mess, so bad that arguably we shouldn't even address it. But to start to clean up the problem statement, consider the set of all points in the plane with coordinates (x, y) where both x and y are integers. Let's call this set of points the 'grid'.
The problem statement mentioned a "grid" but didn't give a solid definition. Bluntly, the people who wrote the problem don't know how to write math. The problem also mentioned "cells" without defining them. Even if we give a definition, we don't need or want "cells". Uh, guys, it is just 100% absolutely, positively necessary to give precise definitions; intuition just doesn't work as a substitute!
So, for some positive integer n, there are n houses at distinct points in the grid.
Now the problem defines a 'distance' on the grid. For grid points P_1 = (x_1, y_1) and P_2 = (x_2, y_2), the 'distance' from point P_1 to point P_2 is
d(P_1, P_2) = max(|x_1 - x_2|, |y_1 - y_2|)
Then easily enough,
d( P_1, P_2 ) = d( P_2, P_1 )
If we work at it, then we should be able to confirm that d is a 'metric'. The main point left to show is the triangle inequality. A serious solution would show that we have a metric; e.g., a serious solution has some theorems and proofs and is not just some code. Sorry 'bout that!
Here let's just assume that d is a metric.
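That assumption is easy to spot-check numerically, which is not a proof but is reassuring. Here is a quick Python sketch (not from the original comment; the function name d is mine) that tests symmetry and the triangle inequality on a sample of grid points:

```python
# A spot check, not a proof: the given distance, tested for symmetry
# and the triangle inequality on a random sample of grid points.
import random

def d(p, q):
    # max(|x_1 - x_2|, |y_1 - y_2|), the distance from the problem statement
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

random.seed(0)
pts = [(random.randint(-10, 10), random.randint(-10, 10)) for _ in range(40)]
for p in pts:
    for q in pts:
        assert d(p, q) == d(q, p)                    # symmetry
        assert d(p, q) >= 0                          # non-negativity
        for r in pts:
            assert d(p, r) <= d(p, q) + d(q, r)      # triangle inequality
print("metric axioms held on this sample")
```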
Next, let's assume that d can be extended to a metric on all of the plane. Yes, that would take more theorems and proofs.
Next we argue that for points U, V in the plane, d(U, V), with V given and fixed, is a convex function of U.
So, what is a 'convex' function? Consider R as the set of real numbers and the plane as the set R^2. Suppose function f: R^2 --> R. Then f is 'convex' provided for all X, Y in R^2 and all t in [0,1],
f(tX + (1-t)Y) <= tf(X) + (1 - t)f(Y)
Here tX is multiplying the 'vector' X by the scalar t; similarly for (1 - t)Y. So, convex means, intuitively, that the function value is <= a straight line interpolation.
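Again we can spot-check rather than prove. A small Python sketch (mine, not from the original comment) that samples the convexity inequality for f(U) = d(U, V) with V fixed:

```python
# A numeric sanity check, not a proof: sample the inequality
# f(tX + (1-t)Y) <= t f(X) + (1-t) f(Y) for f(U) = d(U, V), V fixed.
import random

def d(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

random.seed(1)
V = (3.0, -2.0)
f = lambda U: d(U, V)
for _ in range(10000):
    X = (random.uniform(-10, 10), random.uniform(-10, 10))
    Y = (random.uniform(-10, 10), random.uniform(-10, 10))
    t = random.random()
    Z = (t * X[0] + (1 - t) * Y[0], t * X[1] + (1 - t) * Y[1])
    # small tolerance for floating point round-off
    assert f(Z) <= t * f(X) + (1 - t) * f(Y) + 1e-9
print("convexity inequality held on all samples")
```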
It's easy to show that every convex function on R^2 is continuous. Maybe in a good solution we should show this!
If f is convex and a >= 0, then af is convex. If f and g are convex, then f + g is convex. So any non-negative linear combination of convex functions is convex.
The 'epigraph' of f is the subset of R^3
{ (x, f(x) + a) | x in R^2, a >= 0 }
and is a convex set (yes, I omit the definition of a convex set!). That is, we just draw the graph of f and take the surface of the graph and everything above it; we get a convex set. Since f is continuous, this set is closed in the usual topology.
If function f is convex, then at each point x there exists a linear p: R^2 --> R so that for all h in R^2
f(x + h) >= f(x) + p(h)
So, p is a 'subgradient' of f at x. So, we have a linear approximation that is the same as f at x and otherwise is <= f. This linear function defines a 'supporting hyperplane' of the epigraph of f.
Also, if t > 0 and p(h) >= 0, then
f(x + th) >= f(x) + tp(h)
which means that if we leave point x in direction h following f and find ourselves going uphill, then as we continue in direction h we will continue to go uphill. So, if we want to go downhill, that is, to minimize, then continuing to go uphill won't get us there.
More generally we can show, from a subgradient at x, that there is a line in the plane through x so that one side of the line is downhill and the other side is uphill. So, with several such lines, we can get a convex polyhedron in the plane that has our solution.
Continuing, suppose, for i = 1, 2, ..., n, house i is at point P_i. Then we seek j = 1, 2, ..., n to minimize
S(P_j) = \sum_{i = 1|^n d( P_j, P_i )
(yes, here we are using the syntax of TeX). So, an easy solution is to try each j. For each j, the computational effort is proportional to n so that the whole solution has computational effort proportional to n^2.
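That easy n^2 solution is short enough to sketch in Python (the function name best_house_naive is mine):

```python
# The straightforward O(n^2) solution: for each house j, sum the
# distances to all houses and keep the smallest total S(P_j).
def d(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def best_house_naive(houses):
    return min(houses, key=lambda pj: sum(d(pj, pi) for pi in houses))

houses = [(0, 0), (4, 1), (1, 3), (5, 5)]
best = best_house_naive(houses)
print(best, sum(d(best, p) for p in houses))  # → (1, 3) 10
```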
We can evaluate S(Q) for any point Q in R^2. Then S(Q) is convex. So, we are trying to minimize a convex function but limited to n given points.
Let's move more quickly now (we'd need some theorems and proofs): We start with a rectangle that covers the n points. We know that our solution is inside this rectangle. This rectangle is a special case of a convex polyhedron. [We assume that the polyhedron has positive area and otherwise handle the case in a different way by a standard unidimensional search.] For such a polyhedron, we can use linear programming to find the center of the largest inscribed circle inside the polyhedron (if the radius is zero, then we know we can change to a unidimensional search). At this point, we find a subgradient of our convex function. This subgradient cuts our polyhedron roughly in half. We pick the half that goes downhill. If there is just one house in that half, then that house is our solution and we are done. If there are two or more houses in that half, then we let this half be our new polyhedron and repeat. If that half has no houses, then we pick the other half as our polyhedron.
So, this is a 'central cutting plane' algorithm for minimizing a convex function. See Elzinga and Nemhauser, etc.
Thanks. I enjoyed reading this and found it sufficiently detailed that I could understand it despite being previously unfamiliar with some of it.
I don't quite understand why this solution is preferable to a solution which simply evaluates S(P_j) for each j and returns the minimum in O(n^2) time. I'm also not sure why I would prefer it to a solution that is faster.
One O(n lg n) solution requires that we convert the "Minimum sum of distances at a house for houses on lattice points in 2-space under the square metric" problem to an equivalent "Minimum sum of distances at a house for houses on lattice points in 2-space under the taxicab metric" problem, at which point we can consider the two dimensions independently. With some trickery we can find the sum of distances to other houses for all houses in the 1-dimensional version of the problem in linear time plus the amount of time required to sort the houses.
To do this, we begin at the house with the lowest coordinate and calculate its sum of distances directly. From the sum of distances for one house, we can calculate the sum of distances for the next house in sorted order in constant time: with the gap g = |P_j - P_{j-1}|, S(P_j) = S(P_{j-1}) + (j - 1)g - (n - j + 1)g. This is an abuse of the notation considering that it refers only to the 1-dimensional problem, but I think it conveys its meaning well: Once the houses are sorted, the sum of distances for a house other than the first is the sum of distances for the previous house, plus the gap between them once for each house earlier in the sorted order, minus the gap once for this house and each house later in the sorted order. Houses that share a coordinate cause no trouble, since the gap between them is 0.
After that we can calculate S(P_j) as the sum of distances for P_j along one dimension plus the sum along the other. Finally, since we've calculated the sum of distances for each house, we can just print the least of these. We could also solve the problem in linear time if we were willing to use radix sort, but we probably wouldn't want to do that.
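The whole scheme above can be sketched in Python (the function names are mine; the rotation w = x + y, z = x - y and the halving step are justified further down the thread):

```python
# A sketch of the O(n lg n) approach: rotate to w = x + y, z = x - y,
# solve each dimension independently with the constant-time recurrence,
# and halve the taxicab totals at the end.
def sums_1d(coords):
    # sums_1d(coords)[i] = sum over j of |coords[i] - coords[j]|
    n = len(coords)
    order = sorted(range(n), key=lambda i: coords[i])
    s = [0] * n
    first = order[0]
    s[first] = sum(coords[j] - coords[first] for j in order)  # lowest house, O(n)
    for k in range(1, n):
        prev, cur = order[k - 1], order[k]
        gap = coords[cur] - coords[prev]
        # k earlier houses each gain 'gap'; this house and the
        # n - k - 1 later ones each lose 'gap'
        s[cur] = s[prev] + k * gap - (n - k) * gap
    return s

def best_house_fast(houses):
    w = [x + y for x, y in houses]
    z = [x - y for x, y in houses]
    sw, sz = sums_1d(w), sums_1d(z)
    # taxicab distance in wz-space is twice the square-metric distance
    totals = [(sw[i] + sz[i]) // 2 for i in range(len(houses))]
    i = min(range(len(houses)), key=totals.__getitem__)
    return houses[i], totals[i]

print(best_house_fast([(0, 0), (4, 1), (1, 3), (5, 5)]))  # → ((1, 3), 10)
```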
Exactly. I did just this, and still my algorithm supposedly gets only 4/15 test cases correct. The algorithm is pretty simple, and I tested it extensively against the naive O(n^2) algorithm on a bunch of random test cases and it always prints the same thing, so I'm not sure where it's going wrong. My one consolation is that no one else seems to have solved it either yet... Ah, the joy and frustration of programming challenges.
The conversion from the given metric to the 'taxicab' metric needs some justification!
The n^2 algorithm is simple enough to be solid. If your code does not give their answers, then their answers might be wrong!
My work with convexity is an effort at faster code, but actually programming all that would be a bit much. I've done such things, but I got the linear programming from the old IBM Fortran Optimization Subroutine Library (OSL) and, then, wrote the code in Watcom Fortran so that I could use the Fortran OSL OBJ files. Using the OSL for the problem in this thread would be a bit much.
So, we have a linear transformation from R^2 into R^2 with matrix
[ 1   1 ]
[ 1  -1 ]
So, the rows are orthogonal! And the columns! It's not quite an orthonormal matrix, where its transpose is its inverse, because the rows and columns don't have length 1, but it's 'close' to orthonormal.
So, except for a scalar multiple, this linear transformation has to be an 'isometry', that is, preserves lengths and angles, lengths in the usual metric in R^2. So, this linear transformation starts to 'smell good'!
Now, for the given metric, consider the 'unit circle', that is, all points at distance 1 from the origin. Then consider the image of these points under this linear transformation. That 'circle' is a square with diagonal (1,1) to (-1,-1). Its image is a 'diamond', that is, a square with diagonal, say, (-2,0) to (2,0). So, except for a scalar 2, we have preserved distances. That is, our linear transformation is 1-1 between a 'circle' in one metric and a circle in the other metric. That's close enough to a proof for gumment work!
Nice.
Be wise, generalize: So we have taken a nasty problem with an n^2 solution, or some tricky solution iterating with convexity, and with a simple linear transformation turned the problem into a 'decomposition' on the two coordinates separately. So, where else can we take a challenging optimization problem, stuff a linear transformation inside the problem, and get a much simpler problem? Hmm ....
The taxicab metric is useful here because if you compute the components of a distance in the square metric, you must max() them together, while under the taxicab metric you can add them together. We can add a pair of large sums together to get the large sum of pairs, but we can't max() a pair of large sums together to get the large sum of max()es.
One way to get to the taxicab metric version of the problem is to take w = x + y and z = x - y. From then on all the distances you calculate in wz-space under the taxicab metric will be precisely twice the distance between the same points in xy-space under the square metric.
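That doubling claim follows from the identity |a + b| + |a - b| = 2 max(|a|, |b|), and a quick Python check (mine, not from the original comment) confirms it on random integer points:

```python
# Check the claim: taxicab distance in wz-space (w = x + y, z = x - y)
# equals exactly twice the square-metric distance in xy-space.
import random

random.seed(2)
for _ in range(1000):
    p = (random.randint(-50, 50), random.randint(-50, 50))
    q = (random.randint(-50, 50), random.randint(-50, 50))
    cheb = max(abs(p[0] - q[0]), abs(p[1] - q[1]))
    pw, pz = p[0] + p[1], p[0] - p[1]
    qw, qz = q[0] + q[1], q[0] - q[1]
    taxi = abs(pw - qw) + abs(pz - qz)
    assert taxi == 2 * cheb
print("taxicab distance in wz-space is twice the square-metric distance")
```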