It's amazing to me that someone of Norvig's stature still finds time and joy in tackling "common" CS questions and not just writing his own code, but going the extra mile by doing it in a presentation format (ipython notebook) that's designed to be reader and replication-friendly...nevermind creating a fleshed out narrative of the process.
edit: One of my favorite bits: Norvig even takes time to describe his thought process behind what most people would consider the most basic object-oriented design: how to construct a simple Point class, and then explains how that will affect his design and implementation of subsequent functions and routines: http://nbviewer.ipython.org/url/norvig.com/ipython/TSPv3.ipy...
One thing to note in particular is that general TSP has no non trivial approximation guarantee. Norvig's article might not stress this strongly enough, the guarantees he describes really depend on the triangle inequality. At least, I think the gap between what's possible for these two versions is deep enough to mention.
Another important note is that the TSP subproblem Norvig attacks, Euclidean TSP, has much better theoretical guarantees than a 2-approximation. In fact, the problem has what's called a polynomial time approximation scheme (for a fixed dimension) allowing one to efficiently compute an approximation that is arbitrarily close to optimal. The runtime of the algorithm is something like n (log(n))^c where c depends on both the accuracy and the dimension.
It's amazing to see all of this described in such detail. Very nice presentation!
Also, I'm going to mildly hijack this thread with a question since I know people familiar with this kind of problem will be looking at the comments ;)
The TSP is essentially trying to find the correct permutation of an ordered list of cities (up to a cycle) that minimizes the total distance traveled.
I have a somewhat similar problem. Given a set of vectors (in R^3), I am trying to find a way to generate a permutationally invariant representation of these vectors that is also rotationally invariant. The rotational invariance is easily solved by constructing the Gram matrix of the set of vectors (each element M_ij in the matrix is the inner product between vector x_i and vector x_j). Of course, the Gram matrix is overcomplete now (redundant), so I take the Cholesky decomposition to get a unique representation that has 3 less degrees of freedom (all the rotational degrees) than the original set.
For permutational invariance though, I'm having a heck of a time. I've been extensively searching the academic literature on this topic, and maybe I'm just not using the right search terms, but I can't find anything. Recently I've been reading about symmetric groups and trying to figure out if there's some kind of linear basis that I can construct its elements out of, but I'm not having much luck. Anyone have any ideas?
If that problem is too hard, consider this simpler problem: Given a metric between two sets of vectors (of the same order), I want to find the permutation of the second set that minimizes this metric. So basically, min(|| A - P A' P' ||), where ||X|| denotes the metric, and P is a permutation matrix. I feel like there is some kind of relation between this question and the TSP, but I can't quite figure out what.
This is just the problem of forming automorphism-invariant encodings of a graph, but extended to permit weighted graphs.
First, given the Gram matrix for your vectors, interpret its entries (read from left-to-right and top-to-bottom) as defining a cumulative sum.
Next, order your vectors such that the values in the order-induced cumulative sum are maximal for as many of the partial sums as possible, compared with the cumulative sums induced by any other possible ordering. This produces the same "canonical ordering" and hence the same "canonical Gram matrix" for any set of vectors that have the same collection of pair-wise relationships, as measured by dot-products.
For graphs with binary adjacency matrices this can be stated more simply as sorting the vertices such that the resulting adjacency matrix, when read l-to-r and t-to-b, produces the largest binary number. This encoding uniquely identifies the automorphism group of the encoded graph (i.e. all isomorphic graphs produce the same encoding).
If you were willing to represent your pair-wise dot-products u sing fixed-precision binary representations, then you could just use the "matrix as binary number" approach directly. Though, each matrix entry would now contribute multiple bits to the binary number, rather than just one.
Note that this approach is not efficient. But, your problem subsumes standard graph isomorphism, so a polynomial-time solution would be noteworthy. All of what I've said makes no assumption about the vectors' dimensions. There's probably a more efficient approach for vectors constrained to a relatively low-dimensional space.
I know it is computationally a bit of a pain, but if you have points v_1,v_2,v_3 does the Jordan Canonical Form of the matrix [v_1 v_2 v_3] not satisfy your requirements? I believe that the original matrix is similar to both its permutations and rotations, so the JCF should be invariant. My linear algebra is a bit rusty, so I might be wrong
I'm surprised that there is no mention of Tabu Search or other metaheuristics (like simmulated annealing, as pointed out in this thread). In my research, I discovered it to be one of the best algorithms for this class of problems (determinisitic, non-black-boxy, flexible, can incorporate expert knowledge).
Also worth mentioning is that while the TSP is interesting from an algorithms class perspective, a problem that occurs far more often in real-life is the Vehicle Routing Problem.
The Vehicle Routing Problem (with time-windows, types, multiple vehicles, multi-depot, capacities, etc) is a much more complex problem compared to the TSP, and I would argue that pretty much all the algorithms mentioned in the article would fail to transfer.
Here's an open-source library written in Common Lisp that implements the Tabu Search approach [1] -- shameless plug, but we also build an API for it [2]
For the set of real numbers R, positive integer
n, and the Euclidean norm on R^n,
there is the old R. Karp paper where
he shows amazing results for:
Find a minimum spanning tree
and do a depth-first traversal but
deleting arcs that have already
been traversed and going directly
to the city not yet visited
that the depth first traversal
would visit next.
So, just code up one of the at least
two famous polynomial and amazingly
fast algorithms for minimum
spanning tree then write the traversal
code.
For the Euclidean case, likely tough
to beat, all things considered, in practice.
Why not stop there?
(1) In some cases want to be quite
close to optimality or have actual
optimality.
(2) In some cases, the Euclidean norm
need not hold even roughly, e.g.,
job shop scheduling with setup times.
> There are more sophisticated approximate algorithms that can handle hundreds of thousands of cities and come within 0.01% or better of the shortest possible tour.
Anyone have more details on this more sophisticated algorithm? Even just a name to google would be great.
The Lin-Kernighan heuristic is a related heuristic developed later, which works well in practice on many problems. Here is an implementation (freely available source, but not freely licensed): http://www.akira.ruc.dk/~keld/research/LKH/
I'm not sure what you mean by "Real Programmer". if one write programs for a living one is a programmer. Reminds me that french joke about the "real hunter" and the "fake hunter".
I don't remember the algorithm exactly and Google isn't yielding any useful results, but I seem to recall that if all the distances are integers (or if you make all the distances integers via scaling/rounding) there is a much faster algorithm that relies on "stepping upward" and considering increasingly larger distances. All I remember is that it's very similar to the same principle that makes radix sort sub-linearithmic.
The classic practical algorithm comes from Bell Labs work in the 1960s. Connect the points in some random order. Then pick two links and cut there, leaving three connected chains. Try all 12 possibilities for reconnecting those three chains, and keep the shortest. Repeat until no improvement is observed for a while.
This was one of the first useful nondeterministic computer algorithms.
Aa Bb Cc is identical to Cc Aa Bb (or bB aA cC). That's why we fixed one chain, to simplify the rest of the calculation.
EDIT: on reading TFA I find Norvig does essentially the same thing: "So let's arbitrarily say that all tours must start with the first city in the set of cities. We'll just pull the first city out, and then tack it back on to all the permutations of the rest of the cities." Interestingly he doesn't also eliminate the Aa-aA redundancy, because "First, it would mean we can never handle maps where the distance from A to B is different from B to A. Second, it would complicate the code (if only by a line or two) while not saving much run time."
Great stuff from Norvig, as per usual. I find it strange that the next step after enumeration is a heuristic approach, something that comes up a lot in "solve a TSP" posts - there are many ways to solve TSP to provable optimality that perform really well on non-pathological instances, and can often be stopped at any point to get both a probably near optimal solution and, even better, a bound on how good the best solution could be. I would argue that the implementation cost is outweighed by the possibility of usually getting optimal solutions and by having confidence in how far off you are (no chance of producing a terrible solution).
I too was disapointed SA wasn't even mentioned. I've gotten good results with it in the past. It's ability to get out of local minima make it quite powerful. Still, it's an excellent article.
I believe it was, despite not calling the name out explicitly. Are you suggesting that simulated annealing does not belong in the categories of "repetition strategy," "alteration strategy", nor "ensemble strategy"?
Simulated annealing is essentially hill-climbing with random restarts with the tweak of a changing climb-distance. Depending on how you feel like categorizing that, it's either repetition, alteration, or both (and therefore ensemble).
Just the right mix of theory and practice to make the data structures class I had as an undergraduate pale in comparison. Thank you Mr. Norvig, and please keep them coming.
A while back someone posted a cool simulated annealing post of this[0]. The app seems to be down, but the code is there in case anyone wants to take a look.
Really? Are you like that with every word? Have you kept yourself from reading master pieces of literature because you thought "oh, well I think they should've gone with X word instead"?
I once tried to read some long essay where the author had replaced all instances of "he"/"she" with some weird made up gender neutral pronoun "xe" or something like it. It wasn't explained anywhere and I was very confused for a few minutes.
It felt really weird and unnatural and detracted from the main point. Testified to the fact that years later I have no memory of what the essay was about but still remember this.
It's quite sad that some people's first reaction to everything is to assume that the other carries an agenda and there's a need to be defensive.
Person A thinks that everyone else is a sexist person with an agenda. Person B who doesn't agree with person A thinks everyone else is a radical feminist with an agenda. It's a never ending cycle of stupidity.
A list of his notebooks here: http://norvig.com/ipython/
His explanation of the Cheryl's birthday brain teaser, which was posted here last week: http://nbviewer.ipython.org/url/norvig.com/ipython/Cheryl.ip...
edit: One of my favorite bits: Norvig even takes time to describe his thought process behind what most people would consider the most basic object-oriented design: how to construct a simple Point class, and then explains how that will affect his design and implementation of subsequent functions and routines: http://nbviewer.ipython.org/url/norvig.com/ipython/TSPv3.ipy...