Well yesterday I read a blog post by Jeff Atwood over at www.codinghorror.com about NP-completeness. In the comments to his post I felt that Jeff got a somewhat raw deal with the criticism he received. Ok, the details of his post were not exactly accurate to the nth degree, but his post may have identified a problem set that some of his many readers may never otherwise have been aware of. This has got to be a good thing, right?
So anyway, with a little experience on my side with these NP-complete problems maybe I will try to shed some further light on the details. Remember though, this is quite a complicated area and many books have been written about this stuff and many will save a whole chapter just as an introduction. Thus, I will try to keep this as light as possible and hopefully the detractors can fill in their own details where I leave them out.
First, one of the biggest questions in the field of computer science is does P=NP
. For those that wish to pursue research in this area, there is the Millenium Prize Fund offering a cool $1million for a proof either way of this question. Those living in the UK will be hoping that it’s paid in dollars with the piss poor exchange rate 😉 .
Now what is P
? Well P
is the set of problems that can be solved in polynomial time. That is, a problem where an algorithm is known to take O(nk)
, where n is the size of the input and k is some fixed constant; fixed meaning it doesn’t vary with the input size. An example of such an algorithm is Quicksort, whose complexity is O(n2)
.
So what is NP
? Well first, NP
does NOT stand for Non-Polynomial, if you want to take anything for this just take that, as it will save you embarrassment when you say it is (sorry I’ve seen this so many times). It actually stands for Non-deterministic Polynomial. Now when is a problem in NP
?
For a problem to be in NP
, the problem must be a decision problem. See, this is where it escalates trying to describe this stuff, as you now have to define a decision problem for all this to make sense. Anyway, a decision problem is a problem where the answer can be given as a simple ‘Yes’ or ‘No’. For example, given two cities is there a direct plane route between these cities. Clearly the answer is Yes or No. For a more computing/maths related example we can say that given a graph G, does there exist a path between two vertices, say u and v, in G.
Now that we know what a decision problem is, we can now describe what the class of problems in NP look like. Mmmmm, if only. In actual fact, I now need to describe what a non-deterministic algorithm is. Sigh. Let’s keep it simple if not 100% accurate. A non-deterministic algorithm has two phases. The first phase guesses an answer (certificate) to our problem and the second phase verifies that this certificate satisfies the constraints of the problem in polynomial-time. For example, consider the problem of matching 3 sets of people (known as 3D matching). Let’s name these people men, women and dogs, with each set having size n
. So what we want is a set of (man, woman, dog) triples, such that each man, woman and dog appear in exactly one triple and the matching is of size n
, i.e. everyone is matched. So the decision problem here is: given such an instance does there exist a matching of size n
? The non-deterministic algorithm is: guess a set of (man,woman,dog) triples, then first verify that our set of triples is of size n
, by counting the triples, and then to ensure that it is a matching, we can check that no two triples have someone in common.
Phhhhewww, that was longer than I imagined, see why Jeff’s abbreviated version may be better if not so accurate. I mean I have not even described what an NP-complete problem is yet. We can however describe what the class of problems in NP
now are.
The class NP
is defined as the collection of all decision problems that are polynomial-time solvable using a non-deterministic algorithm. We note here that many of the problems we see in everyday (programming) life are defined as optimization problems rather than decision problems – an optimization problem is one where we seek to maximise or minimise some property, for example the Travelling Salesman Problem. However, it is known that all optimization problems can be re-written as a decision version of the same problem.
So can we now define what an NP-complete problem is? Mmmm just about. In fact, I will state it here and then describe the one further bit of information we need. A problem X is NP-complete problem if:
- X is in
NP
; - X is NP-hard.
Often you hear people getting confused by NP-complete and NP-hard. They are different things, an NP-hard problem need not be NP-complete, basically NP-hard problems are at least as hard as those problems in NP
. Also, if a combinatorial decision problem is NP-complete we say that the optimisation problem is NP-hard.
So what does it mean for a problem to be NP-hard? Well to show that a problem is NP-hard we must show that it is polynomial-time reducible to an NP-complete problem (in fact, to be precise, what we have to show is that it is reducible to ALL NP-complete problems, however due to the transitivity of reductions we need only show it’s reducible to a single NP-complete problem). That is, we need to show that we can reduce an instance of one problem to the instance of another problem, i.e a Yes answer in one problem maps to a Yes answer in the other and vice-versa, similarly with the No answers.
Now if only we had one problem that we know is NP-complete. Well we do! This was shown in Cook’s Theorem. So now all (and I say all very lightly) we have to do to show that our problem is polynomial-time reducible to satisfiability, or some other NP-complete problem – satisfiability (SAT) is the problem Cook showed to be NP-complete. Another (classic) example of an NP-complete problem is the 3D matching problem example given above.
And there you have it, NP-completeness in just over 1000 words, although I missed a few details out I hope this is useful to someone. Incidentally I have meet Stephen Cook (ala Cooks’ theorem) before and I’m pretty sure from experience he would struggle to tell you this in under 1000 words 😉 Actually maybe not………………..
Over and out.