Monday, July 25, 2011

Rant on P versus NP

The weekend spent on absorbing information relating to P's relation to NP, I shall now solidify my understandings through regurgitation of what I've understood.

First, the types of texts that I've found either went deep into the topic (from the theoretical perspective) or brushed lightly upon it (the way a book on programming treats mathematics).  I'll try to walk a fine line between both while maintaining clarity.  Hopefully it works.

What are P and NP?

These have to do with timing.  Not in seconds, but in 'effort'.  For example, consider sorting books in a book-shelf.  You could scan through the 'N' books to sort, find the smallest one (or whatever sorting criteria you want), and put it at the beginning of the shelf.  This operation could be repeated for the remaining books.  So, to sort 'N' books, you'd look at (on average) 'N/2' books to find the next one to place to the left of the shelf.  That involves looking at 'N^(N/2)' books.  In the big picture, we say 'N^N' (N to the N) operations will be needed.  As the number of books increase, this method of sorting will be excruciatingly slow.  It is left as an exercise to the reader to translate looking at book titles to actual time.

They also have something to do with 'polynomial time'.  That is, a number that can be expressed in the form of 'N^x' where 'N' is the number of elements to work on and 'x' is a constant.  Sorting books according to our previous example takes much longer than polynomial time (think of very large 'N').

P looks at problems that take polynomial time.  Specifically, deterministic problems.  Typically involving Turing machines (but I don't want to go into the details of tapes).  Essentially, any problem whose solution expressed as a series of unambiguous steps takes polynomial time is in P.

And NP also looks at problems that take polynomial time.  This time, problems whose solution is expressed non-deterministically.  Back to our book sorting problem, we looked at 'N/2' books (on average) to find the next book to place on the shelf.  But what if we just 'looked at all the books at once' and happened to pick the right one.  Something similar goes on with 'Oracle machines'.  However, NP is easier to understand as the time it takes to verify the problem.  In other words, our book sorting would take 'N' time in non-deterministic polynomial time (NP-time)

And why is P versus NP important?

The crux of the problem: are all the problems in NP also part of P?  Essentially, NP problems are easy to verify but can be hard to compute.  Like our sorting problem, we only need to inspect each book once to confirm that they are sorted.  Similar ideas are used in cryptography -- a very difficult problem must be solved by the person who attempts to intercept the message.

Problems in P are those that are practically solvable (well, practically this is debatable, theoretically this is true).  There are problems that are much more difficult.  Let's call these NP-Complete (ok, technically NP-Complete problems are a class of problems where if one is easily solved then all of them can be easily solved.  They are very difficult to solve though).  For example, given a series of numbers, find the subset whose sum is the greatest.  Short of attempting every possible combination of numbers, a solution is hard to find (programmers may chime in with 'dynamic programming' rolling off their tongue.  But, that just stores intermediate results, all potential combinations are still evaluated.).

Back to encryption, if P = NP then there is a way to easily decrypt data.  From what I read, a lot of things would become much easier.

But that is the essential nature of P = NP as far as I can tell for now.  I might change my mind as I become more informed.

No comments:

Post a Comment