Dijkstra, Proofs and Non-Determinism
Wednesday, 17. May 2006, 22:12:48
I'm reading through a LOT of his manuscripts in the past couple days. I think this guy was misunderstood. When you see things from his point of view, it's easy to see how he formulates his arguments. The story behind the "The goto statement considered harmful" paper is interesting. He submitted a paper titled "A case against the goto statement" to ACM. The editor at the time changed the title to "The goto statement considered harmful" and changed the paper from an article to a letter to the editor so that it would get published faster. The editor was Niklaus Wirth.
The article was about specific cases. It was not meant to be about removing ALL gotos. After all, loops and if statements are built from gotos. If gotos are harmful 100% of the time, then by induction, so must loops and if's. Just when I thought no field was more political than physics and cosmology, I'm proved wrong. Then again, they both have physics and science backgrounds.
Most of his arguments revolve around proving the correctness of your program before even starting to type it in. Although his programs were correct, they weren't always efficient. That may have been the battle between himself and most of the computing community. One example was a program that would print all solutions (X,Y) to the following Diophantine equation:
X2 + Y2 = C
- where C is known.
He decided to start X at 0 and increment Y to its maximum allowed value where X2 + Y2 <= C holds true.
He could just have taken the integer square root, but whatever.
I'll use my own proofs for this example. The algorithm is that since Y > X always holds true, then decreasing Y by 1 and increasing X by 1 means that the new state will be
(X+1)2 + (Y-1)2 ? C
We want to know if the left side is lower. In our program, we want to prove that updating X and Y in this fashion will still be lower or equal than C.
We know that Y > X, so Y has to be at least 1 larger than X as we're dealing with integers. So the difference between Y and X is at least 1.
Y - X >= 1
Rewriting,
0 >= X - Y + 1
Multiply by 2,
0 >= 2X - 2Y + 2
Swap sides
2X - 2Y + 2 <= 0
Adding back our initial state...
2X - 2Y + 2 <= 0
+ (X2 + Y2 <= C)
gives
X2 + Y2 + 2X - 2Y + 2 <= C
Completing squares and reducing makes
(X+1)2 + (Y-1)2 <= C
So we have just proven that as long as Y > X and when we increment X and decrement Y, that it'll still be equal or less than C.
After this, we can keep increasing X to it's max value so that the main equation is still less than or equal to C. Once we reach this maximum value, we repeat the process until X < Y is no longer true. The reason for stopping when X >= Y is because addition is commutative, so we only have to do half the values. Dijkstra has some notations for this too, but I'll dispense with it.
X := 0; Y := 0; while (X2 + Y2 <= C) Y := Y + 1; end while Y := Y - 1; while (X<Y) if (X2 + Y2 = C) print(X,Y); Y := Y - 1; fi else if (X2 + Y2 > C) Y := Y - 1; fi else if (X2 + Y2 < C) X := X + 1; fi end while
The first 'if' statement is only checking for valid values of X and Y and printing them. We just decrement Y to make the equation less than C again. The second 'if' statement checks if the equation is larger than C. This can only happen by incrementing X. So if the equation is larger than C, this means that X has been incremented and inside the condition, Y is decremented. This means that the equation will again be less than C because we've already proven that incrementing X and decrementing Y will make the equation less or equal to C as described above. That's where the proof comes in. The third condition is just incrementing X as long as the equation holds true. The condition isn't actually necessary. You can just use an 'else' statement.
There is something quite satisfying about proving your program before entering it in the computer. The thing is that this program is slow as mud. But Dijkstra's point of view was always that correctness should take precedence over speed. It's no wonder that he got into arguments with people who had to implement software with real hardware constraints. He himself had to deal with these issues and often resulted in overly complex software.
Anyone can see that the integer square root can be used instead of incrementing by 1 to find the initial maximum value of Y. All that squaring is ridiculous. Also, the condition can be pulled outside of the 'if' statement. Instead of computing the square every time for the condition, you can notice that squares are always separated by a linear progression. {1,4,9,16,25...} The separation is {3,5,7,9,11,...}, and since we're either incrementing or decrementing squares by a single value, find out what the separation value is instead and use that to update the value for the equation at every iteration. This way, you completely eliminate the squaring except the initial ones. At each step, you can use simple addition.
X := 0; Y := IntegerSquareRoot(C); DX := 1; DY := (Y-1)*2+1; E := Y2; while (X<Y) if (E = C) print(X,Y);fi if (E >= C) Y := Y - 1; E := E - DY; DY := DY - 2;fi else X := X + 1; E := E + DX; DX := DX + 2;fi end while
We can even eliminate X and Y altogether.
SQ := IntegerSquareRoot(C); DX := 1; DY := (SQ-1)*2+1; E := SQ2; while (DX<=DY) if (E = C) print((DX-1)/2,(DY+1)/2);fi if (E >= C) E := E - DY; DY := DY - 2;fi else E := E + DX; DX := DX + 2;fi end while
This poses a real problem though. What solution do we use? The proven one that can be read quite easily? Or the optimized one that requires more 'proofs' and alterations to the algorithm in order to make it faster? This new version isn't as easy to see what's going on. How do you resolve this situation?
I now want to move on to something else that Dijkstra has often talked about. It's about concurrent programming and non-determinism. One example he gives is having two processes running at the same time on two processors where they have shared memory and can do atomic reads and writes. Is there a way to synchronise both processes so that they can accomplish the task at hand. While you can indeed get them to work together, non-determinism has a way of creeping in. There are situations, and plenty of them, where if you're not careful can result in a non-deterministic state. Meaning that it's possible to get into a state where no work can continue or where errors cannot be recovered.
My question is: Are there other definitions of non-determinism? This is related to another paper that's been circulating the last little while about certain things that can't be guaranteed or are just impossible in a distributed environment. My problem is that this is an absurd way to write software. It's like a peer to peer network where there is no node that is in charge. Most peer to peer networks don't care if one node disappears. But if all nodes are trying to accomplish a task, it would be helpful if a certain number of those were in charge and yet another one in charge of these. In this way nodes can request help or controlling nodes can alter the configuration of the work done depending on the state of the network. Having a free for all is a recipe for disaster. I just don't understand this obsession with creating unnecessary problems.
I would rather have a piece of software where I can prove that it works correctly. And then altering its runtime environment to multiple processing nodes on the fly. The trick here isn't the proof, but rather that it works correctly on one computer or some relatively simple system. Then an external runtime controller would make sure the software is executed properly. Synchronising different systems and processing nodes shouldn't be part of the actual software. That should be part of the runtime environment that controls the execution of your software.
Something that makes this difficult is exactly that we DON'T make sure our programs are correct. If everything was checked like it should, then even if failure happens, your software should be able to recover.
Other than blind processes, is there any other discussion about non-determinism? Blind processes are a BIG waste of time as there are better ways to write software. Just my opinion, but blindfolding a crowd and telling them to somehow coordinate themselves without the use of a leader is not my cup of tea.
While proving things is nice and all, I've shown that it doesn't necessarily yield the best results. So just because something is proven to not be 100% reliable doesn't mean there aren't easier and better ways to accomplish the same tasks with more than acceptable reliability. I can prove that I will lose my watch if I throw it in the river. By the same token, I'm pretty sure I won't lose it if I don't throw it.


How to use Quote function: