Future Computers
Monday, 3. September 2007, 05:03:07
I saw an essay entitled Building Robust Systems [pdf] by Gerald Jay Sussman on J. Paul Morrison's wiki. The paper has a lot of interesting ideas, mostly about how biological systems work. Interestingly, Alan Kay wanted his notion of object oriented programming to be based on cells. As I said in the past, it's unfortunate that he chose a sequential framework instead of a parallel one where cells may send messages at any time.
My main problem with the paper (though I highly recommend it) isn't its contents, but rather our lack of understanding of how biological systems work. Frankly, we just don't know enough. This is both discouraging and encouraging at the same time. On the one hand, not knowing how biological systems can produce such complex and reliable systems from simple sets of initial conditions can be infuriating. We need to unlock this knowledge somehow. But herein lies the encouraging fact: there is a silver bullet or equivalent waiting to be found in the properties of living organisms, where even withering away sparks new growth in the sea of possibilities.
If it weren't for our very existence as biological entities, we would never know that there is a silver bullet. Realise that by silver bullet, I make no distinction between one single invention and a cumulative effect, just as long as the results end up with the same amount of progress. As such, it gives us one reason to strive for better solutions instead of the stagnant approach that most of computing seems to have taken, in other words the belief that there is no silver bullet and therefore no reason to try. If anything, one should know that adaptation isn't about a single silver bullet, but a progression of small changes. If you look at what makes adaptation possible rather than at the changes that must occur, adaptation is certainly a silver bullet. Either you have it or you don't.
Features
The main features of biological systems mentioned as beneficial to computing are redundancy, degeneracy, exploratory behaviour, localisation, regeneration and composability. These are all fascinating topics. They are covered in the essay, but I think they are worth revisiting, so I'll describe each, occasionally using examples from the essay.
Redundancy is something that computing has known and implemented for a while. Perhaps not universally, but enough that where it really matters, technology has been made available for this purpose. RAID is probably one of the better-known solutions for home computers. Versioning software is now commonplace, though I still find the lack of integration in existing tools to be behind the times.
Degeneracy is something I really find fascinating. It's something that goes beyond redundancy. With redundancy, if one part fails, another identical part can take over. Although this is technically degeneracy (as I understand it, which isn't saying much), degeneracy can go far beyond that. Basically, it involves finding another mechanism for producing the same result, often a mechanism that has a different purpose. Perhaps it'll be slower or less efficient, but it still gets the job done. An example that comes to mind is people who can write with their feet or mouths. Sign language is also a notable example.
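As a rough illustration of the difference (not taken from the essay), here is a minimal Python sketch. All the names in it (`MechanismFailed`, `multiply_directly`, `multiply_by_repeated_addition`, `first_working`) are made up for this example. The primary mechanism is assumed to be broken, so the result comes from an adder that was built for a different purpose: slower, but it gets the job done.

```python
class MechanismFailed(Exception):
    """Raised by a mechanism that cannot currently do the job."""


def multiply_directly(a, b):
    # Primary mechanism, built for this purpose.
    # For the sake of the example we pretend it is damaged.
    raise MechanismFailed("multiplier is out of service")


def multiply_by_repeated_addition(a, b):
    # Degenerate mechanism: addition exists for another purpose,
    # but it can still produce a product, only less efficiently.
    total = 0
    for _ in range(abs(b)):
        total += a
    return total if b >= 0 else -total


def first_working(mechanisms, *args):
    # Try each mechanism in turn and report which one produced the result.
    for mechanism in mechanisms:
        try:
            return mechanism.__name__, mechanism(*args)
        except MechanismFailed:
            continue
    raise MechanismFailed("no mechanism could produce a result")


name, product = first_working(
    [multiply_directly, multiply_by_repeated_addition], 6, 7
)
print(name, product)
```

Plain redundancy would be a second copy of `multiply_directly`. Degeneracy is the fact that a structurally different part can stand in for it.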
As I'm reading the essay, it seems degeneracy goes further still. While a system is inoperative or irrelevant because of environmental changes, it is free to mutate and update itself so that it becomes useful again. When it was active, other systems would have been free to mutate in case the environment did change. In this way, the whole can evolve and adapt to external changes. You can think of this as preventative measures, but it's really more than that. It appears to be a dynamic network of interactions that is continuously fine tuning itself. The essay mentions an example that I didn't really grasp because I'm not a biologist. I think a better example is perhaps the way a yew tree, when it is about to die, will grow roots into its rotted core (which now acts as compost). This allows the tree to stay alive until its environment hopefully changes for the better. At that point, the tree will grow bigger and better than before. I saw this on the Discovery Channel, so the reader will have to check the exact details of how this works.
Exploratory behaviour is where you basically use discovery instead of specification. Generate and test is how the essay describes it. The example in the essay essentially describes a way to create links between different nodes. If you have a set of nodes laid out on some kind of graph, you don't directly link them together. Instead you use another system based entirely on discovery. This system simply expands out from existing points on the network. Say we start at an initial node: the network would expand in all directions. When it reaches another node, the link between them is stabilised. Even if parts die off, it'll be able to reconstruct itself. All other unconnected parts will eventually die off after a while, but the frontmost parts of the network will still keep growing in all directions. In fact, all parts of the network are continuously growing outwards and inwards. This allows the network to interconnect by using the simplest of mechanisms. Adapting this to computing isn't so easy, but I'm reminded of vladas' theory of infinite computing. In the future, we will have vast amounts of processing power and resources at our disposal. There is no reason that these kinds of exploratory systems can't be built. Mobile nodes could remain connected by using this technique, or at least be certain that they will be reconnected in the future. I'm sure other uses abound. By coupling two independent systems, you can achieve results greater than the sum of their parts. The discovery system is developed independently from the stabilising system.
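To make the generate-and-test idea a little more concrete, here is a small Python sketch. It is only an approximation under stated assumptions: the nodes sit on a rectangular grid, growth is an ordinary breadth-first expansion from the start node, and "dying off" simply means that explored cells which never reached a node are thrown away. The function name `explore` and its parameters are invented for the example. The essay does not describe it in these terms.

```python
from collections import deque


def explore(width, height, start, targets):
    """Grow outwards from `start` one grid cell at a time.

    Whenever the growth front reaches a target node, the trail leading
    back to the start is kept ("stabilised"). Everything else that was
    explored is simply discarded when the function returns.
    """
    parent = {start: None}          # how each explored cell was reached
    frontier = deque([start])       # the growing edge of the network
    remaining = set(targets) - {start}
    stabilised = {}

    while frontier and remaining:
        x, y = frontier.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if not (0 <= nx < width and 0 <= ny < height):
                continue            # growth cannot leave the grid
            if nxt in parent:
                continue            # already grown into this cell
            parent[nxt] = (x, y)
            frontier.append(nxt)
            if nxt in remaining:
                # Test succeeded: walk back to the start and keep the link.
                remaining.discard(nxt)
                path, cell = [], nxt
                while cell is not None:
                    path.append(cell)
                    cell = parent[cell]
                stabilised[nxt] = path[::-1]
    return stabilised


links = explore(5, 5, start=(0, 0), targets=[(2, 0), (0, 3)])
for node, path in links.items():
    print(node, path)
```

Note how the growth step knows nothing about where the nodes are, and the stabilising step knows nothing about how growth happens. That separation is the point of the paragraph above.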
Localisation is the grouping of functionality or information within one entity in relation to its environment. On the surface, this seems a lot like Object Oriented programming. Indeed, cells have this trait where they contain a multitude of functionality, but only use certain abilities determined by their involvement with a particular organ. This allows the same initial set of "code" to be reused. The part that differs is the environment. In biological systems, it appears that you don't tell the cell what to do or not to do. Instead, the environment and interactions with other cells activate or shut down certain functionality. The cell should then use the remaining active functionality and thus already knows what to do. This is 180 degrees opposite to what Object Oriented programming is about. It's quite a new twist to this idea of what cells are supposed to be. It even turns upside down the notion of requests and messages. There are none.
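A toy Python sketch of this inversion follows. It is purely illustrative: the `Cell` class, its three abilities and the dictionary used as an "environment" are assumptions made up for the example. Every cell carries the same abilities, and nothing ever sends a cell a request. The cell only reads the signals around it and carries on with whatever remains active.

```python
class Cell:
    # Every cell carries the same full set of abilities (the shared "code").
    ABILITIES = {
        "contract": lambda: "contracting",
        "secrete": lambda: "secreting enzyme",
        "conduct": lambda: "passing signal",
    }

    def __init__(self):
        self.active = set()

    def sense(self, environment):
        # The environment is a plain dict of ability name -> bool here.
        # It activates or inhibits abilities; it never issues commands.
        self.active = {
            name for name in self.ABILITIES if environment.get(name, False)
        }

    def live(self):
        # The cell just does whatever is currently active.
        return sorted(self.ABILITIES[name]() for name in self.active)


muscle = {"contract": True}
gland = {"secrete": True, "conduct": True}

cell = Cell()
cell.sense(muscle)
print(cell.live())
cell.sense(gland)   # same cell, different surroundings, different behaviour
print(cell.live())
```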
Examples of localisation abound in our everyday world. Just take water. We don't tell water to do anything. Instead, we change the environment and let water exert its natural abilities. By fixing the ambient temperature, you can inhibit its transformation to other states. By changing the temperature, you can enable it. By changing the landscape, you can let water flow to your fields or wherever else it is needed. And water evaporating from lakes and oceans produces rain. All this is actually contrary to the "no side-effect" doctrine. It also goes against the notion that data cannot do anything. But it does support the notion of data flow programming, where I believe the data should have the innate ability to go from node to node of its own accord. So data does have some abilities. It's a fine line, but I believe it to be extremely important. The example in the essay mentions tagging in order to use localisation as a transport mechanism. The tags specify how the environment should change in order to send the water to its appropriate destination. This allows you to send the water where you want according to the tags. Biological systems use this technique for removing waste at the cellular level. In computing, we tend to hardcode this concept. I wonder if there isn't more flexibility, yet to be discovered, in decoupling the transport mechanism.
Regeneration is the ability to repair or produce replacement parts. In cells, each one can potentially do the job of another. So repairing cells that are all the same is vastly easier than repairing a multitude of different and unique parts. Also, if cells get attacked or damaged, other cells can stop their current jobs, relocate and reset their active abilities to keep the damaged section operative until it can be repaired. This also makes it easier to differentiate between local and foreign organisms since all the local ones are the same. I'm also guessing this is what makes a virus that takes over cells so much more difficult to detect. In computing, we rarely see anything that has a signature. One problem with TCP/IP is the inability to trace where a packet comes from. Anyone can spoof a packet and attack computer systems by making them think that the data came from a trusted network. Email has this problem too.
Composability is the ability to have different systems work with each other. In computing, we use protocols. As noted in the essay, a problem with this is that the multitude of parts in large software systems requires a vast array of protocols, from function call arguments to OO object formats to SQL to any other external messaging system. In complex biological systems, it is claimed that there isn't enough initial information to specify the final configuration. So there must be a way for systems to self-configure. I don't even know where we would begin to attempt such a thing in computing.
Consequences
The essay then goes on to talk about how these concepts would apply to computing. There are some good suggestions, but this half of the paper seems less important to me than the first half, mostly because I believe the reader should be given free rein to see for himself how these ideas can benefit his software. I want to do just that. I want to talk a little about how this applies to Project V.
Redundancy we can do. In fact, this is how I'm going to implement data paths when a Project V network is active. We don't want a failed path to result in a system that no longer works. I'm also going to implement degeneracy. Nodes will have multiple uses. In fact, all nodes are the same. I can activate or disable certain properties depending on the requirements. This fits in well with the concept of localisation. For the developer, the natural ability for data to move from node to node and the fact that types are a method of tagging make localisation a natural fit. Controlling the network, its connections and flow properties is just as important as the data flowing through it. For example, setting buffer sizes, increasing the priority of certain channels and even reversing the flow to restart an activity can and should be as flexible as the functionality available to manipulate data. In most languages, you cannot control anything about the environment.
I want to qualify this notion of environment. Most programming languages define the environment within the language itself. That environment does not mean much. That is not what I am talking about. The environment needs to be separable from the entities that do the work. For example, if I want to change the protocol of a function (the opcodes, stack, etc. and not the task of the function), I should be able to change the environment to do so. If I want to change a protocol after the software is compiled, I should be able to do so. In Project V, the communication channels and network are separate from the components and data. Although this isn't strictly discovery behaviour, it does enable certain properties of it. Eventually, I hope to expand this idea much further. Right now, I like the fact that you design the task and also design the environment in two separate phases. This is actually what allows my definition of portability. Just to be clear about VMs: you cannot change their environment, which is why they do not work for what I want to do.
I want to return to localisation for a moment. In past comments, I mentioned how you could use dynamic subtyping. This is a perfect tool for localisation. By dynamically subtyping data, you can enable or disable functionality. If you decide to send components (instead of raw data) through data paths, then this would be doubly effective. The perceived power of closures would be insignificant compared to this. A difference from closures is that a component does not retain links to any other parts as it travels around the network. You can still make it do so, but it won't be localisation, as this effectively kills concurrency.
An example where you would use this is where you pass around a resource. Resources can be dealt with in many ways. But if a part of your system only needs certain functionality, there's no need to have access to all of that power. If a developer tries to use this resource in a more expansive way than the subtype defines, they will be notified that this isn't possible. It also prevents incorrect actions from taking place with possibly corrupt input that comes from external sources.
An actual example comes from my backgammon game. When a new user is created, it should only have certain functionality, like verifying authentication. Right now, I have to manually check its state so that it doesn't corrupt the rest of the system. A user waiting for authentication is put in a waiting list. If it were allowed to even send a text message to another user and an error occurred, the error reporting system would crash because it would get a NULL pointer when searching for the initial sender, since that user is in a different list. At the very least, it necessitates that I check for NULL users. Besides, I shouldn't have to check for this. It should never happen. With dynamic subtyping, I can make sure that no improper action can take place. After authentication, I can subtype it again by giving it regular user functionality or even admin status and putting it in the proper list. But if you have to manually check settings every time you issue a command, this is cumbersome and error prone. I'm sure you can have clever solutions, but this is just an analogy. Project V has multiple dispatch and I need something this simple.
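Project V code isn't shown here, so the following is only a Python approximation of the idea, with every class and method name invented for the example. Python lets you reassign an object's class at run time, which is used below as a stand-in for dynamic subtyping: a pending user simply has no `send_message` ability until authentication succeeds, so there is nothing to check by hand. The password comparison is a placeholder, not a real authentication scheme.

```python
class User:
    def __init__(self, name):
        self.name = name


class PendingUser(User):
    """A user on the waiting list: the only thing it can do is authenticate."""

    def authenticate(self, password, expected):
        # Placeholder check; a real game would verify credentials properly.
        if password == expected:
            # Re-type the same object: it now has regular user abilities.
            self.__class__ = RegularUser
            return True
        return False


class RegularUser(User):
    def send_message(self, other, text):
        return f"{self.name} -> {other.name}: {text}"


class AdminUser(RegularUser):
    def promote(self, other):
        # Give another regular user admin abilities.
        other.__class__ = AdminUser


ann = PendingUser("ann")
bob = PendingUser("bob")
bob.authenticate("s3cret", "s3cret")

print(hasattr(ann, "send_message"))   # the ability does not exist yet
ann.authenticate("s3cret", "s3cret")
print(ann.send_message(bob, "your move"))
```

An attempt to call `send_message` on a pending user fails immediately with an `AttributeError` at the call site, instead of surfacing later as a NULL sender in the error reporting code.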
Right now, most programming languages don't allow sending objects over a network. And if they do, you encounter certain problems. If you send over an object with all its functionality, there is no protection from the other machine using any code in that object. Protection that you think is there in a programming language is 100% illusory and/or local. With dynamic subtyping, you only send what is allowed. Concurrency has issues that monolithic languages tend not to consider. This may contradict the notion that all cells are the same, but in this case, it involves sending a cell to a foreign insecure machine and different rules apply.
The Future
I am not striving to build a development environment similar to biological systems. I do like the concepts though. And there are many things that Project V simply isn't ready to make possible yet. Being able to adapt to things that aren't currently requirements isn't likely to show its face for a while. At least, not until we have MUCH more processing power than is available today. In fact, after looking at this essay, I now have an idea for the next generation of software development that goes beyond Project V. Again, I'll need MUCH more processing power. Something that may perhaps become available in 50 years or so. I can do it in hardware or software, but a software solution would require vastly more resources.
This generation of software involves using a continuous environment. Something where no matter what you try to access, there is something there. A physical simulation might work. You could create basic atoms of the system and design algorithms and tools that are simpler, yet may require more processing power. Right now, how do I find an object in memory? I can't just scan memory and be able to effect changes to correct anything. We need an environment where self-discovery is possible. Project V doesn't allow it. But I can use it to create it. I wish I could share what I see. It makes everything so far look primitive, Project V included. I just need more processing power than the world has ever seen. The main algorithm is based on the straight skeleton. You can look that one up on your own. It's used for roof designs a lot. Amazing how construction seeps in all the time.
Hope this has given the reader some food for thought. Also read the essay itself. The second half does have some interesting insights, and the explanations in the first half are more accurate and in-depth than anything I can offer here. This is probably one of the "best" essays I've read in a while, if only for the fact that it shows we've still got a lot to achieve and that it's possible because it actually exists. I want this to be a sign that when we achieve the robustness of biological systems, we can go further still. The future really is what we want it to be. If you let others show you the way and you let yourself believe it, you will follow their future, but not necessarily the best future for you. A fallen leaf, incapable of choosing its actions, has more purpose than that.

