Skip navigation.

exploreopera

| Help

Sign up | Help

Software Development

Correcting The Future

Hierarchies and Equivalence

I've come to believe from what I've read online on other blogs and in aggregate feeds that programmers don't understand what they are doing. Sure, they understand the primitive tools they are using, but they don't see the big picture. They don't see the final structure of what they are building.

So I'm going to be point blank about hierarchies. There are TWO kinds of hierarchies. There is the stacked kind and the flat kind. Here is the stacked kind.



And here is the flat kind.



See the difference? In the stacked hierarchy, the lower items support the top ones. But in the flat hierarchy, an item is only a portion of each layer. For example, sections of cities and other jurisdictions make up a district. Districts make up a state. States make up the nation. Nations make up the world. But in each case, there is only ever ONE level. We know this because the surface is relatively flat. We don't stack districts or cities one on top of the other. Each layer is not distinct from the last, but instead groups more and more of it together.

The TRUE difference is this. If you pull out any card except those on top, the whole thing comes crumbling down. But if California ever decides to leave the US and form its own nation, then the US will still be there, just a little smaller. Removing one piece doesn't affect everything else in the hierarchy. Will certain things need to be updated? Sure! But most of those would be political ramifications. It could later rejoin just as other states and territories have joined in the past.

In the stacked version, integrity of the entire structure is critical. It's also critical that none of it changes. Like I said in my previous article, you use this kind of structure when you have a final monolithic design that you want to remain unchanged for thousands of years.

With the way we write software today, we use the stacked version. This should be obvious. When you call a function, you will create a chain of events that will cause other functions to be called until you get a return value, if it creates one. Trace all the calls and you will see that it charts a tree structure. A tree structure is a stacked hierarchy. What can change easily are the top most functions and data. In fact, this is even more obvious when you think about what part of your software can change at runtime. Your software updates data at runtime. It can do this precisely because only the top most part (data) of a stacked hierarchy can change. This is also why the rest usually stays static and why self modifying code is frowned upon. Something else that's possible using this model is adding on top. So you can add extensions. But only on top of what's already there IF the programmer took the time to allow extensions in the first place.

Now think about what we have as reusable code today. Library functions. These are the highest parts of the stacked hierarchy. The reason there is a push for abstractions is because these libraries can then encompass several of the final layers of your software. If your software used to have 10 layers (10 nested function calls max at any given time) where library functions took up the 10th layer, that left a lot of manual work for the programmer. But if the libraries can take care of more details and abstract more functionality, then it can encompass, say one to four layers. All of a sudden, you now only have to handle 6-8 layers instead of 9. This is a reduction in complexity of more than 30%. Not only because the top layers are taken care of by the libraries, but because lower layers have much fewer functions in them (for example, the first layer can usually only have ONE main function).

Essentially, all we're doing is pre-packaging the top layers so that our software looks smaller and more maintainable. Does it work? Absolutely! But there are limitations. The only replaceable parts are on top. The lower layers, the foundation, cannot be replaced or even updated without causing unforeseen consequences. I don't care if any one programmer can set up a correct system. It's that conventional languages tend to steer you in the wrong direction. You have to fight the natural tendencies of these languages to be monolithic. This isn't a failure of the programmer. It's a natural physical property of this type of stacked structure.

The flat hierarchy doesn't suffer from any of these problems. You can replace any part at any time and it'll only directly affect its neighbouring components. The integrity of the entire hierarchy is not in jeopardy as it would be with the stacked hierarchy. This is why the Internet has had every single part replaced at least twice without ever shutting it down.

Flat hierarchies also have the advantage that any item can be in several different groups at once. You're not limited to any ONE categorisation. Remember the problem of determining if a shape is a rectangle or a square? What type is it if all sides are equal length? With a flat hierarchy, you can have it in both types when of equal length, and remove it from the square type when the sides are not equal length. Why can you do this? Because the integrity of the type system is not lost by removing part of it. So there is no problem removing an item from a type. The rest of the type system will still work fine. You can also include in multiple types because a flat hierarchy allows for this. It's how it was designed to work.

Now compare to a stacked hierarchy. When you check a type, you probably expect it to stay static! Why is that? Why do you expect something to remain immutable and global? Well, because that's the only way stacked hierarchical type systems can work. That's the way you expect them to work. If you start removing derived types at any given time, the integrity of your type system falls apart. Imagine removing a base class at runtime from your type? You'd lose that functionality from everything that derives from the derived type. Chaos! And this is exactly like the house of cards above! In this case the picture is upside down, but the structure and effects are the same. The base types are actually the bottom of the hierarchy. Those base types hold up everything else. If those disappear, everything else collapses.

In Project V, I use a flat hierarchy even for the type system. You can remove a base type if you want at runtime. It won't affect anything else because each item is independent from everything else. It works on the basis of offsprings. It copies whatever it needs from its parent types, but leaves those parents alone. This means you can have a shape have whatever type you want at any given time. It can morph and adapt.

Think of this. Say you have a class record that represents a customer. That customer IS A Canadian. But what do we do? We don't assign a base Canadian class. No way. You can't change it if there was an error, or that person changes nationality. It gets worse the more local you get. What about the state or province? Do you have a customer record have a base type of the state of New York? Of course not. How about someone who's from Buffalo? The IS A relationship is quite obvious. But no. We have members that hold all this information.

Could we do it the other way? To have base types that specify how to categorise an entry? Of course. You just need to have a type system that is not based on stacked hierarchies. Project V can easily include items in different categories. In fact, this is how it implements templates. You can easily change these categories at runtime. These are called primitive types though they can contain anything you wish. It's not really the IS A categorisation. It's rather IS ONE OF. I am ONE OF many Canadians. This works exactly like choosing ONE number OF many in the set of Integers. This feature is missing in most programming languages. In Project V, everything's a set, so it's no big deal to handle this. If you've seen my last post about Project V's type system, then you would understand how the Enum type handles categories. The Options set would be updated at runtime for any given category. This leaves the original item unchanged exactly as is required by set theory where you can put items in any give set (or category).

Does Project V have static base types? Not really, though I certainly encourage you to leave types static. Besides, types aren't global entities as they are in conventional languages. When two components are linked together, only the two ends of that connection need to be aware of the type of the data that goes across that link. No other part of the system is aware of or cares about that piece of data and its type. IOW, there is only ever one recipient for any given piece of data and its type. It's not like in OOP where any method can access the same piece of data any time it wants.

Next time you come across the word hierarchy, ask yourself if it's a flat hierarchy or a staked one. And remember that anything that requires leaf nodes (like trees and function invocations), that you're using a stacked hierarchy and that you should be aware of the limitations therein. These things should not be dismissed and brushed aside with rash comments that one can organise their data and code in a way where you don't encounter theses issues. That's living in denial and in complete disregard for those who must maintain that code. The features of these two hierarchies are properties of the Universe. They cannot be avoided with simple wishful thinking.

This is also why the notion of equivalence is so dangerous. Can you achieve the same kind of results with both kinds of hierarchies? Yeah, they're equivalent. You can convert one into the other. But that's not all there is to the story. One kind requires superhuman memory and intelligence to control after a certain size while the other has no such requirements. Equivalence is only useful once you remove everything that is of importance or of consequence and as such is completely useless as a feature. If you remember anything, remember that equivalence doesn't take complexity, time, external factors or anything else into account.

Refactoring (Updated)Link July 30th, 2008

Comments

avatar
It sounds like you can remove types from runtime, but any currently instantiated objects of that type still stick around, all you are removing is the ability to create new instantiations of that object. What happens when the system is then restarted?

By spc476, # 25. July 2008, 20:42:13

avatar
Also, could you please explain how this would work with a real application, say, an email client. Start with a client that understands the Unix mbox format and talks SMTP for outgoing email. Replace the mbox with IMAP. Then replace SMTP (which is a store and forward architecture) with something different (say, this proposal).

My intent is, how would a Project V based design around mbox and SMTP (say, written in the early 90s, way before spam became a problem, but way after it pretty much killed off other competing email protocols) be any easier to modify for a vastly different architecture (under the hood—the user interface doesn't have to change at all) than what we have now.

By spc476, # 25. July 2008, 20:51:01

avatar
It sounds like you can remove types from runtime, but any currently instantiated objects of that type still stick around, all you are removing is the ability to create new instantiations of that object. What happens when the system is then restarted?


It picks up where it left off. Or you can scrap it and start up the original. BTW, you don't exactly remove types from runtime. Each entity is an individual. It has certain characteristics that you can compare to what you know of the world. And by *you*, I mean another such object (or indivdual) within the system. Things don't just vanish. You can also retrieve types from any machine and move them around. The current machine will likely REMEMBER a lot of different types and have lots of functionality at its disposal even if the actively running system does not.

My intent is, how would a Project V based design around mbox and SMTP (say, written in the early 90s, way before spam became a problem, but way after it pretty much killed off other competing email protocols) be any easier to modify for a vastly different architecture (under the hood—the user interface doesn't have to change at all) than what we have now.


There are two different ways. One involves changing the app. The other does not. When you build your client, you define an interface for the data store. This only involves how to get data and how to store any if needed. Then you can define on your machine what the default is for that interface. You want to use mbox, then define it as the default. You want to use IMAP, then use that as the default. This is OUTSIDE your app. You are instead adding functionality to the machine. You app is also an addition to the functionality of your machine. It will now remember how to do things according to default implementations for each definition.

We need to move away from defining software as we used to. Instead, we use what the machine itself knows how to do. If it doesn't know how to do something, your app can have a default implementation which the machine will then store locally (ie. your machine LEARNS!) until something better comes along. It will use this implementation for other software as well if it encounters the same definition. It will also remember what each app uses and will at time use many different implementations at the same time to compare results and see which one is faster and uses less memory, etc. Authors of these implementations can also update them. You can broadcast changes over the Internet to not use a faulty implementation anymore. Or to use component X when possible.

If another machine connects to yours and doesn't know how to handle the data, but can handle the definition, machines can pass around implementations so that machines learn from other machines, even if the apps are completely different.

I wish machines could create definitions on their own, but I haven't invented an AI yet. So this is what I think is the best compromise. We define definitions for machines to use. But machines can determine on their own and learn on their own what implementations to use and when, with the occasional help from humans of course.

This means your application's parts could be 100% replaced without you even knowing about it. So implementing the low level parts will likely be a complete waste of time.

If you were using a sorting algorithm that was kind of slow, you can tag your app to use a different one. This will spread around the Internet and your app will start using the new algorithm you specified. But guess what? The machines will likely have already determined that the old algorithm was no good and are probably using a better algorithm than the new one you chose. So the machines could completely ignore your recommendation (unless you hardcode the implementation which is your right as a developer).

I'm not saying you can use these features all the time. But with Project V, you will hardly ever write custom components. It simply won't happen that often. Instead, you'll be putting together apps from parts that already exist.

Also, when you make custom changes beyond what's mentioned here, any component you modify only affect the neighbouring components. You can subdivide your software in ways that are simply impossible in today's programming languages. This is really what I'm after. The rest is just fluff.

By Vorlath, # 26. July 2008, 05:05:01

Write a comment

Comment
(BBcode and HTML is turned off for anonymous user comments.)

Please type this security code : 53e030

Smilies

October 2008
SMTWTFS
September 2008November 2008
1234
567891011
12131415161718
19202122232425
262728293031