Project V Type System
Sunday, 2. March 2008, 04:14:46
The first thing to look at is formulating a model of computation that can work on multiple machines. I decided to throw out any model that does not fit the physical model. The reason for this is simple. No software developer can change the way hardware is done. What I mean is that multiple computers are here to stay. More than that, this trend will continue with smaller and more numerous computing machines. Nanotechnology will eventually be upon us. Mobile and portable devices have been around for a while. So connectivity is where it’s at.
Now, let's compare the current computing model with the distributed hardware model. See the mismatch? That's why concurrency is hard. It's not because concurrency is itself hard. It's because functions use the substitution model and hardware doesn't. No matter how many computations you perform, the hardware is still there. With functions, you end up replacing the function call with its result. So there's a serious mismatch in how computations are done when you compare functions to hardware.
How exactly would computations be done with the hardware model? Well, we need to have a way to perform computations and still have the "thing" that performed the computation be there afterwards so that it can process more items. So if you have a computer on the Internet, we need a model that allows us to send it input, have the machine process this data and then send its output to another machine to perform more processing. That is a computing model that fits the Internet perfectly. And you get concurrency for free because machine 2 can be processing the previous output of machine 1 while machine 1 is processing some new data. All machines can be processing data at the same time and that's exactly what we want.
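A rough way to picture this model is a chain of long-lived workers connected by queues. The sketch below is only an illustration in Python, not Project V code: the names machine and run_pipeline are made up here, threads stand in for separate machines, and None is assumed to be a shutdown signal rather than real data.

```python
import queue
import threading

def machine(work, inbox, outbox):
    """A long-lived worker: it is still there after each computation."""
    while True:
        item = inbox.get()
        if item is None:          # assumed shutdown signal, passed downstream
            outbox.put(None)
            return
        outbox.put(work(item))    # send the output on to the next machine

def run_pipeline(stages, data):
    """Chain one worker per stage; all workers can be busy at the same time."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=machine, args=(stage, queues[i], queues[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in data:
        queues[0].put(item)
    queues[0].put(None)
    results = []
    while True:
        out = queues[-1].get()
        if out is None:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results

# Machine 1 adds one, machine 2 doubles. While machine 2 is doubling the
# first item, machine 1 can already be working on the second.
results = run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3])
print(results)
```

Each worker handles its items in arrival order, so the results come out in the same order as the inputs.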
Ok, so that's the computing model. It's often called data flow although Project V likes to use components (processing entities) even for the lowest levels. Data flow normally uses conventional languages for performing computations for each "node" or component. You would then connect these components together according to what you want to do. With Project V, you would use more components for the internals of compound components. You can still use other languages, but that's for legacy applications and other uses.
Next, I decided against any notion of "everything is X". I believe this to be the wrong way to do things. And I actually found out the hard way that "everything is X" has way too many drawbacks. I still try to keep things simple, but I strive for duality. It has a nice sense of balance too.
The core of the type system is built on the duality of sets and items. These are different things, but neither one can exist without the other. Sets can only be found as part of items and items can only be found inside sets. Well, this is true MOST of the time. You can use standalone sets and standalone items if you wish in special cases.
There are two kinds of sets, forward sets and backlink sets. As the names imply, forward sets link to items. And backlink sets link to items that reference this item (where the backlink set is contained). From this, you can form any kind of relationship or network you wish. This is NOT how components are linked together. This is how relationships for types are done.
Items may contain any number of forward or backlink sets. Sets are always named and this is how they are referenced. OTOH, items have a unique ID. Items do have a name, but this is optional in many cases. When items are used as Members, Types or Properties, then a name is required. As one can see, it would make perfect sense to create a set named Members and a set named Properties for those purposes. And that's what I did. I also reserved the backlink set Base for base types as well as the Derived set for derived types. "Derived" is the forward set name that links to the "Base" backlink set. I probably should have given them both the same name as with Members and Properties, but whatever. Members are used at runtime; Properties are just like Members, except that they are used by the compiler or GUI at design time.
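To make the set/item duality a little more concrete, here is a minimal Python sketch. It is not the actual Project V data structure: the Item class, the link helper and the use of plain lists for sets are all assumptions made for illustration. Only the set names Derived and Base come from the description above.

```python
import itertools

_next_id = itertools.count(1)

class Item:
    """An item: a unique ID, an optional name, and any number of named sets."""
    def __init__(self, name=None):
        self.id = next(_next_id)
        self.name = name
        self.forward = {}    # set name -> items this item links to
        self.backlink = {}   # set name -> items that reference this item

def link(source, forward_name, target, backlink_name):
    """Record one relationship in both directions."""
    source.forward.setdefault(forward_name, []).append(target)
    target.backlink.setdefault(backlink_name, []).append(source)

integer = Item("Integer")
boolean = Item("Boolean")

# The forward set "Derived" on the base type pairs with the backlink set
# "Base" on the derived type.
link(integer, "Derived", boolean, "Base")

print([t.name for t in integer.forward["Derived"]])
print([t.name for t in boolean.backlink["Base"]])
```

Walking the forward set from Integer finds its derived types, and walking the backlink set from Boolean finds its base types, so the same relationship can be followed from either end.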
So now you know how Members and type hierarchies are defined. Multiple inheritance is possible, but there are certain things to keep in mind. Because sets can link anywhere, you can create recursive inheritance chains. When this happens, the compiler goes back down the tree during a traversal. So two types can be part of the same hierarchy, yet produce two different type chains. You can specify type A with base type B which specifies base type A. That's perfectly legal. As far as the compiler is concerned, it stops just short of a recursion. So it would see type A having a base class of B. It would also see type B as having a base class of type A. Both at the same time. This may seem like a contradiction, but it isn't. A is independent of B.
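One way to read "stops just short of a recursion" is a traversal that refuses to revisit a type it has already placed in the chain. The Python sketch below works under that assumption. The type_chain function and the dictionary of base types are hypothetical, and the real compiler may well do this differently.

```python
def type_chain(start, bases):
    """Follow base types from `start`, stopping before any type repeats."""
    chain = []

    def visit(type_name):
        if type_name in chain:        # would recurse, so stop just short
            return
        chain.append(type_name)
        for base in bases.get(type_name, []):
            visit(base)

    visit(start)
    return chain

# A names B as its base, and B names A as its base.
bases = {"A": ["B"], "B": ["A"]}

chain_a = type_chain("A", bases)
chain_b = type_chain("B", bases)
print(chain_a)
print(chain_b)
```

Starting from A, B shows up as the base. Starting from B, A shows up as the base. The same pair of types yields two different chains depending on where the traversal starts.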
In order to understand this, we should look at how human descent works. A child will start with the template of both mother and father. This can be seen as multiple inheritance. In Project V, the first parent has priority over the second. Any duplicate properties or types in the second (or extra) parent will be discarded. After this, the child will override SOME of the properties that come from the parents' templates. These overridden properties are the only UNIQUE properties of the child. But the child is completely separate from its parents.
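The priority rule can be sketched in a few lines of Python. This is only an illustration: properties are modelled as plain dictionaries, and the resolve function and the sample property names are made up for the example.

```python
def resolve(parents, overrides):
    """Merge parent templates (earlier parents win), then apply overrides."""
    properties = {}
    for parent in parents:
        for key, value in parent.items():
            # A duplicate from a later parent is discarded.
            properties.setdefault(key, value)
    # The child's own unique properties are applied last.
    properties.update(overrides)
    return properties

mother = {"eyes": "brown", "hair": "black"}
father = {"eyes": "blue", "height": "tall"}

child = resolve([mother, father], {"hair": "red"})
print(child)
```

Here the child keeps the mother's eyes because she is listed first, picks up height from the father because only he defines it, and overrides hair with its own value. The parents' dictionaries are left untouched, so the child stays separate from them.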
Now, we get to a weird aspect of Project V's type system that doesn't work like human descent. In order to understand this, we need to have a slightly different view of the parents. Let's assume that the parents also have UNIQUE properties that they will override. So when you compose a new person, you inherit only the UNIQUE properties and then apply your own unique properties. So if the father was a descendant of the child (I know, it's weird), then you would take the unique properties of the child and override these (and add to them) with the parent's unique properties. That's how you can have recursive inheritance that works. It's perfectly legal in Project V. It's up to the developer to see if this has any uses. Project V actually needs it to bootstrap its type system.
We need to look at two more topics, instances and values. Instances are defined using something called modifiers. A modifier is exactly like a type, except that it can only have one base type. A modifier is actually what you put into the Members and Properties sets. A modifier allows you to override properties and give a name to a specific instance that is different from the type (though being different is not a requirement at this point).
Values are a little more complicated. I originally wanted values to be derived types. So 4 is a derived type of Integer. This is indeed so, but there's a problem. How do you differentiate between a type and a value? Type comparisons are different from value comparisons. What I discovered was a serious problem in most type systems. Once I figured out how this was supposed to work properly, I went looking to see whether anyone had written about it. So far, this topic has remained elusive, mainly because most type systems start off with the notion that items are values in a set and that types are qualifiers. I took the other route. Project V uses the notion that there is no difference between a value and a type as far as sets are concerned. How did I solve this problem then?
What we're really talking about here is something in the spirit of the axiom of choice. I use a more practical approach in that there must be a way to select an item from a set. That's what a value is. I find this to be a core requirement in any computing model. A set contains all choices. A value is ONE specific choice taken from within that set. So I simply defined the Enum type (as a secondary type) to define what are commonly known as primitive types. The Enum secondary type is for when you have a limited number of possible values. Ranges can also be used, but we'll talk about that later.
I had to define a built-in type for Integers. This is a requirement. Once you have tokens for each item in your set, you can build other primitive types by reusing these tokens. The crucial point about primitive types isn't their representation, but rather the ability to differentiate between each value. This is something that most people probably miss, though I fully admit that most people need not concern themselves with it.
With this setup, it means that if you create a modifier that is derived from a primitive type, then that modifier will contain a value. That's how you tell the difference. The value is still considered a derived type, but you can at least determine if it is also a value. Remember, by using MetaTypes, you can likewise use types as values. So this fits in perfectly with the type system and everything remains consistent. This is what took me a great deal of time and analysis to get right.
How do you define a primitive type? Simple. The Enum base type requires that you define the Choice set. Simply insert the items in this set that you want to use as choices for the value and you're done. You've just created your very own primitive type.
In order to be able to do all of this, you need something to work with. That's where the built-in type of Integer comes in. You can use a value of 0, name its modifier FALSE and insert it into the Choice set of the Boolean type. FALSE will have a base type of Boolean. And Boolean will have a base type of Integer. You also define 1 as TRUE and put this in the Choice set as well. You've just defined the Boolean type within the Project V type system. And although the IDE/compiler already defines the Boolean type, it is built on top of the type system exactly as described here.
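Here is a small Python sketch of that Boolean definition. The classes TypeItem and Modifier and the helpers is_value and base_chain are invented for illustration, and a Python list stands in for the Choice set. Only the names Integer, Boolean, FALSE, TRUE and Choice come from the text.

```python
class TypeItem:
    """A type: a name, an optional base type and a Choice set."""
    def __init__(self, name, base=None):
        self.name = name
        self.base = base
        self.choice = []     # the Choice set required by the Enum base type

class Modifier:
    """Like a type, but with exactly one base type and possibly a value."""
    def __init__(self, name, base, value=None):
        self.name = name
        self.base = base
        self.value = value

def is_value(item):
    """A modifier that carries a value counts as a value, not just a type."""
    return isinstance(item, Modifier) and item.value is not None

def base_chain(item):
    """List the names of the base types, nearest first."""
    names = []
    current = item.base
    while current is not None:
        names.append(current.name)
        current = current.base
    return names

INTEGER = TypeItem("Integer")                    # the built-in starting point
BOOLEAN = TypeItem("Boolean", base=INTEGER)
FALSE = Modifier("FALSE", base=BOOLEAN, value=0)
TRUE = Modifier("TRUE", base=BOOLEAN, value=1)
BOOLEAN.choice.extend([FALSE, TRUE])

print([m.name for m in BOOLEAN.choice])
print(base_chain(FALSE))
print(is_value(TRUE), is_value(BOOLEAN))
```

FALSE and TRUE are still derived from Boolean, but the stored value is what lets you tell them apart from a plain type.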
The specific details of the Integer type are built in, as I've said. It has three properties: Signed, Endian and Size. Size is itself an Integer. So if Size is an Integer, then it too has a Size property. To stop the recursion, the compiler assumes the size is 4 (bytes) and this is what is found in that location, itself using four bytes. You may override this property using modifiers to create Integers of different sizes. Signed is a Boolean as described above. This is why it's defined internally instead of being a normal type. The compiler uses that type though it doesn't define it. Endian is exactly like a Boolean but with different names (BigEndian and LittleEndian). I don't actually use this property right now. I'll activate it in another version.
That's how you define types and values. The big question is how are computations done? Computations are done within components. Components are themselves items just like any other type. They use the exact same internal data structure as types and modifiers. For compound components that have internal components, the Members set is used for just this purpose. Components have a few additional sets called Inputs and Outputs. That's basically it.
For a more detailed explanation, we need to get into how components are connected and what actually does the processing. I will include some built in components for basic operations. In a future revision, I will allow anyone to write code that uses a C interface and link this up to a component. I will also allow the use of languages within the IDE. So you can write traditional code and use that as a component. One more thing that will be available is a set of components that represents the opcodes of the machine. You'll still use the generic versions, but they will be mapped to the corresponding opcode components. That way, all you need to do is define a few opcode components for any machine you want to compile under.
The IDE will take care of linking outputs to inputs. Needless to say, this is done through the Connections set. I really like the fact that I can add new sets at any time for extra functionality. Sets have all the normal functionality that you would expect. Searching, indexing, union, intersection, erase, deletion (deletes item from ALL sets), equality, non ordered equality and a bunch of other stuff I can't remember just now.
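As a rough picture of how a compound component could hold its internals and wiring, here is a Python sketch. Everything in it is an assumption made for illustration: the Component class, the connect helper, the tuple layout of a connection, and the choice to store the Connections set on the enclosing compound component. Only the set names Inputs, Outputs, Members and Connections come from the text.

```python
class Component:
    """A component is an item with a few extra named sets."""
    def __init__(self, name, inputs=(), outputs=()):
        self.name = name
        self.sets = {
            "Inputs": list(inputs),
            "Outputs": list(outputs),
            "Members": [],       # internal components of a compound component
            "Connections": [],   # output-to-input links between members
        }

def connect(parent, source, output, destination, input_name):
    """Link one member's output to another member's input."""
    if output not in source.sets["Outputs"]:
        raise ValueError(f"{source.name} has no output {output!r}")
    if input_name not in destination.sets["Inputs"]:
        raise ValueError(f"{destination.name} has no input {input_name!r}")
    parent.sets["Connections"].append(
        (source.name, output, destination.name, input_name)
    )

add = Component("Add", inputs=["a", "b"], outputs=["sum"])
double = Component("Double", inputs=["x"], outputs=["result"])

compound = Component("AddThenDouble", inputs=["a", "b"], outputs=["result"])
compound.sets["Members"].extend([add, double])
connect(compound, add, "sum", double, "x")

# New sets can be added at any time for extra functionality.
compound.sets["Notes"] = []

print(compound.sets["Connections"])
```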
One last thing must be said for linking components together. When you do this, a type check is done (including a value check if a value is defined). If the destination input is not a primitive type (is a compound type), then a standard type check is done. IOW, they must be identical. No, base types do NOT count. The reason for this is simple. Types only last between connections. Since the destination is the only thing that will ever use the value, there need only be ONE representation. Hence, only ONE type is needed. You can use a conversion component if you need to change types. The IDE will have automatic tools you can use for this.
If the destination is a primitive type, then the source may be of the same type or be a value of that primitive type. If the source is a possible value of the destination primitive type, then the destination will be replaced with the value itself. This can eliminate a great deal of internal components from the destination component because certain values are no longer possible. Think of it as removing if statements when only one choice is possible. The component can sometimes be removed completely.
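The connection rules for compound and primitive destinations can be summed up in one small function. This Python sketch is hypothetical: the TypeItem and Value classes, the primitive flag and the check_connection function are made up to illustrate the rules, not taken from the Project V compiler.

```python
class TypeItem:
    def __init__(self, name, base=None, primitive=False):
        self.name = name
        self.base = base
        self.primitive = primitive

class Value:
    """A modifier holding one specific choice of a primitive type."""
    def __init__(self, name, base):
        self.name = name
        self.base = base

def check_connection(source, destination):
    """Return what the destination input becomes, or raise TypeError."""
    if isinstance(source, Value):
        if destination.primitive and source.base is destination:
            return source              # destination is replaced by the value
        raise TypeError(f"{source.name} is not a value of {destination.name}")
    if source is destination:
        return destination             # identical types connect as they are
    raise TypeError("types must be identical; base types do not count")

BOOLEAN = TypeItem("Boolean", primitive=True)
TRUE = Value("TRUE", base=BOOLEAN)
SHAPE = TypeItem("Shape")
CIRCLE = TypeItem("Circle", base=SHAPE)

print(check_connection(BOOLEAN, BOOLEAN).name)
print(check_connection(TRUE, BOOLEAN).name)
try:
    check_connection(CIRCLE, SHAPE)
except TypeError as error:
    print("rejected:", error)
```

Once a Boolean input is known to be TRUE, any internal component that only handles the FALSE case is dead weight, which is where the pruning described above would kick in.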
Keep in mind that almost everything here will be handled by the IDE. And this is a description of the fundamentals of the Project V type system. There's a LOT more I wish I could talk about like how you can write components that will update design time components. Take the addition component for example. It can start off with 2 free inputs. Once both are connected, a design time component can be activated and it will insert a third input for your use. This is possible on TOP of the existing type system. No special feature needed. The GUI simply makes available certain events that you can connect to. That's how most of the GUI will be handled.
I'm nearly done with the low-level code for the GUI. It's actually going well now. I didn't like the old version because it wasn't responsive enough for my liking. I want immediate results on screen. I can't stand a machine that is slower than I am, so I'm doing this the right way even if it does take a lot longer to produce a beta version. The compiler is done except for one small part that I need to update. Like I said, this month will be very productive because I'll be able to design components and networks soon. Once that happens, I'll be able to improve the GUI from within the GUI. Still, don't expect too much. My primary objective is having something that works no matter how simple or bare bones it ends up being.

