Lisp vs. Project V
Tuesday, 18. July 2006, 18:16:18
While it's true that Project V is vapourware right now, I thought it would make more sense to compare it with other languages out there in order to describe some its features. I will be using both C++ and Lisp, but Lisp will be the main focus because it has some incorrectly defined features.Lisp and other functional languages are part of what I call the establishment. The establishment is comprised of language researchers and educational institutions that are biased towards languages that have a mathematical connotation. The function being the primary so called abstraction. This is a notion that is incorrectly described 99% of the time. The establishment has seen it fit to use the notion of a function even in situations where they do not apply. It's a rather strange situation where these same people would rather see new ideas killed. The way they do this is two-fold. On the one hand, they ridicule the idea as being nonsense with lawyer like rhetoric. This is done by using one item from the idea and another item that seems to be related on the surface, but where this second item can be discredited. By using these types of arguments, new ideas are killed as soon as they surface. On the other hand, they can also attack new ideas by claiming that the idea is the same as another feature found in their favourite language. The same technique is used as before, but instead of discrediting the secondary item, they use it to explain how this feature is already included in their favourite language.
As to exactly why they would discredit new ideas is quite perplexing. I will take a guess and say that it is related to the same reasons that astrophysicists try to discredit any alternative theories. In astrophysics, funding is very important. It takes a lot of convincing to get a research grant. So if someone else comes along with an alternative theory that's better than what you're working on, then it will make your project look like a colossal waste of money. At first, people would only watch out if it conflicted with their own projects. The larger projects with more funding would usually be able to kill off these competing alternate theories. Because they know the editors of the top science journals, alternate and competing theories would never get published. Nowadays, because of lessons learned that one theory can spill into other (at first) seemingly unrelated projects, all alternative theories are killed right away. Now imagine for a moment if software development was on the wrong track for the past 50 years? So there's a very real vested interest in staying in the dark ages of computing.
I make no claims that Project V is the ultimate solution, but I am saying or hoping it will be a step in the right direction. And while I am biased towards Project V as that is my own software development environment, I have no bias when it comes to all other languages. I think they are all lacking. There may be variations on the level of "lackness", but overall they each have misguided notions. Also note that I am no Lisp expert by any means. I used it some years ago in University and found it distasteful. Not for any specific feature, but for the sheer ugliness of the syntax. Its name being the definition of a dysfunction related to language didn't help either. To help refresh my memory, I looked at the videos found here. The videos are very old from 1986, but I thought they did a good job considering how dry the material is. If the language has changed drastically since then, well, don't expect an update.
I suppose we could start with stating how to call a function in Lisp. In Lisp, anything immediately after a round bracket is a function call. The following items up to the closing round brackets are parameters. Ex: "(add 3 2)" This would call a function called "add" with parameters 3 and 2. You can use symbols as function names, so "(+ 3 2)" is acceptable as well where the function name is '+'. In Lisp, you can use functions as parameters and return values. So if you call a function and it returns another function, then how do you execute this returned function? Well, you add another set of round brackets. Suppose we have a function called "addfn" that returns a function. And this returned function, when called, will return the actual result. If we do "(addfn 3 2)", this will return a function. Remember that to execute a function, it needs to be placed immediately after an opening round bracket? So "((addfn 3 2))" will return the expected 5. The first set of brackets returns a function and the second set actually executes this returned function.
In Project V, there are no functions. There is something much more powerful. It can be used as a function if that is what you want, but there is no need to do this. This entity is called a component. To use it, you simply link data paths up to its inputs. The outputs need not go back to the same location where the input came from like a function does. Each output may go to whatever other, and possibly different, components you decide.
In Lisp, defining a function can be done in two ways. The easiest is by using the keyword "define" or "defun" depending on what you're using. The prototype is "(defun {funcname} ({args}) {body})". Here's a simple function.
(defun square (x) (* x x))
// C++
int square(int x)
{
return x * x;
}
Note that Lisp doesn't use type definitions. Well, from what I can tell, you can actually use them if you wish. But for the most part, it's the algorithm that decides how to use the data and not the type definition. If an incompatibility occurs, then you will get an error. More on this later.
Now I have to discuss the truth behind functions. Functions have a duality that no one talks about. There is the specification aspect and the calling trace aspect. Lisp makes a very odd use of these when defining functions, especially when one is returned. In Lisp, a function is hierarchical in both these aspects. You can define functions inside other functions. And functions can call other functions. This means both constitute tree structures. If you were to follow all the functions definitions, you would traverse a tree. The same goes for traversing function calls.
C++, by contrast has linear function definitions. You cannot have functions defined inside other functions. In C++ and most other imperative languages, the stack is destroyed when a function returns. This doesn't necessarily happen in Lisp. In many cases, the stack is actually a persistent tree structure. When a function returns, another stack trace is created alongside the original one. So although C++ does have a tree like (or recursive) nature to its call trace, its stack is linear since each context is destroyed upon exit. This doesn't happen with Lisp. Lisp keeps the context of each function even after it returns in certain cases. Lisp uses this hack as its entire foundation. It is quite clever, but it's a bastardisation of every notion of proper software development.
Because a context isn't destroyed, then maybe we can use it for storing things. Contexts only exist within functions, but that's not a problem since we can return functions. So let's write a function that returns another function and still keeps the original arguments in its context.
(defun pair (a b) (defun selector () ()) selector)
If we call "(pair 2 3)", this will return a function "selector" that has access to both arguments since its context is saved. So instead of using actual data structures, we use the context to store data. We've just created a pair and a way to pass it around. But we have no way to retrieve the values. Well, "selector" is a function (which has no body right now), so why don't we define this function to retrieve the values. But there are two values. We would need a function where we can specify which item we want to retrieve. That's easily done. We'll just specify 'a for the first item and 'b for the second. Singly quoted items at the beginning like that are literals.
(defun pair (a b)
(defun selector (x)
(cond ((eq? x 'a) a)
((eq? x 'b) b)
(else error)))
selector)
Note how "selector" on the last line has no opening bracket in front of it? This means it won't get evaluated. The function itself of "selector" will be returned (from "pair"). This "selector" function now contains a way to retrieve either the first item or the second item from the pair. That's what the "cond" part does. It just checks if you passed 'a or 'b and returns the appropriate value.
((pair 2 3) 'b) >> This prints 3
"(pair 2 3)" returns a function that has access to both arguments of 2 and 3. This reduces to "(selector 'b)". Since "selector" still has access to its original context, it still has access to a which is 2 and b which is 3 and will return 3 because the literal 'b was passed to it. But we don't need to put only numbers as arguments to the pair. We can use other pairs. This means that "(pair 2 (pair 3 4))" would link 3 numbers together. We can continue doing this for any length list. Usually, the last item is nil to indicate the end of the list. Making a pair in Lisp is called "cons", retrieving the first item is called "car" and the rest of the list is called "cdr".
(defun cons (a b)
(defun selector (x)
(cond ((eq? x 'a) a)
((eq? x 'b) b)
(else error)))
selector)
(defun car (list)
(list 'a))
(defun cdr (list)
(list 'b))
(cdr (cons 2 3))
>> This prints 3
This is equivalent to what we had above, but now we have a function to retrieve the first and second values. From this, all other data structures are built. Internally, it's not implemented this way (I hope), but with only functions, Lisp claims to be able to define everything as functions, even data. But this is a lie. The context is there. What gets passed around is a pointer to the context and to the function. So saying that everything is a function is false. Everything is a context, maybe! And this context is defined by the argument list and the function entry point. Sorry, there's no function here, only data. It's been incorrectly labelled is all.
Here's an equivalent C++ example where T is a variant or some generic type. Other modifications are inner functions and the ability for these inner functions to have access to parent scopes. Oh yeah, there's string equality too. Does C++ have anything built in?
T (*f)(string a) cons(T A, T B)
{
T selector(string option)
{
if (option=="a") return A;
if (option=="b") return B;
return "error";
}
return selector;
}
T car(T (*X)(string option))
{
return X("a");
}
T cdr(T (*X)(string option))
{
return X("b");
}
void main()
{
// cons(2,3) returns the "selector" function pointer.
// cdr(selector) converts to selector(“b”)
// selector("b") returns B from the parent scope which is 3
cout << cdr(cons(2,3));
}
>> Prints 3 in this virtual C++ modified language.
Now look closely at the above examples. Notice how we have THREE tree structures. The first is the nested function definition of "selector" inside of "pair" (or "cons"). We can have multiple functions defined inside a function, so this is indeed a tree. The next tree structure is the context. Because internal functions can have access to their parent's context, it must be saved. But now notice another curiosity. The calling mechanism also works like a tree but does not match one to one with the context. For example, when you call "selector", the caller is the main program, yet the context is that of the "pair" (or "cons") function. So we have three intertwined trees. What’s worse is that most of this is hidden from view. The only directly visible tree is the definition of the nested functions. Contrast this with C++ that has linear definitions and linear context with a recursive calling mechanism. And we wonder why Lisp isn't as popular!
Before I get into comparing all this with Project V, we need to look at how computations are done in Lisp. What controls the execution is having an opening round bracket directly in front of a function name. The rest is substitution. But no data gets transformed without a function call. This may sound like a redundant statement since everything is supposedly a function anyhow. So what do I mean by it? Basically, that Lisp and every other language is operation directed. We state each operation that needs to be done and ordered sequences are specified with nesting.
Now we can talk about how Project V executes. You do not chain operations together so that they form a list (or tree). Instead, it is the data that decides what operations should be done on it. It's a complete reversal of what all current languages do. Many people cannot differentiate between these two concepts. Indeed, many people have never even thought of this concept. Instead of telling the machine what to do, you tell the data how it should be transformed. This way, the machine can accomplish the transformation at any time and any way it chooses. The machine is irrelevant. Also, data does not need to return to the caller since there is no caller. Neither does it need to return to the previous component. In all current languages, results must always return to the calling entity. In Project V, this restriction is no longer there. You can have multiple outputs, each doing their own thing once they exit a component. Not only that, but since it's the data that decides when components are activated, this means we can have many data items going to the same component one after the other or even in parallel if possible. While Lisp allows the execution of parallel sub functions in the calling mechanism if all sub functions have no side effects, all calls must eventually return to a single point. With Project V, because data does not need to return to the previous component, every component has the possibility of executing at the same time. There is also no concept of an entry point or a "main" function since we are not concerned with giving the machine a global list of operations. Instead, we create a network of paths connecting our data to components and the computer can decide how best to execute these components and when.
Note that all this implies further advantages. No longer does a function have to fetch data manually and then operate on it. The fetching and storing part is automatic and controlled by the data itself. So if you have a list of items to process, you don't need to have a loop. The items will go to the component of their own free will. Every time an item enters the component, it will activate and produce a result. All of a sudden, mapping instructions over data, recursion and looping are irrelevant.
Now let's look at the three trees we spoke of earlier and see how it applies to Project V. First, let's look at component definitions. You can have nested components just like Lisp has nested functions. So this part is similar. The second tree is the context. In Project V, a component can only have access to its inputs. So there is no other context. It's not even linear. It's singular at most. The last part is the calling mechanism. There is no calling mechanism. Components are activated when needed as data arrives on its input channels. So this tree is absent too.
There is a lot of simplification when it comes to the computation or transformation of data. You no longer have to think about context or the calling mechanism. Where Project V gains in complexity is in the design aspect, and rightfully so. Although component definitions are hierarchical, the layout is more versatile. The network is a free flowing graph. You can connect components directly with a data path or you can define components one inside the other. Functional languages can be represented in Project V with a top level component and all other components inside of it to ever deeper levels, but where the component itself must handle the flow of data. This last part means that this program is made for one specific set of data. If you decide you want to use more (or less) data on a second run, you will have to modify the looping mechanism or alter the function parameters when there is no need to do this with Project V. So Project V is a superset of functional languages as it doesn't have the return to sender limitation as well as other aspects related to context and execution path.
So Project V frees us from the context, the calling mechanism, looping, data access, the return to sender restriction and the notion of an entry point. These are all machine specific issues. We don't need to worry about these things anymore. We can concentrate fully on our data transformations and none at all about the machine.
If one can't tell the machine what to do, then exactly how do we select one algorithm versus another or how do we process something X number of times? Well, we can't tell the machine to go to a certain point in the code, but the data can select different locations to go. So you can have data loops if you wish although they will be a lot less common. Conditional components likewise redirect data. Control is applied to data instead of the machine specific execution point. Not many people realise this, but the execution point is a hardware specific concept to get around current technological limitations. The ultimate computer is one where all operations can execute at once. An execution point no longer makes sense in this case. The problem today is that we are in a transition period between a sequential and an infinitely parallel computer. We need to move away from this notion of an execution point. Computers transform data. That's what we should be concentrating on.
We haven't gone into mutable state and I don't want to make this longer than it is. So I'll just say that Lisp can be object oriented and have state. Project V can also have state, but is not object oriented. The concept of object orientation is that it defines a bunch of mini machines. I've just explained why we don't want machines or be command oriented. Instead, we want to be data oriented, so OOP is not suitable for Project V.
The last point is about compiling the language within itself. Go to any lecture about a new language and you're almost guaranteed someone will ask this question. Any Lisp programmer is surely made aware of the 80 lines of code that is the Lisp language itself. Some people think this is something akin to one of the great wonders of the world. Let me tell you it's nothing more than smoke and mirrors. Perl has eval. C++ can compile and run itself. Javascript even has eval. So what's the big fuss? I don't know. They took some parts that could be built in Lisp and put that in these 80 lines of code I spoke about. The rest still uses primitives. So what? Let's take an analogy (which I'm terrible at). Anyone who has visited historical settlements will find that the blacksmith is one of the more popular attractions. The mill and printing press are other notable sites. But at the blacksmith, he can build other tools as well as the very ones he is using. So he can build pliers and a hammer with a pair of pliers and a hammer (and other metals and a fire). Is this ingenious? Not especially, although it's cool to see the guy work. So why would anyone think that a programming language that can be written in itself is great? The real challenge is how the first hammer was built. Show me how the first Lisp interpreter was built and then you can tell me if it only took 80 lines of code. This 80 lines is a farce. It's like an old scenario I was told about (forget where it's from). If you take away your arm and replace with a mechanical one that provides the exact same functions, are you still human? Now replace the other arm. Still human? Legs, torso? Still human? Finally the head. 100% cyborg. Are you still human? You're still you. The functionality is the same, but the implementation is different. The issue at hand is what are the primitives that make us human? We don't know. The same applies to programming languages. Those 80 lines contain primitives as well as defining parts of the language for reserved words. So where is the boundary? What are the real primitives that define the language? That's the real question. What language features can I not remove because they cannot be implemented with what remains?
Changing gears now, something that I was worried about for Project V was screen size. You cannot display many components on current screens. But I see this as a current limitation. I cannot believe that in the future, we won't have MUCH more space available. Many developers already have more than one screen. Eventually, larger screens will be made available. I do realise it's a nuisance for now, but I'll live with it. The advantages more than offset this nuisance. The biggest one is being able to connect previously built components. Prior to this, we could not do so. We were stuck with machine dependent code. If one person implemented something one way and you did it another way, you would have to change your code. I want to make it so that you never have to rewrite anything. In theory, you'd think this would be possible in conventional languages, but the truth is that there are a million and one ways to connect two algorithms together. Who calls what in what order and by whom? And the caller has to fetch and set the data which may not be how it was meant to be accessed. We're trying to put together many different machine specific code together. It's no wonder all this code doesn't match. Rewriting the code from scratch resolves this problem for that one time. It causes other problems in the long run. That's why it's often better to have a common interface style even if it may not be the best way to exchange data.
I didn't cover everything, but there's just too much. I hope this gave some insights to how I intend to make certain things different and how it compares to other languages. I never intended on making something revolutionary and in fact, I don't think Project V is overly unique. I just think of it as taking away restrictions. In a way, I'm taking things away from other languages that aren't necessary and making them more general. My main concern is taking away the machine. The job of executing your software belongs to the VERY low level implementations of the components, the compiler and the Operating System. This way, each platform can use platform specific components for use in the final application. This means you can use whatever resources are available to that platform including multiple processors. And I think the goals of taking away execution control (and leaving that to the platform itself) as well as achieving portability by diversity are a very real possibility. Maybe we'll all have theatre sized screens too one day.


Vorlath # 20. July 2006, 06:06
Anonymous # 20. July 2006, 20:27
You should look up the word "continuation" in CS literature sometime, you might be surprised.
Also, what you describe as "Lisp" isn't what Lispers describe as Lisp. What you describe is more of a "proto-Lisp" or perhaps a "Scheme-like language."
Common Lisp is a practical language, oriented towards real-world general purpose programming, and the design decisions in it reflect that.
You seem to be confused with your words. I do not know whether it's because you are trying to simplify for the audience, or because you just don't know. What you call "contexts" are called "closures" by everyone else.
The notion of "functions" in Lisp and languages which descend from it is derived from the notion of "functions" in Church's paper on the Lambda Calculus (the Lambda is used because his original notation could not be printed at the time). The Lambda Calculus is a Turing-equivalent model of computation which uses only two concepts: functions, and function application. Functions can have a parameter. When a function is applied, the argument is directly substituted for the parameter. This is called "normal" evaluation.
What most programming languages use today is called "stri ct" evaluation, or call-by-value, in which the arguments are evaluated prior to substitution.
It turns out that the substitution semantics of "normal" evaluation results in this so-called "context" or "closure" of functions, only it isn't quite so glaring in the absence of side-effects. Naturally, substitution semantics are not very handy in real-world computational models. Instead of substituting code for code, real compilers build an environment in which parameters can be bound to values. Common Lisp and other languages implement the notion of a "closure" as said environment explicitly in order to preserve the semantics of the "function." However, in the presence of imperative code and side-effects, such as in Common Lisp, there is an added element: the possible modification of these bindings. Some languages, like Python, simply do not permit this modification. Others, like Common Lisp, Scheme, and Standard ML (when using mutable values) do.
Oh, and you are terribly out of date. ANSI Common Lisp was standardized in 1994 and has a sizeable community working with it and on new libraries. Scheme is Scheme, but the core itself is small, and there is an active community defining new "standards" for additional functionality (called SRFIs).
Anonymous # 20. July 2006, 20:29
Practical Common Lisp -- http://gigamonkeys.com/book/
CLiki, a CL resource wiki -- http://www.cliki.net/
Vorlath # 21. July 2006, 02:35
About what true Lisp is, the only thing I've described in my article were functions and very primitive data handling through the use of context. By no means have I shown the entire language since, as you mention, there are so many variations. I had to pick one. So I picked what features seemed to be common amongst all the variations. But as I've mentioned in the article, the boundary between core features and added functionality is blurred unnecessarilly by none other than the people who use and promote it.
BTW, closures consist of a function pointer (in whatever representation) and to a chain of context for any previously bound variables. By "previously", I mean previously in the source and not in the *current* execution point, but that of the prior excution point when the closure (and context inside of it) was created. So no, a context is not a closure. I actually made doubly sure I was using the right terms. There are three trees going on at once. One is the function definiton, the second is the context for bound variables and the third is the execution point. All three decide when a context is created or used and what is placed in it. A closure is a snapshot of the context tree and the definition tree at a precise execution point. All three have to come together.
This statement is false: "The Lambda Calculus is a Turing-equivalent model of computation which uses only two concepts: functions, and function application." That's the big lie I spoke about. It uses context trees as well. Just because you don't see it at design time doesn't mean it isn't there. Instead of only worrying about the call trace which isn't obvious, you now have another hidden aspect to your programming which you must keep notice with the context. And about side-effects, I explicitely mentioned that I didn't have "space" to describe mutable state. But this confuses the issue even more.
May I ask why I am out of date? I gave you the benefit of the doubt and checked out your links and found nothing that would indicate that this is so.
It's unfortunate that there was nothing noteworthy in your comment.
As a last note, if you REALLY only need functions and application of those functions for your model of computation, then make me a closure using only ONE function. First order functions, right? Let's see one. I've never seen it yet.
Now allow me to recommend some reading for you: "Currying".
Anonymous # 23. July 2006, 16:05
http://lambda-the-ultimate.org/node/1632
Vorlath # 23. July 2006, 23:27
Anonymous # 25. July 2006, 06:55
You're hilarious and pathetic. Don't give up, we all enjoy watching a train wreck.
And yeah, continuations are like goto, since wikipedia says so.
(hahahahahahaha... *gasp* hahahahahaha)
Vorlath # 25. July 2006, 15:18
I'll risk derailing any day if it means I can move ahead. In the end, I will have done it and people like you will only be remembered as obstacles.
Anonymous # 26. July 2006, 09:44
I am perplexed by what kind of person it is that produces posts like the last one by "Anonymous". Perhaps I am naive or have an idealistic perception of academia, but I can hardly imagine a professor or a researcher responding in this way. On the other hand, the poster has some grasp of Lisp, but clearly not very sophisticated. A student of computer science, perhaps? Ghastly. :(
Vorlath # 27. July 2006, 02:50
function -> retains context that may be changed later (let and such).
continuation -> created duplicate context that cannot be changed until executed.
Would this be a fair assessment? Not sure it makes any difference though. Seems like adding more state somehow.
Anonymous # 6. August 2006, 06:43
I would recommend you take a look at a language called Oz (mozart-oz.org).
It uses the concept of extremely lightweight threads and execution scheduling by
detection of when all input arguments have a data value ready (i.e. are bound),
and subsequently splitting processing into as many parallel threads as logically
possible, with the original thread either blocking to wait for new value bindings it
needs, or just being finished, as in continuations.
Oz variables, I postulate, have some relationship to the kind of self managing
data items that you talk about.
I think it admirable that you think and work boldly with broad sweeps in these
areas, and wish you good luck with the project, Being broadly aware of what else
is out there can also help you.
I'd be interested in your opinions on Oz, and how it compares to your ideas,
if you have time to look at it.
Vorlath # 6. August 2006, 10:22
Anonymous # 2. April 2007, 13:38
I'd like to hear when you post on this again, do you have a mailing list?
Great post.
Vorlath # 3. April 2007, 08:27