Skip navigation.

The whole Enchilada

A sequence of thought

Cheap equality

This post is about equality and what it could possibly mean for Enchilada.

One major goal of Enchilada is to achieve massive scaleability. Interestingly, this goal can be achieved with immutability - i.e. not to allow mutable data. Why is this immutable property so important? The key is referential transparency. With referential transparency, the relationship between value and reference will persist and never change in time.

With immutability, checking for equality comes pretty cheap - just compare references instead of huge complex nested values.

But this doesn't work out if two values are created from two different reductions. For instance, the expressions (1 3 +) and (2 2 +) reduce to the same answer (4), but produce different references. The result? Two distinct references can point to the same value. So that leaves us with two options:

1) the references are equal => the values are definitely equal.

2) the references are not equal => we don't know and may be have to compare the complex nested values (costly).

In case 1, that's all nice and well, but how are references created in the first place?

In C or Java, references are typically created by the memory manager: free heap space is allocated for new objects. In C, references are equal to memory locations. In Java, the memory addresses are hidden, but behind the scenes, Java references eventually map to memory locations nevertheless.

For one single Java virtual machine such references work pretty satisfactory, but what if we want to have global references, spanning over millions of heterogeneous machines? One solution would be to use a separate global reference service with the sole purpose to create unique IDs (kind of similar to a memory manager) for every request.

However, such service would be likely to choke on the number of requests coming from so many machines. Moreover, it would have a hard time coming up with unique references. For instance, a sequential number scheme (1, 2, 3, etc.) would kill concurrency because of the need of serialisation.

For maximum concurrency, it would be better if every virtual machine would create its own unique references (this also applies for ordinary UNIX processes). But how can we achieve this? By hashing the value to a reference locally. And with hashing we can further strengthen the relationship between reference and value.

But there is one snag that we have to deal with: Two different values could map to the same hash reference (collision). Interestingly, with hash references, the options turn the other way around:

1) The hashes are the same => there is a high probability that the complex values are the same but we may have to compare the values to be absolutely sure (expensive).

2) The hashed are not equal => then we are absolutely sure that the values are not equal too.

Because all values in Enchilada are final, their hashes are final too. With hashes we have a very cheap non-equality test. Conversely, we have an equality test with a very low probability of being false.

Enchilada v0.2

Enchilada v0.2 includes signed lists and some additional operators. The full list of (12) operators are as follows:

unary:

- (neg)

# (num)

^ (enl)

\ (cut)

! (dsl)

$ (hsh)

binary:

+ (cat)

* (mul)

% (zip)

: (swp)

" (dup)

? (isq)

Note that I've redefined (:) to swp and that I've added a seperate enlist (^). I also dropped the idea that there should be an absolute minumum of operators (but I don't want to exceed 16 operators if I can help it).

Another newcomer is hsh ($) operator. The hash operation can be used to calculate the canonical hash of an operand. You can use this to quickly test the equality between humongous lists (with a very low probability of being false).

The new num (#) operator is easy enough. It replaces every expression in a list by the empty expression.

[1 2 +, 5 4, 3] # => [', ', '] == 3

[1, 2, 3, 4] # => [', ', ', '] == 4

I'm also redefining the cut (\) operator so it behaves also like the 2log operation. It's related to the previous definition of cut, but then recursively applied:

[] \ => []

[1] \ => [[1]]

[1, 2] \ => [[1], [2]]

[1, 2, 3, 4, 5] \ => [[1, 2, 3], [4], [5]]

[1, 2, 3, 4, 5, 6] \ => [[1, 2, 3], [4, 5], [6]]

120 \ # => [60, 30, 15, 8, 4, 2, 1] # => 7

That's it: the whole Enchilada v0.2. Hope you like it.

Factorial again

Another stab at factorial.

This time I won't use iota but instead define a seperate factorial step - following the good old Forth style: break down programs into smaller ones.

Note the general theme: looping expressions N times should be replaced by the multiplication of expressions N times.

fac-step == enl swp enl [*] % [1 +] + * !

1 1 fac-step => 1 2

1 2 fac-step => 2 3

2 3 fac-step => 6 4

2 3 fac-step ==> [3] [2] [*] % [1 +] + * ! ==>

[3] [2 *] [1 +] + * ! =>

[3] [2 *, 1 +] * ! =>

[3 2 *, 3 1 +] ! ==> 6 4

fac == swp [1 1] [fac-step] * + ! drp

3 fac => [1 1] 3 [fac-step] * + ! drp =>

[1 1] [fac-step, fac-step, fac-step] + ! drp ==>

1 1 fac-step fac-step fac-step drp ==>

6 4 drp => 6

So if the number of iterations are known in advance, looping is easy in Enchilada. A general while construction could also be devised in Enchilada, but this would be not in spirit with the APL or K slogan: 'no stinking loops'.

Signed lists

Although I've thought about it for a long time, I'm still somewhat reluctant to allow signed numbers (i.e. lists) to Enchilada.

For signed lists to work, the semantics of every single operator has to be modified accordingly. Lists and numbers will be prefixed with _ (like in APL) to indicate negative sign.

Let's try it out for numbers first:

4 - => _4

_4 4 + => 0

_4 2 + => _2

_4 4 * => _16

_2 _2 * => 4

4 _2 % => _2

_2 _3 % => _2

This looks promising but what about complex lists?

[1, 2] - => _[1, 2]

[1, 2, 3] _1 + => [1, 2]

[1, 2, 3] _1 * => _[1, 2, 3]

[1, 2] _3 + => _1

_[1, 2, 3] 1 + => _[1, 2]

_[1, 2, 3] _1 * => [3, 2, 1]

_[1, 2, 3] ! => 3 2 1

_[1, 2, 3] _[4, 5] % => _[3 5, 2 4]

The reversal of the elements in the list may seem strange at first but works beautifully in all cases: for numbers (lists with empty expressions) everything is solid mathematics. For more complex lists you get some additional semantics (reversal).

Factorial without stench

Just to honor the 'no stinking loops' site of Stevan Apter and to see if Enchilada is up to the challenge.

We start of with the familiar APL construction iota (minus a, plus 1):

2 4 iot => 2 3 4 5 6

iot == [dup 1 +] * !

On top of iot we can define factorial (plus 1)

0 fac+1 => 1

1 fac+1 => 2

2 fac+1 => 6

3 fac+1 => 24

fac+1 == dup enl [1 swp iot] + swp [*] * + !

3 fac+1 => 3 3 enl [1 swp iot] + swp [*] * + ! =>

3 3 enl [1 swp iot] + swp [*] * + ! =>

3 [3] [1 swp iot] + swp [*] * + ! =>

3 [3, 1 swp iot] swp [*] * + ! =>

[3, 1 swp iot] 3 [*] * swp + ! =>

[3, 1 swp iot] [*, *, *] + ! =>

[3, 1 swp iot, *, *, *] ! =>

3 1 swp iot * * * => 24

Of course, we can transform fac+1 to fac with the following trivial (but smelly) code.

fac == fac+1 drp

So the challenge remains. What is the shortest Enchilada expression to calculate factorial?

Yet Another Joy?

You can argue if Enchilada brings something new to the other concatenative languages: Joy, Factor and Cat.

I think one of the big improvements over Joy is how programs/expressions are handled. This becomes particularly obvious when we look at Joy's map operator.

JOY: [1 2 + 3 4 *] [dup +] map => [2 4 + 6 8 +]

ENCHILADA: [1, 2, +, 3, 4, *] [dup +] * => [2, 4, + dup +, 6, 8, + dup *]

Note the regularity of Enchilada's semantics: there are no strange exceptions like in the Joy case. I believe Joy's quotations were always considered problematic (even by Manfred von Thun, the maker of Joy) and I think the map operator shows why.

Cat is very similar to Joy and is likely to suffer from the same confusion between quotations and programs. That could change in the future however, because its creator, Christopher Diggins, is still finalizing Cat's exact semantics.

Another improvement over Joy, Factor and Cat is that Enchilada is completely typeless. There is no special number, string or character type. In that respect, Enchilada is closer to Forth and (pure) Lisp then anything else. I'm aware that being typeless might not please the language purists out there, but I like my language to be simple (but not simpler). If there is need for compilation (or theorem proving) you can build such compiler in Enchilada without much effort.

Factor is a very interesting language, especially because it is reaching some critical mass. However, Factor is not purely functional like Enchilada. I believe that referential transparency was never one of Factor's major design goals, but I could be wrong here.

I'm eager to learn some more Factor, because I want to know how it is like to build big concatenative programs (although Forth and Postscript could tell me that too).

A sequence-based language

I would like to coin Enchilada as a sequence-based language, rather then a stack-based language. Because, at its core, Enchilada is build on immutable sequences.

To elaborate on the inner workings, consider to following expression with 1 billion elements (including +):

a == 1 2 3 ... 500.000.000 500.000.001 + ... 999.999.999

There is only one redex in the middle. A stack based language would have to go through 500.000.002 steps to get to it. In Enchilada, the sequence is split into three parts: the redex, the prefix and the postfix (of the redex).

split(a,499.000.000) => (prefix,b)

split(b,502.000.000) => (redex,postfix)

reduce(redex) => reduction

cat(prefix, reduction, postfix) => result

This is roughly Enchilada's reduction algorithm. What remains, of course, is how one can quickly find a redex in such a huge expression.

Concurrent lists

A hidden feature of Enchilada is its ability to concurrently reduce expressions. When you combine zip, cut and mul, the number of concurrent reductions can scale to dramatic sizes.

Consider the following code:

cam == \ % [*] *

[1, 2, 3, 4, 5, 6, 7, 8] cam => [1, 2, 3, 4] [5, 6, 7, 8] % [*] * =>

[1 5, 2 6, 3 7, 4 8] [*] * => [1 5 *, 2 6 *, 3 7 *, 4 8 *]

All 4 expressions in the final list can be reduced simultaneously and concurrently. Note that if we continuously apply cam n times, we can concurrently multiply 2^n numbers!

Twice the syntax

I'm worried that Enchilada's syntax is to awkward.

May be I should drop the excessive parenthesis for something more parseable. But then all the re-writing rules become awkward.

Then I thought of the idea to use two kinds of syntax, one canonical and one... well.... user-friendly (due to Sjoerd Visscher)

The user-friendly syntax drops the round parenthesis, includes numbers and static 'macro' definitions.

The canonical syntax can be transformed to user-friendly and vice versa. However, on the background, reduction rules only apply to the canonical form.

So the following Enchilada expressions:

([()()][()()()]:!)

([([()][()()])([()][()])][(:!)]*)

Would become:

swp := :!

2 3 swp => 3 2

[1 2, 1 1] [swp] * => [2 1, 1 1]

Empty expressions () are replaced by '

4 [1, 2] + == ([()()()()][([()])([()()])]+) => ([()()()()([()])([()()])]) == [', ', ', ', 1, 2]

or

[', ', ', '] [['], [', ']] + => [', ', ', ', ['], [', ']]

Adding zipper?

On Lambda the Ultimate, Sjoerd Visscher remarked that the zip operator was missing from Enchilada, or that it would be difficult to implement based on top of +, *, !, : and \.
I think Sjoerd is right but for more then one reason: you can use the zip operator (%) to chop lists into a smaller pieces, or use it to implement the min operator.

[(1) (2)] [(3) (4)] % => [(1 3) (2 4)]
[(1) (2)] [(3)] % => [(1 3)]
[(1) (2) (3) (4)] 2 % => [(1) (2) (3) (4)] [() ()] % => [(1) (2)]
9 4 % => 4
January 2010
S M T W T F S
December 2009February 2010
1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30