Why Recursion Sucks
Wednesday, March 7, 2007 3:35:40 AM
Here's a simple example. The syntax may be alien, and yes, it looks busy though not as bad as Java. This is a C++ version that only takes one event handler. It doesn't even have the problems with lists of event handlers such as with Java or Qt. And the closures here are simply method pointers that remember what object is associated with it. It's basically a dual pointer. One to the object and one to the method.
class Region
{
protected:
Rect frame; // rectangular boundary for our region.
CRegion region; // Internals for region segments.
public:
typedef void (__closure *RegionEvent)(Region *source); // Event type
protected:
RegionEvent XEvent; // listener
public:
__property RegionEvent Event = {read = XEvent, write = XEvent};
__property Rect Frame = {read = frame, write = frame};
Region();
void Clear(); // Clear clipping region.
void Fill(); // Fill clipping region.
};
Region::Region() : XEvent(NULL)
{
// Default to maximum boundary.
frame.Fill();
}
void Region::Clear()
{
region.Clear();
if (XEvent) XEvent(this);
}
void Region::Fill()
{
Clear();
region.Add(frame);
if (XEvent) XEvent(this);
}
Ok, there's LOTS wrong with this code. It's rather simple as far as functionality is concerned. You can have a rectangular boundary called 'frame' where everything outside of it gets clipped or removed. The only other functionality I did provide is that you can Clear or Fill the clipping area. Oh, and you can get notified with an event handler if the clipping region is updated. Simple enough? All right. And yes, those ugly and multiple protected and public sections are because of dependencies. They need to show up in that order. If someone has a better way, please let me know.
Other than ugly syntax, what's wrong with it? Well first off, it doesn't check if the region is actually modified. It just calls the event handler if you call any method that could update the region. Secondly, when you update the boundary frame, the clipping area is not updated to be restricted to this new area. It all depends on what you want to do with the implementation. You could clip only when you actually use the region on an image for example.
But in order to be notified by the frame (Rect object) of any updates, you would need to register an event (which isn't currently supported) or modify it to return a boolean. We have an event in this example, so let's look at that. If you register an event and you call Fill(), your event will get called twice. Hmmm... How to solve this? Well, you have to separate the public interface and events from the functionality. So you need to duplicate all your functions. The public ones call the private (or protected) ones that do the actual work and when they return, you can call the event. This sucks beyond anything that has ever sucked before.
Truth be told, it's not a bad practice in the OOP world though it still sucks hard. The key is that public methods should not call other public methods from the same object. It avoids recursion and all sorts of other problems. My irritation with this is that there's no real need to do this other than the fact that we are dealing with recursion. If the execution point wasn't there, this problem would be gone. There'd be no need for any of this.
Here's a new version of the code that works, but still doesn't check if the region is actually updated. It just triggers the event if there's a chance it was updated.
class Region
{
protected:
Rect frame; // rectangular boundary for our region.
CRegion region; // Internals for region segments.
public:
typedef void (__closure *RegionEvent)(Region *source);
protected:
RegionEvent XEvent;
// Internal functionality
void SetFrame(const Rect &r);
void iClear();
void iFill();
public:
__property RegionEvent Event = {read = XEvent, write = XEvent};
__property const Rect Frame = {read = frame, write = SetFrame};
Region();
void Clear(); // Clear clipping region.
void Fill(); // Fill clipping region.
};
Region::Region() : XEvent(NULL)
{
// Default to maximum boundary.
frame.Fill();
}
void SetFrame(const Rect &r)
{
frame = r;
// Cut out anything outside our boundary.
region.Clip(frame);
}
void Region::iClear()
{
region.Clear();
}
void Region::Clear()
{
iClear();
// Call event handler
if (XEvent) XEvent(this);
}
void Region::iFill()
{
iClear();
region.Add(frame);
}
void Region::Fill()
{
iFill();
if (XEvent) XEvent(this);
}
I haven't checked this, so it may have some syntax errors, but this would work. We have internal methods that do all the work without triggering events and public methods that are only allowed to call these. Internal methods likewise cannot call public methods. So everything is hierarchical as it should be with anything that uses a call tree. This is why this last example works so much better. But what a waste of space and calls just to accommodate a certain style of programming. I'm working around language side-effects that I don't like.
I also declared the property for the boundary as const. This is a nasty hack IMO. What it does is that if you retrieve the boundary (with "myregion.Frame.Left = 10;"), you can't change it because the compiler will generate an error. I did this because I want to know to know when you update it. So you have to set it all at once using "myregion.Frame = Region(0,0,10,10);" or something like that. This causes the SetFrame() method to be called and from here I can trigger the event. Otherwise, I'd have to modify the Rect object to support event handling and have the Region object register itself for that event which would then trigger another event. Personally, I think a rect object should be fast. It shouldn't be checking if it should call an event.
Here's what's actually going on with events. Events are an inversion of control. Your software at initialisation seems like it's in control by requesting resources, creating windows and all sorts of other things. But when events come in, you're no longer in control. The object that's calling your event is in control. So now you're just a piece of code that fulfills some action based on an external event. What if you want to trigger an event from here? Hello! You get into yet another recursion problem. Generating events should be allowed from anywhere.
And this is what I don't like about exceptions either. An exception is an event. Well, it should be. But we handle this differently. So we have all sorts of conflicting language tools. In any language with an execution point, whether you're using OOP, FP or imperative, everything must eventually come back up to the top. With events, we're fighting this. There should be a mechanism to return from the object completely and then call the event. Otherwise, we're calling from the bottom up. This is dangerous no matter what way you look at it.
The above example still has problems. It solves the immediate, single use problems. But what if you have two objects that need to know when each other is updated? Then you get the same problems. You can't even publish the internal methods because that will lead to missed events (and possibly misuse).
Events are also strange in that they must be coordinated between each call. If you're drawing a line, you need to keep track of the initial position as well as the last position. This way, you can erase the old line and draw the new one with the updated position. These events would likely call some other methods to draw these lines (or whatever else you're doing). Then these can trigger more events.
That's why I've always said that queuing requests was always best. Then you give each system a turn to process its requests. What normally happens though is that people think that calling another object is simply passing control to that object. They think this other object is still at the same level as the first, but this is no longer so. It's one level below. It's restricted in what it can do because if it needs functionality in the caller either directly or indirectly, you'll end up in a world of hurt. Unfortunately, there's no way to tell what objects make use of what other objects because of this thing called encapsulation or interface driven programming. Queuing and taking turns solves this problem entirely. Again, this is working around the language.
The problems just continue. If you have lots of internal methods, you can't inline them because some of these can be rather large and call other internal methods. My example is simple, but doing region intersections and unions are a lot more involved. You can't inline this stuff. Or at least you shouldn't.
But out of all these problems, I care about none of them. I've dealt with all of them before. There's nothing new here. But it did take me quite a long time to figure out that round robin or taking turns does work so much better. If I hadn't done any game programming, I doubt I would have stumbled upon it. In game programming, you have something called a game loop. This loop gives each system a turn to do some processing. Usually, it grabs inputs. Then it updates the environment whether it be 3D objects, collisions or simply 2D sprites. Then it renders the next frame. Finally, it displays it. There's an order to what can be updated and when. In game programming, it's out of necessity. Games don't just stop. They're usually always doing something. So a game loop is natural. Too bad you can only have one infinite loop in your software. This means taking turns is not a viable option in all cases.
Then what is actually troubling me? It's the event triggering system itself. If I don't use an event, the code still checks for it every single time I use that particular object. Multiply this throughout your entire software and it gets rather ridiculous. This is where Java may be better. Maybe it can determine that there's nothing there and generate better code. I don't know. What I do know is that in C++, you can't do this. What I'm aiming to do in Project V is that if you don't use an event, then no code is generated for that part. You can use the same component in multiple places. So each use of it can and will be different. If at some point an event is used, then a new version of that component will be generated.
Here's the nice thing about this. Not only does it remove code when you're not using an event. But when you actually ARE using one, it doesn't need to check. It can just call the event directly because it already knows it's there. It can also insert the event pointer directly into the generated code. No more member to store the pointer. The value is injected directly into the code. This will at least double the speed of your code. Double indirection is still really slow, especially because of cache misses. If it's in the code, then no cache misses.
What I like is that you can monitor how often the event handler is changed. If it changes a lot, then it can leave the original checking code in. If there are extra idle processors, you can have multiple nodes compiling more efficient versions in the background until they are ready to be used. There's no limit to what you can do with idle processors.
Note that all the earlier problems also go away. There's no such thing as a call tree in Project V. So you can send events wherever you want without any of the nastiness discussed so far. No more recursion preventive programming. No more worrying about synchronising multiple event calls. No more stack overflows. The list goes on.
Then again, I may be the only one on the planet that has these problems. That IBM paper may just be a figment of my imagination. Frankly, I don't see this kind of thing spoken about very often. Events cause recursion. And not the clever kind you saw in some Computer Science class that used sentinel values. No, this is the kind that hits you in the face and blows your entire foot off. If you don't get memory corruption, then you'll eat up all the memory on the system and eventually run out of swap space and then you'll know how annoying it can be trying to kill something when there are no resources left to bring up the kill dialog. And if it does come up, it's so slow because it now has to deallocate everything and clean up the mess that's now on your hard drive.
This stuff is WAY more common than anyone lets on. I've rarely seen an application that doesn't do this at some point. Java apps seem to be especially prone. I'd rather it crash and hand me my computer back. I never understood why people were happy when it was said that Java will keep your app running. Ok, my foot isn't enough. Let's finish the job.
Recursion is just nasty. Yes, it looks nice and pretty in source code if it's your intention to use it. But on the machine, it's pure evil. Recursion is nothing more than implied stack management. The reason iterative methods look messy is because you have to be explicit and handle your own stack. However, you can now have limits on resource usage. You can't do that with recursion. Well, you could. But it would use up more memory and look just as ugly.
Recursion is not something I recommend people actually use. Maybe you should know about it. But I just can't promote something that causes this much trouble. I laugh at people who say they think recursion gave them an aha moment when they finally got it. C'mon. If you think recursion is somehow inspiring, I think you should try something other than programming.
I've said it before and I'll say it again... why don't we learn about this stuff in University or in any programming course about what the execution point actually does? Is it taboo? I know you can't see it. Still, I think every course in programming should talk about it. I would have liked to have been told a long time ago that if you call a method in another object, then that object effectively becomes an internal object for the duration of the call, but without scoping being in effect. If someone had told me that, things would have been so much better. It would have made so much more sense. And it would mean a completely different way of programming. Maybe that's why it's not spoken. Dynamic hierarchies with hidden side-effects aren't things that are particularly sexy the programming language universe. We don't talk about it and pretend those problems don't exist. There's a real reason why PostMessage() returns immediately and why this is needed on top of SendMessage().
Now I can better describe the code example I gave above. If I have a main object that calls functionality in the Region object, then this Region object becomes an internal object to the main object. If the Region object then triggers an event in the main object, the main object becomes an internal object to the Region object. So we have a recursive hierarchy. The main object is both at the top and at the leaf of the call chain. We're only using two objects. Throw in hundreds more and you've got a very real problem. Ever see refresh problems in GUI's? Now you know why.
I'll end by quoting Stevey.
You do realize you can make a computing machine out of just about anything, right? And they don't all have to work like Turing machines do. Turing was one of the greatest geniuses of the century, but he'd have been the first person to tell you that there are infinitely many machine designs capable of the same computations. His was just one model, and his professor's (which led to Lisp) was just one other model. But who's to say they're the best models?


SachsenCoder # Tuesday, February 15, 2011 12:15:42 PM
The "event" keyword in C# is a big lie to the developer! It has nothing to do with an natural event. It's plain method delegation. That sucks really, as you described in this article!