Random Thoughts On How We Code
Saturday, 2. June 2007, 04:15:30
It may sound like a stupid question at first, but when you think about it, listing steps is not the best way to write software. Actually, I think it's one of the worst ways of programming. It took me a long time to resist that urge. Today, I do not do this. I haven't listed steps in a very, very long time. 99% of my programming time goes into design. Actual typing is quite minimal. Sitting in front of the computer screen actually slows me down. That's the reason I'm writing Project V. So that I can be productive in front of the computer for a change, but that's another story.
Let me put it this way. When someone tells you that they need a certain button to display a certain result, do you instantly think of how you would code that up or do you instead think of the information needed and how this will be coordinated with other parts that will also make use of this data? If you don't see the difference, then perhaps you never will. It's true. For many, many years, I could not have understood this. Over the years, I naturally evolved a different way of programming. One that requires very little time in front of the computer. Once I'm sitting in front of the computer, I'm pretty much done with all the programming. It's just data entry at this point. Don't think that my time spent typing is short. Rather, I spend much time on non-programming tasks such as getting other people's code to work properly with mine. Only in programming do people think this is part of the job. It's not. At least, it shouldn't be. No other field thinks this is normal. In other fields, this is wasted time. It's only done because someone messed up.
I'm finding that many other fields work in the exact opposite way that programming does. Programming usually thinks of how something should be coded up BEFORE thinking of what information is needed and how the different parts interact. In other fields, people think of what the final picture is and then proceed to deal with the steps that will accomplish it. Everyone knows this is true. In fact, most people hate that clients don't know what they want. This is not true. They're just not speaking your language of defining steps. Instead, they're telling you the overall picture. It may not be complete. But so what?
In a house, I can remove a wall and put up another one somewhere else within hours. Even with electrical, plumbing and all that, it can be done relatively fast. The floor, walls and roof of a house can go up in two weeks with only two people. Heck, I'm doing that now and the floor will be done in under three days (with one of those days completely wasted). Another three days and the walls should be completed or close to it. But if there's anything wrong, I can fix it. If you want a deck, patio, or extension, no problem. Why isn't software that way? No, it's not because software isn't tangible.
It's because of many factors. Your computer is really dumb. That's the main reason. It knows how to do absolutely nothing on its own. Zero vocabulary. Once a piece of software terminates, all its know-how goes with it. There's no way to evolve. Next is that there's no way to define the overall picture in the tool that we use to create software. NONE! No other field works this way. Engineers have plans. From these plans, they build things. I've never seen a real design in software ever. No one has. It just doesn't exist. You don't believe me? Show me a design that indicates exactly in your code what it does and how you can change it to something else without breaking anything. Can't do it, can you? Real designs work that way. You can look at it and locate anything you want.
I say again that 99% of coding is design. Unfortunately, design in software isn't what I know as a design. And frankly, it's tough creating one the way we write software today. Can I locate anything I want in my code? Sure. But I write my code in a way that you can patch into it. I have a list of core services and then you can add your own services or make use of those services. Anything I want to track down, I just look at those services, or at what makes use of them. Services are things that handle certain types of data. So if you want to use data, you must assign a worker to handle it. Little did I know this is exactly how the real world handles physical structures. Workers are assigned to manipulate the actual "thing" you want modified.
I've mentioned in the past why I thought this worked so much better. I've also asked a few questions about why it worked so well. In programming terms, there are plenty of real reasons why this system works better. But this new way of looking at things where you assign workers to do the work on data just makes sense. I never really thought about it, but all the code that I have where I queue a worker to a service that handles the data works so much better. When a button must do some task, the activation code should not do the work itself. It should queue a worker to the service that handles that data and if it needs to wait for the results, then assign another worker for that part and return immediately. Doing the work directly from the activation routine is just asking for trouble. You're basically assuming you have control of everything when you probably don't. Am I off base here? What I like is that if anything goes wrong, you know what service it was, what worker it was and what activated it.
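To make the button case concrete, here is a minimal sketch in Python. It is not how Project V does it; the names Service, enqueue, run_pending and the "counter" data are all made up for illustration. The activation routine only queues a worker and returns, and the service records which worker ran and what activated it.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Tuple

# Hypothetical names throughout: this is an illustration, not a real library.
Work = Callable[[Dict[str, int]], None]


@dataclass
class Service:
    """Owns one kind of data; only queued workers may touch it."""
    name: str
    data: Dict[str, int] = field(default_factory=dict)
    queue: Deque[Tuple[str, str, Work]] = field(default_factory=deque)
    log: List[str] = field(default_factory=list)

    def enqueue(self, worker_name: str, activated_by: str, work: Work) -> None:
        """Queue a worker; the caller returns immediately."""
        self.queue.append((worker_name, activated_by, work))

    def run_pending(self) -> None:
        """Run queued workers in order, recording service, worker and activator."""
        while self.queue:
            worker_name, activated_by, work = self.queue.popleft()
            try:
                work(self.data)
                self.log.append(f"ok: {self.name}/{worker_name} <- {activated_by}")
            except Exception as exc:
                self.log.append(
                    f"failed: {self.name}/{worker_name} <- {activated_by}: {exc}"
                )


counter_service = Service("counter")


def increment(data: Dict[str, int]) -> None:
    data["clicks"] = data.get("clicks", 0) + 1


def broken(data: Dict[str, int]) -> None:
    raise ValueError("bad input")


def on_button_click() -> None:
    # The activation routine does no work itself; it only queues a worker.
    counter_service.enqueue("increment", "button_click", increment)


on_button_click()
on_button_click()
counter_service.enqueue("broken", "debug_menu", broken)
counter_service.run_pending()

print(counter_service.data)
for entry in counter_service.log:
    print(entry)
```

When the broken worker fails, the log entry names the service, the worker and the activator, which is the traceability described above. A real version would run the queue on its own thread or event loop; here run_pending is called by hand to keep the sketch short.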
Every single time that I coordinated the data first and then assigned workers to those services, everything worked so much better and I could write MUCH larger software. The problem is that setting up these services along with resource handlers takes a long time. There are no built-in tools to let you program in this way. No really, there aren't any. Not where you can just click and it's done.
One thing that was missing in Project V that kind of annoyed me was what I was going to do with storage. You can't get away from it. All functional languages support storage nowadays too. In Project V, I was thinking too much in imperative terms of having components that retain data where you could request and update data (with lots of automatic management of this data). That has always bothered me. Now, I realise that a better way would be to assign workers to certain pieces of data depending on triggering information. This would allow yet another way to convert data by staying within the flow based methodology. So worker components can specify what data they need as inputs as well as the triggering information. When all the inputs are there, it can use and update the data. The outputs would also specify what services need updating. So coordinating data can be guaranteed to be consistent. The order in which the updating gets done is still the programmer's job to handle, but at least there's a way to get rid of most of the troublesome details. Plus, you can chain workers if you want. So you can guarantee order in that fashion. This means that certain workers can't be activated out of order. A protection that is lacking, well, pretty much everywhere unless you handle it manually.
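As a rough sketch of that idea (again in Python, and again with invented names such as WorkerComponent and Dataflow, since Project V's real interface isn't shown here), a worker can declare its inputs, its trigger and the services it updates. It only runs once every declared input has arrived.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class WorkerComponent:
    """Hypothetical worker: declares inputs, a trigger and the services it updates."""
    name: str
    inputs: Tuple[str, ...]
    trigger: str
    outputs: Tuple[str, ...]
    action: Callable[..., Dict[str, Any]]
    pending: Dict[str, Any] = field(default_factory=dict)

    def offer(self, key: str, value: Any) -> None:
        # Keep only the data this worker declared as an input.
        if key in self.inputs:
            self.pending[key] = value

    def ready(self) -> bool:
        return all(key in self.pending for key in self.inputs)


class Dataflow:
    def __init__(self) -> None:
        self.services: Dict[str, Any] = {}
        self.workers: List[WorkerComponent] = []

    def add(self, worker: WorkerComponent) -> None:
        self.workers.append(worker)

    def send(self, key: str, value: Any) -> None:
        for worker in self.workers:
            worker.offer(key, value)

    def fire(self, trigger: str) -> List[str]:
        """Activate every worker bound to this trigger whose inputs are all present."""
        fired = []
        for worker in self.workers:
            if worker.trigger == trigger and worker.ready():
                result = worker.action(**worker.pending)
                # Only the declared outputs are written back to services.
                for out in worker.outputs:
                    self.services[out] = result[out]
                worker.pending.clear()
                fired.append(worker.name)
        return fired


flow = Dataflow()
flow.add(WorkerComponent(
    name="total_price",
    inputs=("quantity", "unit_price"),
    trigger="recalculate",
    outputs=("order_total",),
    action=lambda quantity, unit_price: {"order_total": quantity * unit_price},
))

flow.send("quantity", 3)
first = flow.fire("recalculate")    # unit_price is missing, so nothing runs
flow.send("unit_price", 5)
second = flow.fire("recalculate")   # both inputs present, the worker runs

print(first, second, flow.services)
```

Because the outputs are declared up front, you can tell which services a worker is allowed to update without reading its body. Ordering between workers is still left to the programmer in this sketch.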
An example of chained workers is when drawing a line. The initial component is triggered by a MouseDown event. Then the chain is passed on to the next two workers who are triggered by MouseMove which loops back and MouseUp which ends the chain. If you get a MouseUp while waiting for a MouseDown, the MouseUp component will not activate. In current languages, you must manually check for this. Also, events are triggered separately. You usually have a routine for each one. So something gets activated no matter what. This cannot happen in Project V with chained worker components. With the web, this would be of great value just to keep track of sessions between forms and page views. If they clicked back on their browser, you'll know instantly that they went back and can correct for this. How many web programmers take this into account? How many times did you get messed up data before you did? You can link up your web pages in the same manner. Just click and drag and you're done. Because dataflow software is concurrent, you can have as many sessions as you wish handled automatically. No more waiting on something. Instead you chain the next worker.
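Here is a barebones sketch of the line-drawing chain in Python. The class LineChain and its methods are hypothetical and only imitate the behaviour by hand; in Project V the chaining would be part of the tool. The point is that only the workers currently armed in the chain can be activated by an event.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int]


class LineChain:
    """Hypothetical chain: MouseDown -> (MouseMove loops back | MouseUp ends)."""

    def __init__(self) -> None:
        self.points: List[Point] = []
        self.lines: List[List[Point]] = []
        # Only the workers listed here can be activated right now.
        self.armed: Dict[str, Callable[[Point], None]] = {"MouseDown": self.start}

    def start(self, p: Point) -> None:
        self.points = [p]
        # Pass the chain on to the next two workers.
        self.armed = {"MouseMove": self.extend, "MouseUp": self.finish}

    def extend(self, p: Point) -> None:
        self.points.append(p)
        # Loops back: the same two workers stay armed.

    def finish(self, p: Point) -> None:
        self.points.append(p)
        self.lines.append(self.points)
        self.points = []
        # End of the chain: wait for the next MouseDown.
        self.armed = {"MouseDown": self.start}

    def dispatch(self, event: str, p: Point) -> bool:
        """Return True if a worker was activated, False if the event was out of order."""
        worker = self.armed.get(event)
        if worker is None:
            return False
        worker(p)
        return True


chain = LineChain()
results = [
    chain.dispatch("MouseUp", (0, 0)),    # out of order: nothing activates
    chain.dispatch("MouseDown", (1, 1)),
    chain.dispatch("MouseMove", (2, 2)),
    chain.dispatch("MouseDown", (9, 9)),  # already drawing: nothing activates
    chain.dispatch("MouseUp", (3, 3)),
]

print(results)
print(chain.lines)
```

No handler contains an "am I currently drawing?" check. The out-of-order MouseUp and the second MouseDown simply have no armed worker to receive them.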
The more I look at this, the more it makes sense to work from the data first. It's no wonder that I used to hate web sessions and writing event handlers that must coordinate themselves. I was writing software backwards. It's still going on today and actively promoted. It's contrary to every known real world way of doing things. For me, I'm done banging my head against the wall. You can use these data-first techniques in your own software today. It takes a bit of setting up, but it's well worth it. You'll be glad you did.


Anonymous # 3. June 2007, 20:48
This is really interesting.
I've had the same thoughts, but I have just never really executed them in full.
It would be really interesting if you could show a concrete example (but of course, extremely barebones and simple) of your implementation of workers (preferably one where a chain is used, so one can see the interaction between the workers).
I like the idea very much, but I think I need to see a full example before I can begin thinking about how to put this into my own programming methods.
Overall, very interesting - and the analogies between real life construction and coding software also make a lot of sense.
Further writeups on this technique would be very much appreciated.
Anonymous # 4. June 2007, 11:54
What you are doing is a method for Topology Oriented Programming (I call it TOP):
Do the connections in the editor (when you are writing the text) instead of doing it in your head (when you are reading the text).
Or if someone makes a T-skirt:
POP
.Procedure
.Oriented
.Programming
The sound when
your head explodes
TOP
.Topology
.Oriented
.Programming
Let your editor
do the connections
instead
Anonymous # 4. June 2007, 12:15
Oh no! -POP-
I meant T-shirt :o)
Anonymous # 4. June 2007, 18:54
I could not get past the spelling error in the second sentence. How can you expect to be taken seriously when you can't spell?
Anonymous # 5. June 2007, 14:53
I'm not sure. I think the code is the design. Writing it and refactoring as we go creates the walls, decks, and in the end the house. Sure it's all in your head or on a napkin. Isn't the code just design the compiler can understand? The nail is something the wood can understand. The holes in the electrons something power can understand. Sure it's a good idea to measure twice and cut once. But with software you can get a new piece of wood really really fast.
Personally I had something similar with regard to data setup. I used the EasyMock stuff (love that dyno-proxy) for JUnit and interfaces for the data access so I could solve the problem of how to talk to the data or whatever insert service. That way I can focus on the small problems one at a time. I think we see all the little problems of the big picture all at once and get wrapped up in the details. This creates a large mess we can't seem to show the user fast. Seeing the big picture and then breaking out the small problems gets the design aka the code done fast. Now we can build walls and decks fast. BTW I'm not an Agile guy, I'm a do what fits guy for the client but end up using this way most of the time.
Vorlath # 5. June 2007, 19:58
From my point of view, code is just too slow of a mechanism to be productive. I'm searching for something faster. Something that doesn't slow me down. Something that is worthy of being used in the 21st century. We need something that can be added on with relative simplicity. Code is known to not have those qualities after a certain point.
Anonymous # 6. June 2007, 02:21
Vorlath, have you looked at graph transformation approaches to software engineering, and also at programming models that use pattern matching and term rewriting rules, like Maude? Personally I have had a look at them and their semantics are far enough away from the von Neumann paradigm: no control instructions (i.e. no flow), no threads, no variables as aliases to addressable locations in memory, etc. I think they can be the humus from which a high level programming model could be constructed. Also, Peter Wegner's work on Interactive Computation is interesting in the sense that he argues for a paradigm shift from algorithms to interactive components.
In fact I am working myself on a programming model where agents communicate through async message passing. Messages are typed semantically and structurally, and semantic subtypes must be structural subtypes. Each agent defines in its interface the types of messages it can receive as input and the types of messages it can emit as output. The runtime routes messages according to their structural/semantic type to all those agents that have declared that they receive messages of that type or a supertype as input, so decoupling is achieved, à la type-based subscribe or "implicit invocation". Finally, each agent is programmed using term-rewriting rules that transform the history of messages received (this is the state of the agent); some of these rules may also be defined to emit a message conforming to the agent's interface.
Would this approach be high level enough for you? If not, I would appreciate it if you could elaborate. Thanks a lot.
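The type-based routing part of this can be sketched in a few lines of Python. This is only an illustration under loud assumptions: ordinary class inheritance stands in for the semantic/structural subtyping, delivery is synchronous rather than async, the term-rewriting rules are left out entirely, and every name (Router, OrderEvent, the agent names) is invented.

```python
from typing import List, Tuple, Type


class Message:
    """Root message type (hypothetical hierarchy for illustration)."""


class OrderEvent(Message):
    pass


class OrderPlaced(OrderEvent):
    pass


class Router:
    """Delivers a message to every agent that declared its type or a supertype."""

    def __init__(self) -> None:
        self.subscriptions: List[Tuple[Type[Message], str, List[Message]]] = []

    def subscribe(self, msg_type: Type[Message], agent: str, inbox: List[Message]) -> None:
        self.subscriptions.append((msg_type, agent, inbox))

    def publish(self, message: Message) -> List[str]:
        delivered = []
        for msg_type, agent, inbox in self.subscriptions:
            # isinstance is true for the declared type and all of its subtypes.
            if isinstance(message, msg_type):
                inbox.append(message)  # the inbox is the agent's message history
                delivered.append(agent)
        return delivered


router = Router()
audit_inbox: List[Message] = []
billing_inbox: List[Message] = []
shipping_inbox: List[Message] = []
router.subscribe(Message, "audit", audit_inbox)
router.subscribe(OrderEvent, "billing", billing_inbox)
router.subscribe(OrderPlaced, "shipping", shipping_inbox)

to_general = router.publish(OrderEvent())
to_specific = router.publish(OrderPlaced())

print(to_general)
print(to_specific)
```

The sender never names a receiver; who gets the message depends only on the declared input types, which is the decoupling described above.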
Anonymous # 6. June 2007, 16:04
I think what is meant by "the source is the design" comes from the Jack Reeves articles.
http://www.developerdotstar.com/mag/articles/reeves_design_main.html
http://c2.com/xp/TheSourceCodeIsTheDesign.html
Maybe it's better to say, the source is the blueprint (it's always up to date). The point made here is, we don't deliver source code to clients. We give them executable programs. Anyway, it's nit-picky.
Back to the subject of design. If the actual writing of the source code is now "data entry", then why not make the computer do it? (I would, I'm a lazy developer, data entry sucks)
Hell, if the computer does it, why not skip the source code step and go straight to machine code? Now, your design documents probably need to be structured to simplify interpretation by the program that makes machine code from your design documents...
The rest of what you are describing in this post is what I would refer to as using the Unix philosophy. Make lots of small simple things that do one thing well. Then just wire them up.
Vladas # 6. June 2007, 17:11
First, let's take a simple example. There is UNIX, and many things within it are called "Unix System Design". On the other hand, there are many implementations of this design with absolutely DIFFERENT code. So, if the design is the code, which one of these implementations is the correct one?
In my experience, code can have not the slightest connection to the design. It has the same connection to it as a bird's excrement has to the bird and its design (I'm sorry for such a rude analogy).
The code can only FOLLOW the design, and can follow either in a bad or in a good way. This depends only on the implementor (coder).
The same connection, I daresay, is between programmers on one side, and coders or developers on the other.
I have designed and implemented a big ferry reservation system over the last 10 years. This is my main business now. Over the years it transformed from a simple terminal application, through a full-fledged Windows application, to a Web app. Obviously, the code has totally changed since the first version, but the initial design has not! It was so good that I follow it to this day.
As for the design-as-you-go principle, I think these are only words (or even a buzzword). In any case, before you "go-with-design" you must have a good initial design idea, which you can hold wherever you like: in your mind, on paper, as a block diagram... Otherwise it is not a design.
Excuse me for my possibly teaching tone, it's just a mental attitude.
Anonymous # 7. June 2007, 02:05
They are all correct. You can design many solutions to a problem. Sure, they all do the same thing but that is the specification.
Do all birds shit the same? But the food is the design, the specification is to shit because of the digestion.
In the end it either runs or it doesn't. If the design were that good, then the code would be done, because we (coders) would write a tool to write the code. UML and Rational want that. Those tools fail because you put design into the hands of the unqualified.
I don't understand your statement about the difference between programmers and coders/developers.
I would argue that your specification and requirement has not changed and that you changed your mind about the design and expressed your design with different code.
I just think that we may all be right. Like it or not, we all solve this stuff differently. I think that's because what we do is half Art, half Technology and 100% Invention. You have not finished solving the problem until it compiles and meets the spec. If we only had one solution to a problem the world would be fucking boring and lame.
Vladas # 7. June 2007, 07:18
In my work I give very little time and attention to coding (about 10-15% of my time). Most of the time I think about how to organize things in the program (i.e. how to design). My code sometimes looks ugly, and some folks still think that I cannot program. But it's just my tool to express the design. And this code just works. It's not intended to look nice or to be seen by others.
Having good coding skills is, of course, a must for a programmer, but that's not enough. There is more to programming.
You can have good painting technique, but that does not necessarily mean you are an artist.
Anonymous # 7. June 2007, 16:35
>If we only had one solution to a problem the world would fucking boring and lame.
Have you noticed in the database that this is, in fact, pretty much always the case?
If you don't follow the standard practices you're screwed.
Your application code can be cut down to nothing but some super objects if the way you store data in the database is not normalized. You can add and remove custom user input options at will once in the database and have them show up in your application with no additional work.
The only problem with that is searching your data requires self joins and is completely inefficient. In order to get around that you can store meta-data about your tables in the database and cut your application code way down as well, but then for pulling data you may end up doing lots more queries.
There should be ways to store data across a geometric surface other than a rectangle.
I mean, why should our application code have to account for every last detail in the database? I have seen pretty much all web based applications do this. It is just unmanageable.
Vladas # 7. June 2007, 20:57
"If you don't follow the standard practices you're screwed."
In fact, if you follow standards you are screwed.
Standards say that a database must be normalised, but no theory tells you that, in practice, the next thing you should do is very subtly denormalise it! Then you'll get faster reports and slightly slower input. But anyway, input is nothing compared to fast reporting.
Theory is always good until it meets practice.
Anonymous # 10. June 2007, 18:10
Josh:
Check out Content Repositories. http://jackrabbit.apache.org/
and http://jackrabbit.apache.org/doc/arch/overview/jcrlevels.html
I think this is what you are looking for.