Software Development

Correcting The Future

Just When I thought OO Couldn't Get DUMBER!

OO has got to be the greatest chameleon ever in the realm of programming techniques. I thought functional was bad, but I was incredulous at what I was reading in this article about OO.

On page 1, we get this classic quote:

First of all, think of an object-oriented system as a bunch of intelligent animals (the objects) inside your machine talking to each other by sending messages back and forth.


Uh... NO!

If you send messages back and forth, you end up with a stack overflow. That's the main problem with OO that I've been struggling with, though I have now overcome it. You often need to send a message and let the other animal process it later. Otherwise, the new animal can send something to another animal that may send it back to you, and then there's no way out. Happens all the time. As I've said many times, if it were just about sending messages around, I'd be all for it. But there's execution in there too. For some strange reason, no one talks about this beast and the problems it causes.
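Here is a minimal sketch of that problem, in Python rather than Java, with made-up class names (SyncAnimal, QueuedAnimal). It assumes "sending a message" means an ordinary method call. The first pair of animals answers every message immediately and runs out of stack. The second pair posts replies to a queue that a flat loop drains later.

```python
from collections import deque


class SyncAnimal:
    """Handles every message immediately by calling the peer's method."""

    def __init__(self, name):
        self.name = name
        self.peer = None

    def receive(self, count):
        # Replying is a nested call, so each exchange adds a stack frame.
        self.peer.receive(count + 1)


class QueuedAnimal:
    """Posts replies to a shared queue instead of calling the peer directly."""

    def __init__(self, name, queue, limit):
        self.name = name
        self.peer = None
        self.queue = queue
        self.limit = limit
        self.handled = 0

    def receive(self, count):
        self.handled += 1
        if count < self.limit:
            # Defer the reply: the dispatcher delivers it after we return.
            self.queue.append((self.peer, count + 1))


def run_sync():
    a, b = SyncAnimal("a"), SyncAnimal("b")
    a.peer, b.peer = b, a
    try:
        a.receive(0)
    except RecursionError:
        return "stack overflow"
    return "finished"


def run_queued(limit=100_000):
    queue = deque()
    a, b = QueuedAnimal("a", queue, limit), QueuedAnimal("b", queue, limit)
    a.peer, b.peer = b, a
    queue.append((a, 0))
    while queue:  # one flat loop, so the stack depth never grows
        target, count = queue.popleft()
        target.receive(count)
    return a.handled + b.handled
```

The queued version handles as many exchanges as you like because each receive() returns before the next message is delivered.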

Page 2, we get this:


1. All data is private. Period. (This rule applies to all implementation details, not just the data.)
2. get and set functions are evil. (They're just elaborate ways to make the data public.)
3. Never ask an object for the information you need to do something; rather, ask the object that has the information to do the work for you.
4. It must be possible to make any change to the way an object is implemented, no matter how significant that change may be, by modifying the single class that defines that object.
5. All objects must provide their own UI.


Ok, if this is OO then it's the most flawed system of all. Let's go through each of these and see why this is so.


Point #1 (All data is private)

This is nonsense. If all data was private, there'd be no way to even send a message. Second, private data should NEVER exist. It's a flaw in OO that no one talks about. If you derive a class, you should have access to ALL its data so that you can update its functionality. Protected is fine, but private is fatal. The 'private' keyword should never exist in OOP. There's simply no use for it. It can only serve to limit the extensibility of an object.

Point #2 (get and set functions are evil)

WTF? I prefer properties, but this is just ridiculous. Exposing functions is just as bad as exposing getters and setters, because functions require data just as much as getters and setters do. If you lock in what functions need, then getters and setters are the least of your worries. Besides, there are plenty of objects that have settings such as position and size for window GUI elements. I can think of plenty of examples where settings are found in objects. How about a preferences object!?

Point #3 (Ask not what you can do for your object, but what your object can do for you)

So I'm paraphrasing, but this is crazy. How exactly are you to follow this rule without breaking rule #1 or #2? If you are operating on two objects of the same type, then the private keyword becomes meaningless, yet if one of these objects is a derived type, then you've got a possibility of problems. If the objects are of different types, then you must break encapsulation. So it's not just about asking the object to do something for you. Anything worthwhile must involve different objects. What object takes the action? Not clear, is it?

Point #4 (Must be able to change the implementation by changing only the class of the object)

Well, it's nice and all, but it's not always possible. For example, you can't derive a class and change the implementation if the base class has private members. So again, rule #4 conflicts with rule #1 and rule #2.

Point #5 (All objects must provide their own UI)

ALL objects? ALL of them? Why? No reason for this.


These rules are followed up by this little gem.

If the system doesn't follow these rules, it isn't object-oriented. It's that simple.


I think someone forgot to tell the author of this quote that his rules are impossible to follow. Either his rules are wrong or OOP doesn't exist. Or I could be wrong, and I hope I am, but the problems I've listed are simple and common.

The idea is to organize the inevitable complexity inherent in real computer programs, not to eliminate it. Object-oriented designers, as a class, consider the elimination of complexity to be an impossible goal.


That says a lot right there. At least the author admits it. That deserves some amount of respect. But the above quote should be a deal killer under normal circumstances.

If anything, good object-oriented systems are more complex than procedural ones, but in such systems the program is better organized and thus easier to maintain.


So more complex < less complex. I think the author is saying that 2 < 1. I call BS on this, sorry.

For example, an object-oriented solution to the problems I just discussed requires a Name class, objects of which know how to both display and initialize themselves. You would display the name by saying "display yourself over there," passing in a Graphics object, or perhaps a Container to which the name could drop in a JPanel that displayed the name.


This is wrong on so many levels. First off, the "over there" in "display yourself over there" is under the care of another object. So who should do the displaying? The object in charge of the display or the object that has the information? See, whatever way you do it, you break rule #1 or rule #2. You MUST break encapsulation at some point. And rule #3 is untenable when multiple objects need to coordinate. What if the object with the data doesn't know how to deal with a Graphics object? That's the problem with OOP. If you make it handle a Graphics object, then you introduce coupling that is difficult to change later on. That's why you use an interface. But interfaces aren't found in any of the rules. Interfaces provide their own coupling by adding extra definitions that may or may not be implemented by external objects. Interfaces only reverse the delegation anyhow when used in this manner, so most of the rules end up being broken.

To see how an object-oriented point of view can solve the problems I've just recited, let's recast the earlier problem in an object-oriented way, by looking at the system as a set of cooperating objects that have certain capabilities.


This is again the root of the problem. What object should do the task of cooperating? Which one does the action and how do you access the data in the other object considering the rules laid out above? It can't be done. I also reject the notion of passing entire objects as messages. It's this kind of backwards thinking that creates all kinds of maintenance nightmares. You should have objects and you should have serialized data messages. That way, there's no encapsulation to break. Objects simply send data between each other. These messages can and should have functionality to store and extract information. It still doesn't solve the problem of WHICH object completes the task, but at least you now have a communication system that can expand.
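To make the idea of a serialized data message concrete, here is a small sketch. The message layout, the class name WithdrawalMessage and the store/extract method names are all assumptions for illustration. The point is only that the helper turns values into raw bytes and back, and never touches a socket or another object.

```python
import struct


class WithdrawalMessage:
    """Helper that packs and unpacks a hypothetical withdrawal request.

    Assumed wire layout (network byte order): 1-byte message type,
    4-byte account id, 4-byte amount in cents.
    """

    FORMAT = struct.Struct("!BII")
    TYPE_WITHDRAW = 1

    @classmethod
    def store(cls, account_id, amount_cents):
        """Return the raw bytes for one withdrawal request."""
        return cls.FORMAT.pack(cls.TYPE_WITHDRAW, account_id, amount_cents)

    @classmethod
    def extract(cls, data):
        """Return (account_id, amount_cents) parsed from raw bytes."""
        msg_type, account_id, amount_cents = cls.FORMAT.unpack(data)
        if msg_type != cls.TYPE_WITHDRAW:
            raise ValueError(f"unexpected message type {msg_type}")
        return account_id, amount_cents
```

Anything that can read nine bytes in this layout can take part, whatever language or runtime it uses.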


I just had to quote this about an ATM example from the same article.


4. The Teller object asks the Bank_records object for an empty Withdrawal_slip. (This object will be an instance of some class that implements the Withdrawal_slip interface, and will be passed from the Bank_records object to the Teller object by value (using RMI). That's important. All the Teller knows about the object is the interface it implements -- the implementation (the class file) comes across the wire along with the object itself, so the Teller has no way of determining how the object will actually process the messages sent to it. This abstraction is a good thing because it lets us change the way the Withdrawal_slip object works without having to change the Teller definition.)


This is specific to Java because it has a JVM and is completely unrealistic. RMI? Really? I've used RMI. It sucks donkey balls. It's complete garbage. And sending an object by value through RMI? That's a contradiction in itself, but whatever.

Here's the problem with this scenario. First, as I said, it's specific to the JVM. Second, requiring a platform such as the JVM causes coupling and exposes implementation details of your object. Case in point, what if someone makes an ATM that has no VM, but uses C++ instead? Or what if they use a different VM? See, doesn't work now does it?

So this brilliant abstraction is dependent on the fact that you expose implementation details of what executes the code. You're basically turning your object inside out, and for what purpose? To easily change an implementation of a shared object? To leave out the machine description is one of the most short-sighted things you can do. Look at P2P. Any machine may link up to the network. The protocol is what's important, not the platform. In the above example, the protocol is RMI. Is that a good choice? I don't think so. Not if you want extensibility.

The main thing to notice in this second protocol is that all knowledge of how a balance or PIN is stored, how the server decides whether or not it's okay to dispense money, and so forth, is hidden inside the various objects.


Who thinks like this? Really! I'm literally smacking my hand on my forehead at how dumb this is.

So what the author is trying to tell me is that instead of using an open protocol, you should use distributed OO objects instead. This is EXACTLY the problem with OOP. Its message passing facilities do NOT extend to a distributed environment, especially when you want different kinds of machines to work together. In distributed environments, there is no need to return from a method. A node has no need to respond to anything if it doesn't want to. Who handles the timeouts? How long do you wait before retrying? How many times do you retry before giving up? How do you write code that handles a disconnect? What do you do in such a scenario? These things can't be encapsulated because you need to handle these events directly.
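As a rough sketch of what handling these events directly can look like, the caller below owns the retry policy itself. The names send_with_retries and FlakyLink are hypothetical, and the link is simulated in memory rather than being a real network connection.

```python
def send_with_retries(send, payload, max_attempts=3):
    """Call send(payload) until it succeeds or the attempts run out.

    `send` is a hypothetical transport function that raises TimeoutError
    or ConnectionError when the remote side does not answer.
    Returns (reply, attempts_used) and re-raises the last error on failure.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return send(payload), attempt
        except (TimeoutError, ConnectionError) as error:
            last_error = error  # the caller decides the policy, not the callee
    raise last_error


class FlakyLink:
    """Simulated link that times out a fixed number of times, then answers."""

    def __init__(self, failures):
        self.failures = failures

    def __call__(self, payload):
        if self.failures > 0:
            self.failures -= 1
            raise TimeoutError("no reply")
        return b"ack:" + payload
```

None of this policy can hide behind a plain method call, which is the mismatch described above.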

All of the things the author is trying to simplify actually get worse. Much much worse. The example itself is bad because the author is trying to extend a local message passing system to a distributed environment without taking care of the extra things that can happen. Distributed environments are different. RMI is a shoddy system at best. It's not like you can simply take a system that works locally, move part of it to another machine by using RMI and think that everything is just fine. Message passing of a language does not make a protocol. They are two entirely different ball games.


This notion of asking an object to do something is pure fantasy. An object can't really do anything by itself. Most of the time, you need two different objects to interact with each other. So who breaks whose encapsulation and who actually performs the action? Should a String class know how to display itself? Should a Display class know how to show a String? OOP doesn't provide answers for this. This is why I say that OOP necessarily introduces extra coupling that isn't really needed. At some point, you're gonna have to deal with the raw data. You can't use just methods. It's impossible. Try to add two numbers using only functions. It's impossible. At some point, you're going to have to pass some actual numbers around.

All the best systems involve decoupling different sets of functionality from each other. They communicate via data only (and NOT by methods or OOP message passing). Each system then takes a turn to process its data (or messages). Helper classes can be used to process incoming and outgoing data. This still provides all the benefits of using objects while decoupling your systems. It's basically data flow, but done manually. There are very real reasons for this and I hope the problems listed above show why.


Comments

I'd like to see the proof for “if you send messages back and forth, you end up with a stack overflow” because I certainly can't see how sending messages back and forth leads to a stack overflow. I've worked on a couple of message-passing operating systems, AmigaOS and QNX, and never had that problem. AmigaOS has asynchronous message passing; QNX has synchronous message passing, but either form can simulate the other, so neither is really more primitive than the other.

Your example with the JVM is also flawed. I can't use your C++ code because it's dependent upon Intel, whereas I'm using a PowerPC. Yes, requiring the JVM causes coupling and exposes implementation details of the object. But so does using the Pentium. Or PowerPC. Or any platform for that matter (hey! It's Project V! Why can't we use cheap Apple ][s for our ATM? That's just an implementation detail, right?) Also, the last paragraph in that section (the one that starts “So this brilliant abstraction is dependent”) just seems to contradict itself all over the place. For instance, you say “To leave out the machine description is one of the most short-sighted things you can do,” but if you are using the JVM, you aren't leaving out the machine description (no, you really aren't: the JVM is a machine, albeit one usually implemented in software. You do realize there have been physical, chip-based manifestations of the JVM, right?). Also, if the protocol is the thing, then why is RMI bad? RMI is a protocol, right? I think your hatred of Java is blinding you.

By spc476, # 27. March 2008, 20:54:26

Nice article. Now I can see why I intuitively disliked classic OO. I felt the same when I saw getters and setters, because I didn't understand why I couldn't just access properties. One may say they are needed to properly encapsulate (and interpret) the object data. Why on earth would I need that? Getters/setters are only acceptable when we work in large teams. But even in this case, couldn't we just use a good data specification, set out in a separate place (or object)? Why mix the data and its specification in one object?!!! This is the most wrong thing with OO, I think.

I could compare OOP with SQL and the relational database model (mathematics). In the database world, if you strictly follow the relational model you have the same set of problems. When you work with a database at a low level (4GL or native calls), the overall system performance is WAY better. If you normalize the database to the end in order to use SQL, you get slow. This is a confirmed fact. But this is not only about low level. This is also about the structure of the data. If you'd like to achieve better performance you should always slightly denormalize databases. And then it fits badly into the SQL (and relational) model.

One example of SQL's shortcomings:

In 4GL you can loop over a subset of a data table and then break out of the iteration under some condition. There is no way to do this in SQL. In SQL you must always prefetch the whole subset of queried data, even if you need only a small amount of it in some cases. Yes, I know, there are conditionals (functions) in the SQL language, but they are static (compiled before the query).

I feel the same way when I must use strict (pre-compiled) OO. That's why I like a dynamic OO language like JavaScript, where I can organize my objects and data as best fits my needs and my application's. I can rearrange objects, their implementation and other things on the fly, exactly as I need at that time.

The theory is always good until it faces practice. In practice world (are we there?) there should be no place for overcomplicating theories.

Once again, nice take.

By vladas, # 28. March 2008, 08:10:37

Yoni Rapoport writes:

I think you've got a couple of things wrong about OOP. Firstly, OOP is about minimizing data flow and not necessarily eliminating data flow completely (which is impossible, as you've mentioned). Passing messages which contain complex data structures is unnecessary, error prone and difficult to maintain, as it poses the challenge of having both sender and receiver fully "understand" the complex data, its validity and syntax.

Graphics.DrawString(string str) will always be easier to use and implement than Graphics.DrawString(string str, Font font, Color color, Style style). As more data is passed, more work is required to construct the message, deconstruct it, and handle all possible errors. OOP offers a way to deconstruct large messages into smaller ones by dividing data among objects and having them communicate by sending as little data as possible and expressing intent by choosing which messages to send as opposed to passing more and more data.

You are also wrong to think private members should be avoided. Protected members are actually public to derived classes and that makes them just as bad as public members.

By anonymous user, # 28. March 2008, 15:11:40

spc476: Messages in Java are function calls. They are not asynchronous. Recursive function calls will cause a stack overflow or you will run out of memory.

Yes, requiring the JVM causes coupling and exposes implementation details of the object. But so does using the Pentium. Or PowerPC. Or any platform for that matter (hey! It's Project V! Why can't we use cheap Apple ][s for our ATM? That's just an implementation detail, right?)


You can use anything you want with Project V. It doesn't even need to be Project V just as long as you can understand the protocol. That's my point all along. If you have a protocol, it doesn't matter what you use. BitTorrent works with Java as with Azureus, or Python as with ABC, or C++ as with uTorrent. Not so with Java messages or C++ messages or Smalltalk messages.

Also, if the protocol is the thing, then why is RMI bad? RMI is a protocol, right? I think your hatred of Java is blinding you.


I used RMI back when I liked Java. I hated RMI back then because it was so horrible. It's still not any better today. Also, it's not simply a protocol. It involves sending code over the wire. And it fails a LOT. So there's a lot more going on than just a protocol. There's a runtime involved there too. RMI without the runtime won't work. That's why it's called Remote Method Invocation.

Vladas: In today's world, there's no reason why you can't use properties instead of getters and setters. Properties can internally call functions when read or written to, yet they are accessed just like members. You can even restrict whether the property is read-only, write-only or some other combination. This is a lot more intuitive. But as far as getters and setters go, there is no reason why they should be more harmful than methods. I've yet to see a valid argument for why hiding ALL data is good, especially when it has to do with settings. I'm glad you agree. I wonder what causes this negative view of data settings.
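For anyone who hasn't used properties, here is a short Python sketch. GuiElement and its fields are invented for the example. The position property runs code on every write but reads like a plain member, and area has no setter, which makes it read-only.

```python
class GuiElement:
    """Hypothetical on-screen element exposing settings as properties."""

    def __init__(self, x, y, width, height):
        self._x, self._y = x, y
        self._width, self._height = width, height
        self.redraws = 0

    @property
    def position(self):
        return (self._x, self._y)

    @position.setter
    def position(self, value):
        # Looks like a plain assignment to the caller, but runs code.
        self._x, self._y = value
        self.redraws += 1

    @property
    def area(self):
        # No setter is defined, so this property is read-only.
        return self._width * self._height
```

Writing element.position = (5, 6) goes through the setter, while assigning to element.area raises AttributeError.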

Yoni: What you describe are helper classes. These come from the 70's procedural way of programming. It's a very old technique and predates OOP. It's called structured programming. I should add that I'm in complete agreement with you. Just realise that this is nothing more than a glorified data container and is not what the author of the linked article says OO is about.

You are also wrong to think private members should be avoided. Protected members are actually public to derived classes and that makes them just as bad as public members.


No, the 'private' keyword should NEVER be used in any software, ever. If you believe that you should use the 'private' keyword, then get rid of subclassing completely. I'm sorry, but you're using two contradictory arguments at the same time. Pick one or the other, but not both.

By Vorlath, # 28. March 2008, 16:45:57

I thought of writing something like "Stacks considered harmful" (in parallel world, indeed) article a while ago. And I've even started writing it. Will finish soon. I received more arguments after reading you.

By vladas, # 28. March 2008, 23:48:16

Yoni Rapoport writes:

I really think encapsulation is more than just glorified data containers. Once you start using objects which not only contain data members but also do not expose them in any way, which will allow internal changes without breaking external code, you will see the power of OOP.

Mind you that I am not saying you should pass raw data and then use the same "helper" class on both sides of the communication - that simply goes against DRY (don't repeat yourself) and is also error-prone and difficult to maintain. I'm talking about encapsulating the raw data and supplying the object which holds it with various other objects which contain much smaller methods - breaking one large message into many smaller ones.

Regarding the issue of private members I forgot to mention I also believe that implementation inheritance is evil (http://www.javaworld.com/javaworld/jw-08-2003/jw-0801-toolbox.html). However, in the rare exceptions when I use abstract classes instead of interfaces I think that communication between base and subclass should be via overriding abstract protected methods and not by directly accessing data members of the base class. This is very possible and does not contradict subclassing.

I would also like to comment that while I agree that adding two numbers cannot be done without breaking encapsulation - most algorithms contain conditions, iterations and complex data structures, all of which can and should enjoy encapsulation.

By anonymous user, # 29. March 2008, 10:16:51

I was talking about encapsulation as far as messages were concerned. I also disagree with your notion of "without breaking external code". That can rarely be done for the reasons I gave above. Interesting things only happen when multiple objects interact. Usually, this involves objects that have no knowledge of each other. So how do you make them work together? If you're talking about "asking an object to do an action", then you fall flat on your ass because this object knows nothing of the other object.

Mind you that I am not saying you should pass raw data and then use the same "helper" class on both sides of the communication - that simply goes against DRY (don't repeat yourself) and is also error-prone and difficult to maintain.


Actually, I've started doing this extensively. It's awesome. I don't know why you think it's error prone. I've found the exact opposite. It reduces coupling, reduces errors and means that if I need to expand my protocol, I only need to change the helper object.

I'm talking about encapsulating the raw data and supplying the object which holds it with various other objects which contain much smaller methods - breaking one large message into many smaller ones.


This is nice and all. My only issue is that you cannot send methods over the Internet unless you have a compatible runtime on the other side. That's why protocols and OOP messages are two different things. Heck, we don't even know the internals of how most OOP messages are handled, though in C++ it's not too hard to figure out, even with name mangling and all.

Regarding the issue of private members I forgot to mention I also believe that implementation inheritance is evil


I agree with this. If you don't like inheritance, then your position is indeed consistent. Sorry about that. I'm gonna check out your link after I finish replying here.

I think that communication between base and subclass should be via overriding abstract protected methods and not by directly accessing data members of the base class.


Looks like I spoke too soon. How can you change the behaviour through methods if there's something you need to update, but there is no method that will accomplish this in the proper way? See, this is the biggest reason why reuse is a no go in OOP most of the time. You end up rewriting from scratch because there is no other option. But if the members weren't private, there'd be no problem. Inheritance is different than delegation. If you're gonna use inheritance simply as delegation of functionality, then there'd be no reason for inheritance in the first place.

I would also like to comment that while I agree that adding two numbers cannot be done without breaking encapsulation - most algorithms contain conditions, iterations and complex data structures, all of which can and should enjoy encapsulation.


I'm in full agreement. I just see too many examples where there is no concrete methodology as to the "where" and "when" of encapsulation. The items listed above in this very reply contain some of my objections towards this. Frankly, I believe that the main issue with OOP is with unwanted recursion and with decoupling of systems. Those are topics that should be talked about, but these have to do with flaws in OOP. So these issues get left behind.

By Vorlath, # 30. March 2008, 06:12:17

Yoni Rapoport writes:

"Usually, this involves objects that have no knowledge of each other. So how do you make them work together? If you're talking about "asking an object to do an action", then you fall flat on your ass because this object known nothing of the other object."

Objects know each other by interface. This means they are able to pass messages which may include references to other objects or (god forbid) even raw data. It is usually best to ask an object to do some work by calling a method and optionally sending in an object, rather than asking the object for data - because changing the implementation of a method without breaking external code is easier than changing the structure of raw data.
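A rough illustration of objects knowing each other only by interface, using the Name example from the article. The Surface protocol, its draw_text method and ListSurface are assumptions made up for this sketch, not anything from the article's code.

```python
from typing import Protocol


class Surface(Protocol):
    """What a Name needs from wherever it is displayed (hypothetical)."""

    def draw_text(self, text: str) -> None: ...


class Name:
    """Keeps its parts hidden and renders itself on any Surface."""

    def __init__(self, first: str, last: str) -> None:
        self._first = first
        self._last = last

    def display(self, surface: Surface) -> None:
        # The caller never sees how the name is stored.
        surface.draw_text(f"{self._first} {self._last}")


class ListSurface:
    """Stand-in surface that records what was drawn."""

    def __init__(self) -> None:
        self.lines = []

    def draw_text(self, text: str) -> None:
        self.lines.append(text)
```

Name could switch to storing a single string internally and no caller of display() would need to change.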

"Actually, I've started doing this extensively. It's awesome."

We are almost talking about the same thing... What if you could have your helper class not only encapsulate the process of reading and writing the data - but also encapsulate the data itself. Simply arrange for that class to have methods which send the data over the network and also have it register for receiving the data on the other end. This way the structure of the raw data can change without ever affecting external code.

"My only issue is that you cannot send methods over the Internet unless you have a compatible runtime on the other side."

I didn't say anything about the Internet, I'm only talking about software design. Design has nothing to do with network issues - it's about removing duplication, making it easy to change code bases and facilitating code reuse. OOP is not a distributed programming methodology, though I do believe OO concepts can be translated into good distributed programming practices.

"Inheritance is different than delegation. If you're gonna use inheritance simply as delegation of functionality, then there'd be no reason for inheritance in the first place."

The main point of inheritance is polymorphism. I don't use inheritance to gain access to a class's internal structure but rather to impersonate a base class (or usually an interface) in order to extend the functionality of a code base (which uses the base class) without changing it.

By anonymous user, # 30. March 2008, 20:29:49

What if you could have your helper class not only encapsulate the process of reading and writing the data - but also encapsulate the data itself.


No, the exact reason why I'm using helper classes is so that the data isn't encapsulated. Instead, I give it the raw data and it deals with it. I can reuse the same object over and over again. I can also use it in different projects so that the two can communicate together via any interface such as TCP/IP or even a serial cable. It doesn't matter.

Simply arrange for that class to have methods which send the data over the network and also have it register for receiving the data on the other end. This way the structure of the raw data can change without ever affecting external code.


No, the problem here is the method itself. The method should NEVER send the data. The helper class is simply to manipulate the data. NEVER to actually interact with anything else. It should only be used as a delegate. The socket engine would use the helper object to extract the information it needs and would then send that data out. What you're talking about simply does not work in large projects. In projects with fewer than 5000 lines maybe. Over that, with projects that I've been involved with that deal with 20,000 or often millions of lines of code, it simply does not work.

If you do try what you're talking about, you must include two different sets of functionality. Not only does it need to handle its own data, but it must know how to interface with external objects, thus increasing coupling. With helper objects, the helper objects are 100% independent without any coupling at all. I then use classes in the conventional way to create a link between two or more helper classes in order to add more functionality. These can sometimes be super helper classes. But most often, they are independent systems that NEVER call other systems. Instead, you use data only message passing (as in TCP/IP or streams, but never OOP messages).

Another problem is that you still haven't talked about WHO does the action? Do you ask the object that has the data or the object that handles the output resource such as the screen? Even if not dealing with a screen, destination of the data is still important. So do you ask object1 to do an action with object2 as an input? Or do you ask object2 to do an action with object1 as input? That's the problem right there. It should be NEITHER! I can't tell you how many times I ran into this problem. Each set of data should have helper classes where you pass to it the data set. Then what regular objects do is use combinations of data sets with the help of helper classes. This reduces coupling as far as possible (while staying with language facilities such as OOP message passing). If you use data only message passing instead of OOP, coupling is further reduced and maintenance is extremely easy because you know that each system won't be affected by changes in another system.

I didn't say anything about the Internet, I'm only talking about software design. Design has nothing to do with network issues - it's about removing duplication, making it easy to change code bases and facilitating code reuse. OOP is not a distributed programming methodology, though I do believe OO concepts can be translated into good distributed programming practices.


I agree with the first part of this quote, but I strongly disagree with the last part. OOP can NEVER translate to distributed programming practices. If it did, concurrency wouldn't be such a hot topic. RPC, RMI and other techniques have always proven to be terrible. There's a real reason for that. Mainly, it's because there's an impedance mismatch. OOP is about requests and responses. The web is actually modelled around this and we're seeing the effects of that today as people try to come up with Web 2.0 (which died 20 years before its inception). But the main point is that request/response doesn't work well in distributed environments. What does work well is data message passing. While there is handshaking and SOME back and forth, the main point is that the back and forth is about maintaining the protocol and not about the actual information passed along. Take emails. You don't care much about a response. You just want to send the data. Take P2P. You don't care about receiving notifications. But you do want to send out what you want as well as receiving what you need, but the requests and responses aren't coupled. If one node doesn't give an answer, it will ask someone else. That's the missing ingredient with OOP messages. You must wait until you get a response and you MUST get an answer specifically from the entity where you made the request. That fails in concurrent environments.
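A tiny sketch of the P2P point, with peers simulated as plain callables (an assumption for illustration, not any real BitTorrent API). The requester doesn't care which node answers, and a silent node just means it asks the next one.

```python
def fetch_from_any(peers, piece_id):
    """Ask each peer in turn for a piece and move on when one stays silent.

    `peers` is a list of hypothetical callables that return the piece as
    bytes, or None when they have nothing to offer or did not answer.
    Returns (peer_index, data), or (None, None) if no peer supplied it.
    """
    for index, peer in enumerate(peers):
        data = peer(piece_id)
        if data is not None:
            return index, data
    return None, None
```

A method call on one specific object has no equivalent of "ask someone else".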

The main point of inheritance is polymorphism. I don't use inheritance to gain access to a class's internal structure but rather to impersonate a base class (or usually an interface) in order to extend the functionality of a code base (which uses the base class) without changing it.


Sorry, but that position is untenable. You cannot change its functionality without having access to its data. If you can, you are not doing anything interesting and are probably doing it wrong. Take a complex number for example. If I want to extend its functionality by adding rotation or retrieving its angle and scalar value, then I'm out of luck. I have no access to the data. So I have to start from scratch. If you can simply use the functionality that's already there, there's no need for subclassing. You can just use delegation.
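The complex number case can be sketched as follows. This is an illustration only: Complex and PolarComplex are hypothetical classes, and the base deliberately exposes no accessors, so the subclass can only add angle, magnitude and rotation because the members are protected rather than private.

```cpp
#include <cmath>

// Hypothetical base class with no getters; the data is protected.
class Complex {
public:
    Complex(double re, double im) : re_(re), im_(im) {}

    Complex Add(const Complex& o) const {
        return Complex(re_ + o.re_, im_ + o.im_);
    }

protected:
    double re_;
    double im_;
};

// Extension that needs the raw components to do its work.
class PolarComplex : public Complex {
public:
    PolarComplex(double re, double im) : Complex(re, im) {}

    // Scalar value (modulus) of the number.
    double Magnitude() const { return std::hypot(re_, im_); }

    // Angle in radians, measured from the positive real axis.
    double Angle() const { return std::atan2(im_, re_); }

    // Rotate in place by `radians` around the origin.
    void Rotate(double radians) {
        const double c = std::cos(radians);
        const double s = std::sin(radians);
        const double re = re_ * c - im_ * s;
        im_ = re_ * s + im_ * c;
        re_ = re;
    }
};
```

With re_ and im_ declared private instead, PolarComplex would not compile, which is the situation the comment objects to.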

I'm writing a GUI system right now and have a base class for on-screen GUI Elements. I need access to the data contained there. I need access to the position. I can't just use methods to manipulate this because the SetBounds method does things that I don't want it to do while in the MouseMove event. SetBounds will cause a screen redraw. In the MouseMove event, I may want to trigger this redraw later on or not at all. I also need to update the internal state of the object so that the redraw system and the GUI system are consistent. And adding methods to the base class makes no sense when that method belongs to and is used by only ONE of the subclasses.
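A rough sketch of that GUI situation, under stated assumptions: GuiElement, DraggablePanel, OnMouseMove and EndDrag are hypothetical names, and Redraw only counts calls in place of real drawing. The accessors exist purely so the effect can be observed.

```cpp
// Hypothetical base class for on-screen elements.
class GuiElement {
public:
    virtual ~GuiElement() = default;

    // Public path: update the bounds and redraw immediately.
    void SetBounds(int x, int y, int w, int h) {
        x_ = x; y_ = y; w_ = w; h_ = h;
        Redraw();
    }

    int X() const { return x_; }
    int Y() const { return y_; }
    int RedrawCount() const { return redraws_; }

protected:
    // Stand-in for real drawing; counts calls so the effect is visible.
    void Redraw() { ++redraws_; }

    int x_ = 0, y_ = 0, w_ = 0, h_ = 0;
    int redraws_ = 0;
};

// The one subclass that needs to move without redrawing on every event.
class DraggablePanel : public GuiElement {
public:
    // Update the protected position directly; no redraw is triggered here.
    void OnMouseMove(int dx, int dy) {
        x_ += dx;
        y_ += dy;
        dirty_ = true;
    }

    // Redraw once at the end of the drag, and only if something moved.
    void EndDrag() {
        if (dirty_) {
            Redraw();
            dirty_ = false;
        }
    }

protected:
    bool dirty_ = false;
};
```

The deferred redraw lives entirely in the subclass, so the base class gains no method that only one subclass would use.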

By Vorlath, # 31. March 2008, 15:23:48

Yoni Rapoport writes:

"Each set of data should have helper classes where you pass to it the data set. Then what regular objects do is use combinations of data sets with the help of helper classes."

And then there were three!

Now you have a helper class, a data set class and a socket engine class which uses them both. Let's say a new data item is introduced into the data set:
1. The data set class has to change to include the new data.
2. The helper class needs to change because the new data will probably require some attention and manipulation.
3. The socket engine (!!!) needs to change because that class sends the data by adding each item to the transmission.

If this is close to the situation you are referring to then you cannot claim it is decoupled.

A more OO solution would be to combine the data set and helper classes into one class which may include a method:

void sendData(DataSender dataSender)

DataSender may include a method:

void SendBytes(byte[] bytes)

or whatever... And one implementation may be the socket engine.

This way whenever the data set changes only one class would change. The socket engine will only change once something changes in the way socket data transfer works. This is decoupling.
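The proposal above can be sketched in C++ roughly like this. Only the method names sendData and SendBytes come from the comment; PlayerRecord, RecordingSender and the byte layout (name, a zero terminator, then one level byte) are assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Interface the data set talks to; a socket engine could be one implementation.
class DataSender {
public:
    virtual ~DataSender() = default;
    virtual void SendBytes(const std::vector<std::uint8_t>& bytes) = 0;
};

// Hypothetical data set that knows how to serialize itself.
class PlayerRecord {
public:
    PlayerRecord(std::string name, std::uint8_t level)
        : name_(std::move(name)), level_(level) {}

    // Assumed layout: name bytes, a 0 terminator, then the level.
    void SendData(DataSender& sender) const {
        std::vector<std::uint8_t> bytes(name_.begin(), name_.end());
        bytes.push_back(0);
        bytes.push_back(level_);
        sender.SendBytes(bytes);
    }

private:
    std::string name_;
    std::uint8_t level_;
};

// Test double standing in for the socket engine: it just keeps the bytes.
class RecordingSender : public DataSender {
public:
    void SendBytes(const std::vector<std::uint8_t>& bytes) override {
        sent = bytes;
    }
    std::vector<std::uint8_t> sent;
};
```

Adding a field to PlayerRecord changes only SendData; any DataSender keeps receiving an opaque byte sequence.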

"OOP is about requests and responses"

You are confusing OOP and procedural programming. OOP is, as the article you quoted describes, about passing messages and not about requesting anything (Java allows both and you as a designer have to know what you are doing). I am sure some OO patterns make some use of the stack, but if you get down to the basics of OO, asynchronous messaging can be easily applied. Moreover, the concept of encapsulation fits nicely with the idea of concurrent running processes - each with its own inputs and outputs and its own memory. But please don't get me wrong, I'm not implying Java can be a multi-core language using RMI, I'm simply implying that OO concepts may come in handy when designing distributed systems.

I'm out of time so I'll have to reply to the last part of your last comment at a later time.

By anonymous user, # 31. March 2008, 17:57:31

Now you have a helper class, a data set class and a socket engine class which uses them both. Let's say a new data item is introduced into the data set:
1. The data set class has to change to include the new data.
2. The helper class needs to change because the new data will probably require some attention and manipulation.
3. The socket engine (!!!) needs to change because that class sends the data by adding each item to the transmission.

If this is close to the situation you are referring to then you cannot claim it is decoupled.


The data set class IS the helper class. So your scenario doesn't make much sense. Unless you mean that the data set class is simply a data structure, then yeah, ok. There is data and then there is the helper class.

First off, #1 would apply in OOP too.
#2 would apply as well in OOP. Only difference is that the data is separate from the object. But is this that far removed from interfaces? I don't think it is.

And #3 is wrong. The whole point of using helper classes is so that the main system doesn't need to change at all. Or if it does, it's to add extra functionality. For extra data, there is clearly no need for the socket engine to change.

As an example, I have a buffer helper class. You can insert strings and extract strings as well as integers and other data types. What it does is correctly format the buffer data so that it's ready for transmission. It can also receive data (by giving it raw data) and you can extract strings, numbers and other data from it. Other helper classes will use this helper class to extract other information it needs if necessary.

What this means is that the buffer helper class is 100% independent. You can use it no matter where the data comes from. You can even insert extra data into the feed if you need to. This is often done on Windows to insert keystrokes that are converted from mouse clicks. In any case, the socket engine only deals with raw data, as it should. And anything that deals with formatted data uses the helper class to convert to and from raw data. The helper class acts as a library (or a tool) for other objects. While there is some dependency there, it's far better than the alternative. I've reused this code many times over already without any changes to the core system or to the helper class, and it is extremely easy to maintain. In fact, I don't remember the last time I changed it.
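As a rough illustration of such a buffer helper (not the author's actual code), here is a sketch that formats strings and integers into raw bytes and extracts them again. The class name WireBuffer and the wire format (4-byte big-endian integers, length-prefixed strings) are assumptions. It calls no other object; callers hand it raw data and pull typed values out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: converts between typed values and raw bytes.
class WireBuffer {
public:
    // Append a 32-bit integer, most significant byte first.
    void PutInt(std::uint32_t v) {
        for (int shift = 24; shift >= 0; shift -= 8)
            bytes_.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
    }

    // Append a string as a 4-byte length followed by its bytes.
    void PutString(const std::string& s) {
        PutInt(static_cast<std::uint32_t>(s.size()));
        bytes_.insert(bytes_.end(), s.begin(), s.end());
    }

    // Feed in raw data received from anywhere (socket, file, test).
    void AddRaw(const std::vector<std::uint8_t>& raw) {
        bytes_.insert(bytes_.end(), raw.begin(), raw.end());
    }

    // Raw bytes ready for transmission.
    const std::vector<std::uint8_t>& Raw() const { return bytes_; }

    // Extract the next integer; returns false if fewer than 4 bytes remain.
    bool GetInt(std::uint32_t& out) {
        if (bytes_.size() - read_pos_ < 4) return false;
        out = 0;
        for (int i = 0; i < 4; ++i) out = (out << 8) | bytes_[read_pos_++];
        return true;
    }

    // Extract the next string; leaves the read position untouched on failure.
    bool GetString(std::string& out) {
        const std::size_t saved = read_pos_;
        std::uint32_t len = 0;
        if (!GetInt(len) || bytes_.size() - read_pos_ < len) {
            read_pos_ = saved;
            return false;
        }
        out.assign(bytes_.begin() + read_pos_, bytes_.begin() + read_pos_ + len);
        read_pos_ += len;
        return true;
    }

protected:
    std::vector<std::uint8_t> bytes_;
    std::size_t read_pos_ = 0;
};
```

A socket engine in this arrangement would only ever see the result of Raw() or pass bytes to AddRaw(), never the typed values.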

I have numerous examples like this. For example, I have a game engine where you can swap not only the game rules, but the game data. So I can reuse the same object for different tables. Why would I not combine them? Because I want to make sure that the helper class is 100% independent. Different parts of the system need different data from a game table. Having a helper class that each one can use makes perfect sense because it requires that this helper class NOT use any outside object or data. IOW, other objects can use it, but it cannot use other objects. It forces me to be 100% coupling free. With game engines, it makes it easier to have different systems on different machines. If I had used a complete object with data and functionality in one, then there's just no way that I could decouple the dependencies in order to move the code to another machine. The helper class makes it easy because it only cares about data, and data is easy to send over the wire. Along with the buffer helper class in the socket engine, creating a link between the two machines is a snap. I can mix and match whatever I need. See, I make a distinction between user classes and helper classes. One acts as functionality toward your goal (IOW it is unique), and the other acts as libraries.

There's an old analogy about OOP. Not everything can be asked to do something. For example, paper is something that you act upon. You don't ask the paper to write on itself. Same thing with a pencil. You don't ask it to do anything. Instead, you use it. In the software world, the human would be the object and then you would have a helper object that would handle the data (pencil and paper) for you. That's how I model my software.

BTW, your OOP version is quite scary.

void SendBytes(byte[] bytes)


Seriously? C'mon! Apologies for my tone, but this is far removed from good OOP. No disrespect intended.

as the article you quoted describes, about passing messages and not about requesting anything


What? It specifically says that OOP is about requesting an object to perform an action.

I am sure some OO patterns make some use of the stack, but if you get down to the basics of OO, asynchronous messaging can be easily applied.


No. OO and asynchronous messaging don't go hand in hand. This is the entire reason why OOP doesn't translate well to distributed computing.

I'm not implying Java can be a multi-core language using RMI, I'm simply implying that OO concepts may come in handy when designing distributed systems.


The part about local data, I have no problems with. And I do take a lot from OOP. I just don't like a lot of it like private members for derived classes, or message passing that is coupled with execution or the single return path or the request/response methodology, etc...

By Vorlath, # 1. April 2008, 06:44:17

Yoni Rapoport writes:

I'm sorry if I'm not making sense, I'm simply trying to understand your methodology.

Let's take your buffer helper class for example. Is this something like this?

class BufferHelper
{
    public static void WriteString(byte[] buffer, string s){}
    public static void WriteInt(byte[] buffer, int i){}
    //....
}

or rather something like this:

class BufferHelper
{
    byte[] _buffer = new byte[0];
    public void WriteString(string s){}
    public void WriteInt(int i){}
    //....
}

Of course the second class is OO while the first is procedural. The OO solution will eventually allow switching the byte array to other internal representations without breaking external code, the procedural solution won't. This is what happens when you separate data and logic. Using the first class is also error prone because the buffer may change externally and be corrupted.

BTW, can you elaborate on why "void SendBytes(byte[] bytes)" is not good OOP?

It's also interesting that you divide your classes into user class and helper classes. Are your user classes encapsulated? What is the difference in terms of OOP, message passing and data flow?

Regarding the issue of whether OO is based on request/response or not I think we'll have to agree to disagree. It is true that Java and other OO languages are stack based and allow return values but this has more to do with enabling procedural programming than with OO. OO principles such as "Tell don't Ask" and "Interface segregation" advocate strictly one-way communication which is why I think asynchronous messages fit well. I can hardly remember when I last wrote a method with a return value or any code which relies on the stack for that matter...

By anonymous user, # 1. April 2008, 22:09:37

The OO solution will eventually allow switching the byte array to other internal representations without breaking external code, the procedural solution won't.


If you do this, then you're no better off than the procedural method because you still have to rewrite all the code that deals with the internal data structures. To think that encapsulation somehow resolves this is completely unfounded.

About how I would define a buffer helper class, I'll give you a real life example.


// Setup a callback so that when a line comes in it can be processed.
// data is anything you want to pass to it, sb is this class.

class SockBuffer;

typedef void (*SBCallback)(void *data, SockBuffer *sb);

// Handles reading and writing to a socket.
// Also retrieves lines.
class SockBuffer
{
protected:
    SOCKET sock;
    int epoll_fd;            // The epoll file descriptor.
    char *buffer;            // input buffer
    int buffer_length;       // # bytes used.
    int buffer_size;         // # of bytes allocated.
    char *send_buffer;
    int send_buffer_length;
    int send_buffer_size;
    SBCallback callback;     // This gets called when data is available.
    int flags;
    void *data;
    list<char*> *lines;      // queue of lines
    int LockCount;
    void Setup();
public:
    SockBuffer();
    ~SockBuffer();
    void SetSocket(SOCKET sock, int epoll_fd);
    void SetCallback(SBCallback callback, void *data);
    void CheckReadBufferSize(int len);
    void CheckWriteBufferSize(int len);
    virtual void AddReadBufferData(char *data, int len);
    virtual void AddWriteBufferData(char *data, int len);
    virtual bool ReadyToSend();
    virtual bool ReadyToRead();
    bool IsLineAvailable();              // Checks if a line is available.
    char *GetLine();                     // Returns line without the newline.
    virtual void SendText(char *text);   // Sends data.
    virtual void Flush();                // Tries to send remaining data.
    void SetFlags();
    void AllowReadWrite();
    void AllowReadsOnly();
    void AllowWritesOnly();
    void DisableWrites();
    void EnableWrites();
    void DisableReads();
    void EnableReads();
    void DisableReadWrite();
    inline SOCKET GetSocket() {return sock;};
    inline void *GetData() {return data;};
    bool IsLocked();
    void Lock();
    void Unlock();
    int user_data;           // use it for whatever you want.
};

enum SSLError {ssl_ok, ssl_write_want_read, ssl_write_want_write,
               ssl_read_want_read, ssl_read_want_write, ssl_accept_want_read,
               ssl_accept_want_write };

// SSL version of SockBuffer
class SSLBuffer : public SockBuffer
{
protected:
    SSLError ssl_error;
    int write_length;
    char *read_cache;
    SSL *ssl;
    int SendWriteBuffer();
    int ReadInputBuffer();
public:
    SSLBuffer();
    ~SSLBuffer();
    void SetSSL(SSL *ssl);
    bool Accept();
    bool ReadyToSend();
    bool ReadyToRead();
    void AddReadBufferData(char *data, int len);
    void AddWriteBufferData(char *data, int len);
    inline SSL* GetSSL() {return ssl;};
    void Flush();            // Tries to send remaining data.
};


See how I subclassed the original helper class? I can do that because the data is protected. If it wasn't protected, I couldn't do it. All that functionality wasn't in there at first. I added it as I developed the software. But this helper object is used specifically for strings and sockets on Linux. The Windows version is identical except it doesn't use epoll.

This is a helper class as it should be. It does NOT call any other object. It is meant to be used. However, it NEVER uses anything else. This means it's completely reusable for all my socket needs on Linux or Windows. That's why I can add functionality and know with 100% certainty that it won't break anything (because it doesn't use any other objects). And about the callback, it is used to parse commands as they come in and put them in the command queue.

This object can be used with any data. Once you're done with one socket and buffer, you can empty it and use it with different data if you wish. This is often useful when you need to clone the object in order to send the same data to multiple locations. Note that it doesn't clone the internals. You do NOT use constructors with this. They are manually set up though you could use constructors if you wish, but that is contrary to the notion of helper classes which should take any data at any time. You'd just end up replicating functionality that's already there.

edit: I should add that this helper class makes it possible to break through the 10,000 socket barrier (with epoll). I can actually have well over 100,000 sockets if there's enough memory and bandwidth.

It is true that Java and other OO languages are stack based and allow return values but this has more to do with enabling procedural programming than with OO.


Unfortunately, that is incorrect. It has to do with the fact that messages in OOP are done via the execution point. So what goes in must come out. Even if there were no return values, you'd still have the same problems with the execution point and with the stack.

I can hardly remember when I last wrote a method with a return value or any code which relies on the stack for that matter...


If you use OOP messages, then you use the stack. It's not something you have a choice in.

By Vorlath, # 2. April 2008, 04:15:47

Oh, forgot your question about SendBytes(byte[] bytes). It's that byte[] is raw data. You're exposing unformatted entities. Not very OOP. That's nice for helper classes and internal methods, but not for general OOP usage (what I call user classes).

I should add one more thing that's been bugging me. This is concerning your comment about changing the internal data structure. I found out a long time ago that you should never do this. What I have are helper classes. So if I need to change the internals, then it means that the socket system has also changed the data it is using. So what I do is write another helper class for that data type leaving the original intact so that other parts of the system aren't broken. The SSL helper class is an example of this happening.

I was using the regular helper class for remote users (live humans) and for the remote database. Then I added SSL functionality. I only changed the users socket handling by replacing the helper class with the SSL one. The one for the remote database stayed the same because it doesn't need to be encrypted as the database should be on the same internal network not accessible by anything else.

That way, you can grow systems from more primitive parts. You can remove and replace parts with ease as well as adding new ones. It's a maintenance paradise.

By Vorlath, # 2. April 2008, 04:18:28

"Messages in Java are function calls. They are not asynchronous. Recursive functions calls will cause a stack overflow or you will run out of memory."

Yes, this can happen in any language that allows recursion, even C++. So what about an operating system like AmigaOS or QNX? They have messages and functions. I'm so confused ...

"I should add one more thing that's been bugging me. This is concerning your comment about changing the internal data structure. I found out a long time ago that you should never do this. What I have are helper classes. So if I need to change the internals, then it means that the socket system has also changed the data it is using. So what I do is write another helper class for that data type leaving the original intact so that other parts of the system aren't broken. The SSL helper class is an example of this happening."

That way lies madness, or Windows, take your pick.

By spc476, # 2. April 2008, 20:34:54

Yes, this can happen in any language that allows recursion, even C++. So what about an operating system like AmigaOS or QNX? They have messages and functions. I'm so confused ...


This is what I've been saying all along. Messages like you find in AmigaOS or Windows are NOT like messages in OOP languages. That's the whole point I've been trying to make. One does not translate to the other. But the article I linked in the original post tries to say otherwise.

Also, about OS functions, if you add a callback that calls an OS function that then triggers the said callback, you're gonna be in a world of hurt after you initially call that OS function.

That way lies madness, or Windows, take your pick.


Untrue. It took Windows this long to realise what it was doing wrong. The D3D interfaces took years (over a decade) to get right. Version 9 of D3D is built exactly like I describe in my previous comments. I was surprised at how well it was put together. Vertex lists are probably the most visible use of what I'm talking about. You can configure your lists any way you want. You then tell the interface what the layout is and it'll take care of it for you during the rendering stage. Or you can use vertex shaders and again use it like you want. You can even have multiple vertex lists. Another area that is like this is the fixed function pipeline. You have an interface that configures each one. It's a little on the procedural side since you have to give it an index of the stage you want to configure, but it works exactly like a helper class.

Code is MUCH cleaner when done this way. Like I said, it's a maintenance paradise. I even created my own helper class to handle RTT textures. I can build on top of what's there. The software can grow. That's the missing element with OOP. OOP cannot grow. Not for large software anyhow.

By Vorlath, # 2. April 2008, 21:45:14

Yoni Rapoport writes:

First of all, your SockBuffer class is very much encapsulated! Most of its internal data is hidden from external code and most communication with it is done by passing messages with little data (e.g. "AllowReadWrite") or by asking for very little information (e.g. "ReadyToSend").

I do have a couple of problems with this class:

1. "void SetSocket(SOCKET sock, int epoll_fd);" - why isn't this data passed in the constructor? Also, I'm not sure what SOCKET stands for, but it is both passed into the object and can later be retrieved - which is a situation which should be avoided. If SockBuffer uses SOCKET then maybe it should encapsulate all usage including that which happens when other code fetches it using "GetSocket". I really think this can help and make SockBuffer easier to switch with a different implementation if needed later on.

2. "void SetCallback(SBCallback callback, void *data);" - same question: why not add this to the constructor?

3. "GetLine()" and "GetData()" - it would be interesting to see the code which uses these methods. I think a "push" mechanism (by sending in an object which has "AddLine" and "AddData") will result in less code and less duplication than this "pull" mechanism.

4. Inheritance - I strongly oppose implementation inheritance (inheriting from non abstract classes). This introduces the notorious fragile base class problem and eventually makes the base class unmaintainable. When data members are marked as protected the base class no longer controls all access to them and cannot predict what values they might hold at any given point in time - this adds a lot of stress to the process of maintaining the base class. In your example, I would recommend adding an abstract base class to hold the public methods of both SockBuffer and SSLBuffer (so they remain inter-exchangeable), and then use composition by having SSLBuffer create an object of type SockBuffer and use it.

5. Class size - it would be interesting to see just how long SockBuffer is in terms of code lines. It seems to me that much of its implementation, such as the locking mechanism or the effect of the "flags", can be delegated to other objects (constructed internally).
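The composition suggested in point 4 might look roughly like the sketch below. All names here (LineBuffer, PlainBuffer, ScrambledBuffer) are hypothetical, and the XOR step is only a stand-in for whatever transformation the wrapping class applies; it is not SSL and offers no security.

```cpp
#include <string>

// Hypothetical abstract base holding the shared public methods.
class LineBuffer {
public:
    virtual ~LineBuffer() = default;
    virtual void SendText(const std::string& text) = 0;
    virtual std::string Pending() const = 0;
};

// Plain implementation: queues text unchanged.
class PlainBuffer : public LineBuffer {
public:
    void SendText(const std::string& text) override { out_ += text; }
    std::string Pending() const override { return out_; }

private:
    std::string out_;
};

// Wrapping implementation: owns a PlainBuffer instead of inheriting from it.
class ScrambledBuffer : public LineBuffer {
public:
    explicit ScrambledBuffer(char key) : key_(key) {}

    // Transform the text (XOR stand-in), then delegate to the inner buffer.
    void SendText(const std::string& text) override {
        std::string scrambled = text;
        for (char& c : scrambled) c = static_cast<char>(c ^ key_);
        inner_.SendText(scrambled);
    }

    std::string Pending() const override { return inner_.Pending(); }

private:
    char key_;
    PlainBuffer inner_;
};
```

Code written against LineBuffer can take either implementation. The cost, which the reply to this comment raises, is that ScrambledBuffer only reaches PlainBuffer through its public methods.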


Note, however, that in general, this class uses methods (message passing) much more than data flow. There are very few getters and setters and it even uses a callback which means it calls an externally supplied object. So all in all it is very much OO. I think that although you refer to it as a helper class, it is actually a part of the socket engine - do you see any usage for this class outside the context of a socket engine?

You should design your "user classes" in the exact same manner. OO is OO and classes are classes; it doesn't matter what their implementation is like. The same benefits encapsulation and polymorphism bring to your helper classes can help your user classes as well.

By anonymous user, # 4. April 2008, 20:08:29

"void SetSocket(SOCKET sock, int epoll_fd);" - why isn't this data passed in the constructor?


Because it's a helper class. You need this as standard functionality. Helper classes MUST work with any data (that is specific to the helper class). So you need to be able to set the socket information at any time. You can add it to the constructor if you wish as I've said earlier. But there's no real need just to save one line of code.

For example, if you have data you want to send out before you've completed the connection.

Why would I want to change the socket instead of creating a new object? Usually you don't. However, that's not what it's about. It's about being able to mix and match data and functionality. If there was no access to this data, then you're limited in what's possible. There is no way that this class can know every possible usage as far as sockets are concerned. No one can predict the future. So you need to make this stuff accessible so that it is extensible later on. Also, sometimes helper classes are used one time per data item. Allocating and deleting objects every time would be asinine, especially in a socket handler's loop.

Also, I'm not sure what SOCKET stands for, but it is both passed into the object and can later be retrieved - which is a situation which should be avoided. If SockBuffer uses SOCKET then maybe it should encapsulate all usage including that which happens when other code fetches it using "GetSocket". I really think this can help and make SockBuffer easier to switch with a different implementation if needed later on.


SOCKET is an OS type on Windows and I've defined it as an int on Linux (again used by the OS). It's like a file ID, but for sockets.

Setting and retrieving socket is critical for the helper class to work. Without it, you couldn't expand on what's possible externally to the helper class. For example, a socket engine needs access to the socket ID so that it can use OS socket calls. That's beyond the purpose of the helper class and that's why it's not included there.

Your last sentence doesn't make sense. Helper classes aren't meant to switch implementations. They are meant for a specific set of functionality and data. If you want to change the implementation, build a new helper class and don't break the old code.

"void SetCallback(SBCallback callback, void *data);" - same question: why not add this to the constructor?


Because different functionality can use the same helper class (hence different parts would switch to their own callback and then restore the original when done or passed along). Again, it's not about doing this ALL the time. It's about leaving it accessible.

3. "GetLine()" and "GetData()" - it would be interesting to see the code which uses these methods. I think a "push" mechanism (by sending in an object which has "AddLine" and "AddData") will result in less code and less duplication than this "pull" mechanism.


No, it's bad programming practice to use reverse invocation. The helper class is NEVER to use another object. Otherwise, it introduces coupling. Exactly what I don't want. What you're talking about cannot be sustained in large programs because you never know who's responsible for what.

The code that uses GetLine and GetData is obvious. GetData is used by anything that isn't socket related, like logging output to a file instead of sending it out on a socket. Again, this is about flexibility and not changing the helper class. It's about growing software. Not about changing what's already there. BTW, GetLine is used by everything from the game engine to the command engine to the chat engine.

4. Inheritance - I strongly oppose implementation inheritance (inheriting from non abstract classes). This introduces the notorious fragile base class problem and eventually makes the base class unmaintainable.


The problem with this logic is that you're thinking about changing the base class in the first place. When you grow code, you don't do this. You leave the original intact. You only add on top. This changing code notion is very strange. Where did you get such an idea anyhow? I've never seen that work on large software, ever.

When data members are marked as protected the base class no longer controls all access to them and cannot predict what values they might hold at any given point in time - this adds a lot of stress to the process of maintaining the base class.


Why are you maintaining the base class in the first place?

In your example, I would recommend adding an abstract base class to hold the public methods of both SockBuffer and SSLBuffer (so they remain inter-exchangeable), and then use composition by having SSLBuffer create an object of type SockBuffer and use it.


The problem with your scenario is that SSLBuffer wouldn't have access to the internals of SockBuffer if I use encapsulation (and don't use getters and setters). That's why SSLBuffer is derived from SockBuffer. SSL code needs access to the socket and to the buffers directly. Besides, it's redundant to just patch through most of the functions when only a few of them are changed.

5. Class size - it would be interesting to see just how long SockBuffer is in terms of code lines. It seems to me that much of its implementation, such as the locking mechanism or the effect of the "flags", can be delegated to other objects (constructed internally).


It's 850 lines. Half of it having to do with SSL handling. If you've ever done SSL, then you know how many different error conditions there can be.

About delegation, you're missing the point about helper classes. They should not use other objects. Well, they can if you're writing a compound helper object, but that's another story.

Actually, the locking has to do with the buffer and socket handling. It's not about multithreading. When it's locked, the data is not sent or retrieved from sockets. This is necessary when the helper class is used inside the socket engine in order to avoid the same socket triggering its own OS events over and over resulting in the starvation of the other sockets. It's a common problem when dealing with over 10,000 sockets. You could claim it's bad naming, but I called it that because it's like locking a gate.

Note, however, that in general, this class uses methods (message passing) much more than data flow. There are very few getters and setters and it even uses a callback which means it calls an externally supplied object. So all in all it is very much OO.


About this using methods, I AM using C++. There's not much choice in the matter. But it is data flow in the sense that raw data (whether text or just SSL data) is being used and passed along. If anything this is a perfect example of how a TCP/IP data channel would be implemented in a data flow environment. This helper class would make that possible.

About the callback, you'll note that it's not dependent on any particular object. That's what the "data" field is for.

sh->SetCallback(myCallback,this);

myCallback would then call a specific function within the "this" object. I'm still very reluctant to put in callbacks. I would agree that callbacks are bad practice for helper classes. But they served their purpose for sending data to the proper system. IOW, you should never attach a callback from another system. If this helper class is used in the socket system, then the callback should also be in the socket system.
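The callback-plus-data pattern can be sketched as follows. MiniBuffer is a cut-down, hypothetical stand-in for SockBuffer, and CommandQueue, OnLine, Attach and AddLine are invented names; only the callback signature and the SetCallback(callback, data) shape follow the class shown earlier. The static function casts the opaque pointer back to its object and forwards the call.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

class MiniBuffer;  // hypothetical stand-in for SockBuffer

typedef void (*MiniCallback)(void* data, MiniBuffer* mb);

class MiniBuffer {
public:
    void SetCallback(MiniCallback cb, void* data) {
        callback_ = cb;
        data_ = data;
    }

    // Store a completed line, then notify whoever registered a callback.
    void AddLine(const std::string& line) {
        lines_.push_back(line);
        if (callback_) callback_(data_, this);
    }

    bool IsLineAvailable() const { return !lines_.empty(); }

    // Precondition: IsLineAvailable() is true.
    std::string GetLine() {
        std::string line = lines_.front();
        lines_.pop_front();
        return line;
    }

protected:
    MiniCallback callback_ = nullptr;
    void* data_ = nullptr;
    std::deque<std::string> lines_;
};

// Hypothetical receiver that collects incoming lines as commands.
class CommandQueue {
public:
    // Plain function the buffer can call; `data` is the CommandQueue itself.
    static void OnLine(void* data, MiniBuffer* mb) {
        static_cast<CommandQueue*>(data)->Pull(mb);
    }

    void Attach(MiniBuffer& mb) { mb.SetCallback(&CommandQueue::OnLine, this); }

    std::size_t Count() const { return commands_.size(); }
    const std::string& At(std::size_t i) const { return commands_[i]; }

protected:
    // Move every available line into the command list.
    void Pull(MiniBuffer* mb) {
        while (mb->IsLineAvailable()) commands_.push_back(mb->GetLine());
    }

    std::vector<std::string> commands_;
};
```

The buffer never learns the type behind the void pointer, which is what keeps it independent of any particular receiver.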

Is it OO? I really like helper objects, but there's no encapsulation, there are getters and setters, there is no copy constructor, it uses a lot of raw data and never calls another object. It's not what most people have in mind when you talk about OOP.

I think that although you refer to it as a helper class, it is actually a part of the socket engine - do you see any usage for this class outside the context of a socket engine?


Yeah, the database part of the software uses it. I used to also have a separate chat engine (external software) that used it, but I've since merged it into the core server. The database is used through TCP/IP. There's a generic front end that receives generic database commands and then translates that to specific database calls. This is the biggest reason for the callbacks. When a response arrives for a query, the proper code is notified of it.

Also, like I said before, the game engine, command engine and chat use these helper classes. The client also uses it for chat and inserting moves when you play. Note too that the client is on Windows and the server is on Linux. There is no epoll on Windows, so that part is different and rightly so. Helper classes are meant to be for specific uses.

You should design your "user classes" in the exact same manner. OO is OO and classes are classes it doesn't matter what their implementation is like. The same benefits encapsulation and polymorphism bring to your helper classes can help your user classes as well.


You can't design everything in the same manner. It simply doesn't work. That's the way my software (where the above code comes from) was before. Though I was using encapsulation and passing objects around and asking objects to perform actions just like you mentioned, it was a pure nightmare. Maintenance was becoming increasingly difficult. Any little change would affect everything else because of encapsulation. Any time a part of the system needed extra info, you'd have to change the object in charge of it to handle the extra functionality, but that would require changes in other parts and it never ended. Encapsulation flat out just doesn't work.

What I do now is specify exactly who is responsible for what. The above helper classes are responsible for socket and buffer interaction. Only that specific interaction. Every time there is an interaction, I create a helper class. Every time there is ONE single kind of functionality (such as the socket engine), then I create an object (or several objects) where there is a main object and all others can only be used by this main object. In any case, there is a clear hierarchy. If object A can call object B, object B cannot ever call object A or anything that can call object A. And under NO circumstance do you EVER pass around an object from one system to another.

Systems communicate via data ONLY. That allows me to mix and match systems. If I need a socket engine, I plug it in. If I need a database engine, I plug it in. If I need to create a new system, I can write it and plug it in. Same thing for chat engine, physics engine, game engine, etc.

Then comes the main loop. This object is responsible for making sure that each system gets its fair share of execution time. So it alternates between systems. That's why it's important to only send data between systems. Otherwise, you will change a state and corrupt the system or worse yet, create a recursive feedback loop that will slow or crash the entire thing.
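To make the "data only" rule and the main loop a bit more concrete, here is a minimal sketch. Every name in it (Message, System, MainLoop, ChatSystem, LogSystem) is made up for illustration and is not taken from the actual server code. It assumes each system has an inbox of plain data and that only the main loop moves data between systems, giving each one a bounded turn.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the author's actual code.
// A message is plain data: no object pointers cross a system boundary.
struct Message {
    std::string target;   // name of the system that should receive it
    std::string payload;  // opaque data, interpreted only by the receiver
};

class System {
public:
    explicit System(std::string name) : name_(std::move(name)) {}
    virtual ~System() {}

    const std::string& name() const { return name_; }

    // Only the main loop puts data into a system's inbox.
    void deliver(const Message& m) { inbox_.push_back(m); }

    // Process at most `budget` messages, appending any replies to `out`.
    void step(std::size_t budget, std::vector<Message>& out) {
        for (std::size_t i = 0; i < budget && !inbox_.empty(); ++i) {
            Message m = inbox_.front();
            inbox_.pop_front();
            handle(m, out);
        }
    }

protected:
    virtual void handle(const Message& m, std::vector<Message>& out) = 0;

private:
    std::string name_;
    std::deque<Message> inbox_;
};

class MainLoop {
public:
    void add(System* s) { systems_.push_back(s); }
    void post(const Message& m) { pending_.push_back(m); }

    // One round: route pending data, then give every system a bounded turn.
    void round(std::size_t budget) {
        std::vector<Message> routing;
        routing.swap(pending_);
        for (const Message& m : routing)
            for (System* s : systems_)
                if (s->name() == m.target) s->deliver(m);
        for (System* s : systems_) s->step(budget, pending_);
    }

private:
    std::vector<System*> systems_;
    std::vector<Message> pending_;
};

class ChatSystem : public System {
public:
    ChatSystem() : System("chat") {}
protected:
    void handle(const Message& m, std::vector<Message>& out) override {
        // Reply with data addressed to another system; never call it directly.
        out.push_back(Message{"log", "chat saw: " + m.payload});
    }
};

class LogSystem : public System {
public:
    LogSystem() : System("log") {}
    std::vector<std::string> lines;
protected:
    void handle(const Message& m, std::vector<Message>&) override {
        lines.push_back(m.payload);
    }
};

std::vector<std::string> run_demo() {
    ChatSystem chat;
    LogSystem logger;
    MainLoop loop;
    loop.add(&chat);
    loop.add(&logger);
    loop.post(Message{"chat", "hello"});
    loop.round(4);  // chat handles "hello" and emits data for the log system
    loop.round(4);  // the log system receives that data and records it
    assert(logger.lines.size() == 1);
    return logger.lines;
}
```

Because a system only ever appends data to an outgoing list, nothing it does can re-enter another system in the middle of that system's own work. The cost is a round of latency per hop.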

Encapsulation is a no show in large systems. Use encapsulation at your own peril. The things I talk about are impossible if you use it.

Polymorphism is useful in a lot of cases. Mostly, it's useful for crossover functionality. Where you have a subset of functionality that is shared amongst different objects. For example, if you want to put different objects in the same list, then these objects would share a common interface. I use this in my GUI engine for Project V. All objects have a position and size. No point in making this abstract. It's always gonna be there.

You have to be careful how you think of OOP. Your suggestion about the AddLine in an external object is one such case. One is supposed to ask an object to accomplish some task, yet you suggest that you pass it another object that will actually do the task. This other object is usually the one doing the request in the first place. So what you end up with is a serious risk of recursion if you need yet another line in order to complete the task. This is exactly why helper classes don't call other objects. Even the callback, I'm not too fond of.

Hope this clears things up. I do agree that the SockBuffer class LOOKS like OO in parts. But that's because this is what OOP strives to accomplish when it talks about less coupling and reusable objects. I just think the other stuff gets in the way of these objectives and when you get to more complex tasks, it's impossible to avoid coupling with that kind of methodology. Then you get the big ball of mud. The way I do things, I can avoid the big ball of mud.

By Vorlath, # 6. April 2008, 05:20:26

Yoni Rapoport writes:

"The helper class is NEVER to use another object. Otherwise, it introduces coupling. Exactly what I don't want."

I agree that every method call introduces coupling between caller and callee (they both have to change when the method signature changes). The thing is that this form of coupling is usually easy to maintain because it is explicit and checked at compile-time. The other form of coupling which causes most bugs in software is that which happens when code in two different locations relies on certain data to be formatted in a certain way or contain certain data items at a certain time. This form of coupling is implicit, it is not checked at compile-time and eventually leads to code which is difficult to maintain. In your case such coupling may occur when the 'data' member of SockBuffer is fully exposed to both its subclasses and to code which uses SockBuffer.

If method calls are something you regard as problematic then OO is probably not for you. I'm not sure what you mean by the "risk of recursion".

When I mentioned that your class is very much OO I was mostly referring to most of its members (i.e. epoll_fd, buffer, buffer_length, buffer_size, etc.). If only you marked these members as private you would end up with the precious ability to change most of SockBuffer's internal structure without impacting external code.

"Yeah, the database part of the software uses it. I used to also have a separate chat engine (external software) that used it, but I've since merged it into the core server."

I don't see how the database and chat part can know what SOCKET or epoll_fd is. It seems that the only system able to fully construct and prepare the object is the socket engine.

"The problem with this logic is that you're thinking about changing the base class in the first place. When you grow code, you don't do this. You leave the original intact. You only add on top. This changing code notion is very strange. Where did you get such an idea anyhow? I've never seen that work on large software, ever."

This may very well be the core of our disagreement. For starters, I would like to refer you to an excellent book called "Refactoring" by Martin Fowler. This book, among others, advocated constantly changing code and internal structure of classes to improve design and facilitate making unanticipated changes. Code has to be able to change because requirements and features change. Every line of code I write is correct when I write it but may become obsolete or even wrong a day or a year later. That's why every line of code I write needs to be able to change safely and easily at a later time. Only adding on top is what gets you to the million-code-line mark way too soon. You end up with large amounts of legacy code which you can't change or remove, even though much of it may be underused and buggy.

I think you are confusing extensibility with maintainability. Code which is completely extensible (or rather hackable) but cannot change (due to protected members for example) simply does not scale well to large applications. Other common methodologies, such as Agile Development and Test Driven Development, advocate making constant changes to your code to accommodate change, instead of allowing unlimited extensibility beforehand (YAGNI - you ain't gonna need it).

So you see, I think the entire discussion is pointless if we see maintainability, and thus quality of software design, in a different way.

By anonymous user, # 10. April 2008, 21:26:31

I agree that every method call introduces coupling between caller and callee (they both have to change when the method signature changes). The thing is that this form of coupling is usually easy to maintain because it is explicit and checked at compile-time.


It's not easy to maintain because the coupling between the caller and the callee is more than just that. It means that the caller's object cannot exist without the callee's object. This means that you must maintain two objects and their interdependencies while a helper object has no external dependencies and less complexity.

The other form of coupling which causes most bugs in software is that which happens when code in two different locations relies on certain data to be formatted in a certain way or contain certain data items at a certain time.


Like HTML? Like databases? Like images? Like movie files?

What you talk about is the exact reason helper classes exist.

This form of coupling is implicit, it is not checked at compile-time and eventually leads to code which is difficult to maintain.


You state no reasons why this would be so, and it is contrary to my experience.

In your case such coupling may occur when the 'data' member of SockBuffer is fully exposed to both its subclasses and to code which uses SockBuffer.


No! Again, you state no reason why this is so. My SSLBuffer class uses it and no coupling as you speak of occurs. Note that there is coupling. But short of rewriting the entire class, there is no way to achieve the functionality required. That's the problem with encapsulation. So no matter what option you take, there'll be coupling. I simply prefer the more maintainable of the options which is using helper classes.

If method calls are something you regard as problematic then OO is probably not for you. I'm not sure what you mean by the "risk of recursion".


I wrote the following off the top of my head.


#include <cstddef>  // NULL

class UIElement;

// Callback invoked after an element's bounds change; it receives the old bounds.
typedef void (*UIBoundsEvent)(UIElement *source, int oldx, int oldy,
                              int oldwidth, int oldheight);

class UIElement
{
protected:
    int x, y;
    int width, height;
    UIBoundsEvent boundsEvent;
public:
    UIElement() : x(0), y(0), width(0), height(0), boundsEvent(NULL) {}
    virtual ~UIElement() {}

    virtual void SetBounds(int x, int y, int width, int height)
    {
        // Remember the old bounds so the callback can see them.
        int oldx = this->x;
        int oldy = this->y;
        int oldwidth = this->width;
        int oldheight = this->height;

        this->x = x;
        this->y = y;
        this->width = width;
        this->height = height;

        // The callback runs while SetBounds() is still on the stack.
        if (boundsEvent) boundsEvent(this, oldx, oldy, oldwidth, oldheight);
    }

    // UIBoundsEvent is already a pointer type, so it is taken by value.
    virtual void SetBoundsEvent(UIBoundsEvent event)
    {
        boundsEvent = event;
    }
};

// Tries to undo every change, which re-enters SetBounds() and fires the event again.
void MyBoundsEvent(UIElement *source, int oldx, int oldy,
                   int oldwidth, int oldheight)
{
    source->SetBounds(oldx, oldy, oldwidth, oldheight);
}

int main()
{
    UIElement *elem = new UIElement();
    elem->SetBoundsEvent(MyBoundsEvent);
    elem->SetBounds(10, 10, 100, 100);  // recurses without end
    delete elem; // This never gets executed.
    return 0;
}



Someone might write this to make sure the size and position of a GUI element never changes.

Here, you get unwanted recursion. You may think it's stupid, but it actually happens a lot. For example, I'm sure you've seen software that redraws itself WAY too much. That's unwanted recursion. Also, you could want a widget to have a minimum size. Without access to the internals, you'd have to check and reset it. The problem isn't so much that the callback is resetting values. The problem has to do with events that call methods in objects that are currently being called. Sometimes, you will call a method and don't know for sure what it, in turn, will call. So you can very easily end up with the kind of recursion shown above. I just skipped the indirect calls and made it direct so you can see clearly where the recursion comes from.

When the callback method gets called, it is being called from the SetBounds() method. So it should be illegal to then call any object that is currently waiting for a method to return.

This is why there is a common, but oft forgotten rule, that you should never do any actual work within events. You should queue it and let the main software handle it. Why? Because you don't want unwanted recursion. If you don't know about this, it's something extremely important that all OOP programmers should learn.
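Here is a rough sketch of what queuing the work might look like for the minimum-size case. The names (EventQueue, Widget, Bounds, run_demo) are invented for this example, and the depth counter is there only to show that set_bounds() is never re-entered. It assumes a single-threaded main loop that drains the queue between calls.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical sketch: names are illustrative, not from the original code.
struct Bounds { int x, y, width, height; };

class EventQueue {
public:
    void defer(std::function<void()> work) { pending_.push(std::move(work)); }

    // Run queued work from the main loop, outside any object method.
    // Work queued while draining is left for the next call, so one call
    // always terminates. Returns how many items were run.
    int drain() {
        int ran = 0;
        for (std::size_t n = pending_.size(); n > 0; --n) {
            std::function<void()> work = std::move(pending_.front());
            pending_.pop();
            work();
            ++ran;
        }
        return ran;
    }

private:
    std::queue<std::function<void()>> pending_;
};

class Widget {
public:
    using BoundsHandler = std::function<void(Widget&, Bounds old_bounds)>;

    void on_bounds_changed(BoundsHandler h) { handler_ = std::move(h); }

    void set_bounds(Bounds b) {
        ++depth_;  // instrumentation only: track nesting of set_bounds()
        if (depth_ > max_depth_) max_depth_ = depth_;
        Bounds old = bounds_;
        bounds_ = b;
        if (handler_) handler_(*this, old);
        --depth_;
    }

    Bounds bounds() const { return bounds_; }
    int max_depth() const { return max_depth_; }

private:
    Bounds bounds_{0, 0, 0, 0};
    BoundsHandler handler_;
    int depth_ = 0;
    int max_depth_ = 0;
};

struct Result { Bounds final_bounds; int max_depth; };

Result run_demo() {
    EventQueue queue;
    Widget w;
    // The handler enforces a minimum width, but only by queuing the fix.
    w.on_bounds_changed([&queue](Widget& src, Bounds) {
        Bounds now = src.bounds();
        if (now.width < 50) {
            now.width = 50;
            queue.defer([&src, now] { src.set_bounds(now); });
        }
    });
    w.set_bounds(Bounds{10, 10, 20, 100});
    while (queue.drain() > 0) {}  // the "main software" handling queued work
    assert(w.max_depth() == 1);
    return Result{w.bounds(), w.max_depth()};
}
```

The handler still reacts to the event, but the correction runs only after the original set_bounds() call has returned, so the nesting depth stays at one.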

In short, you should have a hierarchy of objects. Lower level objects should never be able to call objects that are on the same level or at a higher level. This assures that no unwanted recursion happens.

If only you marked these members as private you would end up with the precious ability to change most of SockBuffer's internal structure without impacting external code.


Actually, what you speak about is a myth. I've learned this the hard way. It simply does not work. Also, WHY would you want to change the internals? Why in the world would you do this? The reason I have helper classes is so I don't have to do this. If I want to change something, I create a new helper class and use that while not breaking the old code. Your way, you will break a lot of code. Please don't ever do what you suggest. It's extremely bad programming practice and very error prone.

I don't see how the database and chat part can know what SOCKET or epoll_fd is. It seems that the only system able to fully construct and prepare the object is the socket engine.


That's not what you asked. You asked what parts use the helper classes and I told you the answer to that question. Of course, they don't know what epoll and SOCKET are. They don't need to. But they do need to extract messages and commands from the input queue and they do so with the help of the helper classes.

The socket engine knows nothing about the final format of the data. That's not up to the socket engine. That's why there are other helper classes that use the SockBuffer class in order to create database Statements and ResultSets. Only the database system should know about this stuff.

This may very well be the core of our disagreement. For starters, I would like to refer you to an excellent book called "Refactoring" by Martin Fowler. This book, among others, advocated constantly changing code and internal structure of classes to improve design and facilitate making unanticipated changes. Code has to be able to change because requirements and features change.


This is wrong on so many levels. There's a difference between changing code and changing the internals of an object that is already in use.

Every line of code I write is correct when I write it but may become obsolete or even wrong a day or a year later. That's why every line of code I write needs to be able to change safely and easily at a later time. Only adding on top is what gets you to the million-code-line mark way too soon. You end up with large amounts of legacy code which you can't change or remove, even though much of it may be underused and buggy.


No one said to only add on top. No one said that. So your argument falls flat. What's important is the EASE of changing code. And if you use helper classes, you ensure that there is the minimum amount of coupling possible. Not so if you do things your way with using 'private' and allowing calling any object without any thought to the hierarchy of objects (because you'll get unwanted recursion and other difficult to handle dependencies). Make sure you understand your way is the buggy way, not mine. I used to program your way. I used to do everything you talk about. It does not work.

I think you are confusing extensibility with maintainability. Code which is completely extensible (or rather hackable) but cannot change (due to protected members for example) simply does not scale well to large applications.


The reason I use helper classes is because they DO scale well to large applications. It's the only programming method that does if you use an OOP language. I've explained why already. The reduced amount of coupling means that you can use these objects anywhere in your code without worrying that you'll break anything. This is a requirement when dealing with large applications. Your way, you cannot do this. There is simply no way.

So far, I've explained each of my points, yet you have not done the same. You have not provided examples or any situation that shows why I should believe you. OTOH, I have shown multiple examples and there is an abundance of real life examples that represent exactly what I'm talking about.

Other common methodologies, such as Agile Development and Test Driven Development, advocate making constant changes to your code to accommodate change, instead of allowing unlimited extensibility beforehand (YAGNI - you ain't gonna need it).


I can change my code as well and quite easily. If I want to change a data structure, I only need to change the helper class because I'm adding on top. If I need to remove or completely change data, I can write a new helper object and not worry about maintenance because I know nothing else will get broken.

YAGNI can also stand for "You Are Gonna Need It".

If you use YAGNI, be prepared for a world of hurt when you actually do need certain functionality. You need a way to add functionality at any time without worrying if your existing code will break. You can't do that your way. It's impossible on a large scale.

I'm sorry to say, but everything you mention is bad programming practice. OOP is failing across the board for a very good reason. Yet, it works fine for me. Why is that? It does take a lot of setup code because OOP is lacking in many areas, but it can work. Just not the way you mention and this is becoming an accepted truth that OOP has problems. It's no longer a point of contention. Only difference with me is that I explain why it's failing and what one should do to make it work properly.

By Vorlath, # 12. April 2008, 01:37:59

Yoni Rapoport writes:

"Like HTML? Like databases? Like images? Like movie files?"

Exactly! Every web designing tool and every web browser is coupled to the HTML specification. That is why HTML changes once in a few years (!!!). Moreover, there are many changes which simply cannot be done to HTML without "breaking" web browsers at run-time.

This is fine for universal protocols but I simply cannot have my code coupled to such complex data structures. My code has to undergo vast changes every few hours, this is the nature of large applications. Using a helper class, a class which encapsulates the handling of such data (like your SockBuffer class) is a great way to make sure that changes to the way data is structured and handled remain localized and do not break external code. So far I'm with you. It is when you exposed this data to other classes that problems arise. As you've mentioned somewhere above, once the SSLBuffer can use members of SockBuffer, SockBuffer loses its ability to change because it is coupled to SSLBuffer. It is as if they are now two web browsers that display HTML. So now, changing even a single line of code in SockBuffer may result in a state which is legal for SockBuffer but corrupted for SSLBuffer - at which time a run-time error will occur. This is true for any passage of data from one side of a communication to another, and that is why OO advocates encapsulating data and passing as little data as possible between objects.

"But short of rewriting the entire class, there is no way to achieve the functionality required."

Using delegation, composition and design patterns can help you achieve this without any duplication. You can find some examples for this in the article I referred you to ("why extends is evil") and in many other resources on design patterns and OOP. Note that the solution will probably include the passing around of objects as that is how OO software is designed.

I see what you mean by "risk of recursion" and that is indeed something to watch out for although the case you described does not adhere to the Interface Segregation Principle which is a big part of the problem. Namely, your event handler should not receive a UIElement typed object but rather a smaller (or at least different) object which will contain operations which can be performed safely within the context of the event handler.
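As a rough illustration of that idea, the handler could be handed a query-only view of the element. BoundsView and RecordBounds are names invented for this sketch. It only stops the handler from calling SetBounds() through the object it was given; it makes no claim about other ways a handler might reach back into the system.

```cpp
#include <cassert>

// Hypothetical names; a sketch of the idea, not code from this thread.
// The narrow interface an event handler is allowed to see: queries only.
class BoundsView {
public:
    virtual ~BoundsView() {}
    virtual int x() const = 0;
    virtual int y() const = 0;
    virtual int width() const = 0;
    virtual int height() const = 0;
};

typedef void (*UIBoundsEvent)(const BoundsView& source, int oldx, int oldy,
                              int oldwidth, int oldheight);

class UIElement : public BoundsView {
public:
    UIElement() : x_(0), y_(0), width_(0), height_(0), boundsEvent_(nullptr) {}

    int x() const override { return x_; }
    int y() const override { return y_; }
    int width() const override { return width_; }
    int height() const override { return height_; }

    void SetBounds(int x, int y, int width, int height) {
        int oldx = x_, oldy = y_, oldwidth = width_, oldheight = height_;
        x_ = x;
        y_ = y;
        width_ = width;
        height_ = height;
        // The handler sees the element only through the read-only view.
        if (boundsEvent_) boundsEvent_(*this, oldx, oldy, oldwidth, oldheight);
    }

    void SetBoundsEvent(UIBoundsEvent event) { boundsEvent_ = event; }

private:
    int x_, y_, width_, height_;
    UIBoundsEvent boundsEvent_;
};

struct Seen { int old_width; int new_width; };
static Seen g_seen = {-1, -1};

void RecordBounds(const BoundsView& source, int, int, int oldwidth, int) {
    g_seen.old_width = oldwidth;
    g_seen.new_width = source.width();
    // source.SetBounds(...) would not compile: BoundsView has no such member.
}

Seen run_demo() {
    UIElement elem;
    elem.SetBoundsEvent(RecordBounds);
    elem.SetBounds(10, 10, 100, 100);
    assert(g_seen.new_width == 100);
    return g_seen;
}
```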

"There's a difference between changing code and changing the internals of an object that is already in use."

If I want to change the behavior of the application I want to change classes which are already in use. If an Employee has a Salary data item and I now want to change it so that the salary is evaluated using a standard salary database according to the employee's job title, then the internal structure has to change because the object that is in use no longer reflects the desire of the users. What you are offering me is to either add on top or replace the entire class, but I'm looking for a safe way to maintain the existing class and make changes locally which will change the behavior of the application but will not result in run-time errors.
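A small sketch of the Employee case, with invented names (SalaryTable, annual_cost) and an in-memory map standing in for the salary database. It assumes callers only ever used a salary() method. Under that assumption, code such as annual_cost() stays untouched when the stored field is replaced by a lookup, although code that constructs an Employee does have to change.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the "standard salary database".
class SalaryTable {
public:
    void set(const std::string& title, int monthly) { by_title_[title] = monthly; }

    // Returns 0 for an unknown title; a real system would need a better policy.
    int lookup(const std::string& title) const {
        std::map<std::string, int>::const_iterator it = by_title_.find(title);
        return it == by_title_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, int> by_title_;
};

class Employee {
public:
    // Before the change this class might have stored an int salary directly.
    Employee(std::string title, const SalaryTable& table)
        : title_(std::move(title)), table_(&table) {}

    // Same public method as before; only the way the value is produced changed.
    int salary() const { return table_->lookup(title_); }

private:
    std::string title_;
    const SalaryTable* table_;  // not owned; must outlive the Employee
};

// External code that depends only on salary(), not on how it is stored.
int annual_cost(const Employee& e) { return 12 * e.salary(); }

int run_demo() {
    SalaryTable table;
    table.set("engineer", 5000);
    Employee e("engineer", table);
    assert(e.salary() == 5000);
    return annual_cost(e);
}
```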

"Not so if you do things your way with using 'private' and allowing calling any object without any thought to the hierarchy of objects"

Though I have never encountered the term "hierarchy of objects" in the literature, OO does have rules. An object should only call its immediate "relatives" - objects which are its members or objects which were passed to it in the currently handled message. There are other rules as well but the important thing is you shouldn't just call any object at any given time because of the reasons you stated and other reasons.

"If you use YAGNI, be prepared for a world of hurt when you actually do need certain functionality."

I do use YAGNI and TDD all the time and I never plan ahead for any specific extensibility - all I do is use OO practices, principles and design patterns. It is becoming accepted that planning ahead is not cost effective in software because more often than not you find yourself in need of some unforeseeable extensibility while at the same time making very little use of extensibility options prepared in advance.

This is why encapsulation is the key. It is no myth that you can have classes which encapsulate a certain aspect of the application in such a way that without prior notice, you will be able to make changes locally and even add new extensibility options without breaking external code. Your helper classes are a step in the right direction - that's why I think all your classes should be helper classes. They should all be helping each other while each class hides its internal structure properly and can change without changing others. The way I see it there is no 'main' block which constructs objects and uses them but rather objects are constructed, passed on, and acted upon by other objects, while every class knows as little as possible about the internal workings of other classes.

By anonymous user, # 13. April 2008, 22:13:05

"Like HTML? Like databases? Like images? Like movie files?"

Exactly! Every web designing tool and every web browser is coupled to the HTML specification. That is why HTML changes once in a few years (!!!). Moreover, there are many changes which simply cannot be done to HTML without "breaking" web browsers at run-time.


No, HTML only changes once in a few years because the browsers have to catch up. It wouldn't matter what technique you used, ANY change would break browsers.

This is fine for universal protocols but I simply cannot have my code coupled to such complex data structures. My code has to undergo vast changes every few hours, this is the nature of large applications.


Sorry, I just don't buy it. Vast changes every few hours??? Don't insult me. Also, if you think that HTML is complex, then I'm afraid I have to question the true nature of what you mean by "large applications".

It is when you exposed this data to other classes that problems arise. As you've mentioned somewhere above, once the SSLBuffer can use members of SockBuffer, SockBuffer loses its ability to change because it is coupled to SSLBuffer.


I'm using virtual functions. If the derived class doesn't call the base class' virtual functions, then I can change the implementation all I want. As to the data, it has the same problem as the exposed arguments in functions. They must always stay the same and rightly so. So your argument falls flat. Note that I have no intention of changing the base class in any way. Only adding to it. If I need to change something, I'll create a new derived class or simply create a new class entirely.

So now, changing even a single line of code in SockBuffer may result in a state which is legal for SockBuffer but corrupted for SSLBuffer - at which time a run-time error will occur.


But I NEVER change helper classes. You still haven't explained why you would do such a thing. I only add to them. And even then, I make sure that it's not something that would interfere with derived classes. If so, then I create a new class.

BTW, helper classes rarely have derived classes. I did in the above case because SSLBuffer is almost exactly the same except for the encryption. I'd rather not have duplicate code.
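For readers who haven't seen the SockBuffer code, the general shape of "same class except for one step" can be sketched like this. PlainBuffer and XorBuffer are invented names, not the real SockBuffer and SSLBuffer, and the XOR step is only a placeholder for encryption (it is not secure). The sketch assumes the base class exposes one virtual hook and keeps its data protected so a derived class can reach it.

```cpp
#include <cassert>
#include <string>

// Hypothetical base helper: accumulates outgoing bytes.
class PlainBuffer {
public:
    virtual ~PlainBuffer() {}

    // Append outgoing bytes; a derived class may transform them first.
    void write(const std::string& bytes) { data += encode(bytes); }

    const std::string& wire() const { return data; }

protected:
    // The single step a derived class is expected to override.
    virtual std::string encode(const std::string& bytes) const { return bytes; }

    std::string data;  // deliberately protected, not private
};

// Hypothetical derived helper: identical except for the encode step.
class XorBuffer : public PlainBuffer {
public:
    explicit XorBuffer(char key) : key_(key) {}

protected:
    std::string encode(const std::string& bytes) const override {
        std::string out = bytes;
        for (char& c : out) c = static_cast<char>(c ^ key_);  // placeholder, not real crypto
        return out;
    }

private:
    char key_;
};

std::string plain_wire(const std::string& text) {
    PlainBuffer buffer;
    buffer.write(text);
    return buffer.wire();
}

std::string xor_roundtrip(const std::string& text, char key) {
    XorBuffer buffer(key);
    buffer.write(text);
    std::string decoded = buffer.wire();
    for (char& c : decoded) c = static_cast<char>(c ^ key);  // undo the XOR
    assert(decoded.size() == text.size());
    return decoded;
}
```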

Using delegation, composition and design patterns can help you achieve this without any duplication.


Nice buzzwords, but my previous comments show why this does not work. Unwanted recursion, inaccessible intermediate results, lack of proper functionality in delegates to achieve composition, etc...

You can find some examples for this in the article I referred you to ("why extends is evil") and in many other resources on design patterns and OOP. Note that the solution will probably include the passing around of objects as that is how OO software is designed.


Oh man. Where do I even begin with that article?

The first part about using static names is true, but the problem lies in the language implementation and how people use the language. It has nothing to do with inheritance. In Project V, I have inheritance as well as interfaces (but for components). But the only way that inheritance could work in my environment is if you can override EVERYTHING. I can override members of internal components. In OOP, that'd be like being able to override a member that's inside A that's inside B. I'd be doing the overriding from C which is a derived class of B. Why so much flexibility? Because each derived class is a NEW class. Once you derive something, you may NOT modify the base. Once you do, the IDE will automatically create a new class (or component) for you. Anything that was using only the original base will be updated though to reflect the changes.

But programming languages don't have this flexibility. It's all static text. So that's why I use helper classes. Helper classes deal with the details. They localize the coupling to those specific areas so that it isn't propagated anywhere else in my code.

So that's that for the coupling and fragile base problems.

And guess what? On the third page of that article, what does he do? He scraps everything and builds two helper classes!!! He does what I'm doing, but the author gets there for all the wrong reasons. His previous examples were ridiculous. No experienced programmer codes like that.

And if you're thinking that he doesn't use inheritance, well I've said multiple times that you can create an entirely new class. I used inheritance because I don't change my originals, the functionality was exactly the same except for encryption and I don't want code duplication.

A framework-based system typically starts with a library of half-baked classes that don't do everything they need to do, but rather rely on a derived class to provide missing functionality. A good example in Java is the Component's paint() method, which is effectively a place holder; a derived class must provide the real version.

You can get away with this sort of thing in moderation, but an entire class framework that depends on derivation-based customization is brittle in the extreme. The base classes are too fragile. When I programmed in MFC, I had to rewrite all my applications every time Microsoft released a new version. The code would often compile, but then not work because some base-class method changed.


Using MS as an example is not going to convince me of anything. But if he wants to talk about MS getting it right, look at D3D. That framework rocks!

Everything in that article talks about encapsulation and using interfaces to solve the problem of modifying your code later on. I used to do everything this guy talks about. It doesn't work. It's silly to even think that it does if you've ever tried it for large applications.

My solution has many benefits that aren't even addressed in his article. For example, when he talks about frameworks, the reason they seldom work is not because of fragile base classes. It's because of the possibility of unwanted recursion and intermediate state changes that corrupt the system. His "solution" offers nothing in this regard. Mine does.

Back to your comments...

I see what you mean by "risk of recursion" and that is indeed something to watch out for although the case you described does not adhere to the Interface Segregation Principle which is a big part of the problem. Namely, your event handler should not receive a UIElement typed object but rather a smaller (or at least different) object which will contain operations which can be performed safely within the context of the event handler.


The event MUST know what visual element triggered the event. But as far as your interface solution, that's not going to make any difference. Inside large systems, you're going to have that system in a certain state depending on what it is doing. Events are extensions to one part of that system. As such, any events should be aware of what should and should not be done. Unfortunately, OOP has a poor reputation in this regard. There are no guidelines at all. In fact, the OOP community seems to want to believe that such situations should never exist.

In order to have a "SAFE" interface, you're gonna want to write serialised versions of each and every method in that system. That's actually what I promote, but only at certain specific and externally exposed junctions (not everywhere). So it's best to just use a proper object hierarchy and be done with it. Again, your method offers no solution. Mine does.

If I want to change the behavior of the application I want to change classes which are already in use. If an Employee has a Salary data item and I now want to change it so that the salary is evaluated using a standard salary database according to the employee's job title, then the internal structure has to change because the object that is in use no longer reflects the desire of the users. What you are offering me is to either add on top or replace the entire class, but I'm looking for a safe way to maintain the existing class and make changes locally which will change the behavior of the application but will not result in run-time errors.


There's something I learned while doing construction work about making changes. Sometimes you can change what's there and have it work. But for anything significant, you're gonna have to tear the whole thing down and start over. It works in the real world and it works in programming. If I need to change something significant, I create something new and tear out all the old stuff. If you have problems tearing out the old stuff, then you did it wrong in the first place. I have no such problems because all my data is always handled by helper classes. It's a no brainer. I know with 100% certainty that my helper classes are independent and not coupled in any way other than what I described above. So there's no problem taking them out and replacing them.

BTW, you have to get rid of this notion that maintaining a class involves changing its internals. You keep up with that, you're constantly gonna have problems. What you're telling me is no different than someone saying they want a different kitchen sink (the literal one) every hour of the day. It's just ridiculous.

Though I have never encountered the term "hierarchy of objects" in the literature, OO does have rules. An object should only call its immediate "relatives" - objects which are its members or objects which were passed to it in the currently handled message. There are other rules as well but the important thing is you shouldn't just call any object at any given time because of the reasons you stated and other reasons.


Don't you think that's a problem that there's no "hierarchy of objects" in the literature? I certainly do. Use this technique. I guarantee good things will happen.

I'm gonna quote this part again:

or objects which were passed to it in the currently handled message


Right there. That's the TSN turning point right here. You can't tell where these passed objects stand in the hierarchy of things. This is one of the biggest problems with OOP. You get spaghetti object relationships.

It is becoming accepted that planning ahead is not cost effective in software because more often than not you find yourself in need of some unforeseeable extensibility while at the same time making very little use of extensibility options prepared in advance.


You're confusing bad OOP with what I'm suggesting. What I'm suggesting is leaving accessible internals so that you may add functionality later on. What you're talking about in your quote is predicting all possible future uses. Those are two different things.

This is why encapsulation is the key. It is no myth that you can have classes which encapsulate a certain aspect of the application in such a way that without prior notice, you will be able to make changes locally and even add new extensibility options without breaking external code.


You talk about encapsulation, but it's a dead end for future functionality because it is hidden away. And if you do change the implementation, do you realise that you ARE changing the internal data and code ANYWAYS? The only problem is that encapsulation has a whole host of problems that makes it impossible to use in large applications. My way tackles many obstacles where encapsulation is simply a total failure.

Your helper classes are a step in the right direction - that's why I think all your classes should be helper classes. They should all be helping each other while each class hides its internal structure properly and can change without changing others. The way I see it there is no 'main' block which constructs objects and uses them but rather object