Software Development

Correcting The Future

C++ Protected Interfaces Inadequate

If the title is obvious to you, move on. I just want to debunk the myth that is protected interfaces as proposed by many who support the use of the private keyword.

Here we have another one of my issues with C++ that I neglected to talk about. The protected keyword works great on most occasions. You can visit my comments on C++ and the private keyword here and here. The point is that using protected instead of private is a better coding style, since extending a class should allow modifying its state and behaviour beyond what's already there. If a derived class cannot access the internals, then there's nothing there that inheritance will solve. Delegation will work just as well unless you're using RTTI or some kind of classification scheme using inheritance.

First, the background.


The only counter argument I've heard is that you don't want users messing with the internals so you use private and possibly protected interfaces. My objections to the private keyword are as follows. If you use private, don't let anyone derive from your classes. If you want someone to alter the behaviour of your classes, then they need access to the internal data. You can't have it both ways. You need to make certain that classes you release for derivation don't change in the future. IOW, you can't tell someone that they may modify the behaviour of your car and at the same time say that they can't look under the hood to change the engine in order to make it go faster, for example. Just doesn't make sense.

Here's a simple situation. Suppose you find a bug in your code. Now suppose I tell you that you have to fix it without having access to the internals of your class. You must fix the bug by deriving the class and using this derived class as the new implementation for your interface. Could you do it? My bet is not likely. Except for really simple classes, there's just no way you can fix the bug. Fixing a bug is altering the behaviour of a class. Sure, it's not an extension of what's already there. But now I tell you that I lied about the bug. The bug wasn't a bug at all. It was working perfectly, only that the client wants a slightly different behaviour. What I really asked you was to implement a new method that extended the functionality of the existing class. But you can't implement the new method because it's the exact same problem as fixing a bug.

Ok, with the background on why I don't like the private keyword out of the way, back to the actual topic I want to discuss. This has to do with protected interfaces.

Having said that the protected keyword can get us around the problem of updating internal data with the caveat that changes of the internals will break derived classes, we look at a particular case where even protected will not help you.

People who advocate the private keyword often support the idea of protected interfaces for use by derived classes. This keeps encapsulation intact and allows additional methods to be used only by derived classes to extend the behaviour of the base classes. Sounds nice in theory and is ok much of the time if the instance only deals with itself and not other instances. Especially not other instances that have the same base class, but different derived class.

With the private keyword, the methods of a class can alter the internals of ALL instances of that class. So if you have a method where another instance of the same class is passed along as a parameter, you can fiddle with the internals of this other object all you want.

X a,b;


a can alter the private members of b. And b can alter the private members of a. This is assuming that b is passed to a and vice versa.
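As a minimal sketch of that point, assuming a hypothetical class X with one private member (secret) and a hypothetical method copyInto(), neither of which comes from any real code base:

```cpp
#include <cassert>

// Hypothetical class X: 'secret' is private, yet any member function of X
// may touch the private members of *any* X instance, not just 'this'.
class X
{
private:
  int secret;
public:
  explicit X(int s) : secret(s) {}
  int get() const { return secret; }
  // 'other' is a different object, but access is checked per class,
  // not per instance, so this compiles.
  void copyInto(X &other) const { other.secret = secret; }
};

// Small helper so the behaviour can be checked from main() or a test.
inline int demoPrivateAccess()
{
  X a(1), b(2);
  a.copyInto(b);          // a alters the private member of b
  assert(b.get() == 1);
  return b.get();
}
```

Calling demoPrivateAccess() leaves b holding a's value, even though secret is private and b is a separate object.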

Now comes in 'protected' and things don't work quite the same. Take the following example.

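#include <vector>
using std::vector;

struct node { int value; node(int v) : value(v) {} };  // minimal node type, assumed

class Y
{
protected:
  vector<Y*> Containers;              // every list that mirrors this one
  virtual void insert(node *n)=0;     // protected interface: insert into one list
public:
  virtual void insert(int t) 
  {
    node *n = new node(t);
    for(vector<Y*>::iterator i=Containers.begin();i!=Containers.end();++i)
    {
      Y *y = *i;
      y->insert(n);                   // code is in Y, pointer is Y*: allowed
    }
  }
};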


Basically this is a multilist where if you insert an integer in one list, it'll get inserted in all lists found in the Containers vector. The above code will work fine. It uses the basic principle of separation of concerns. insert(int t) is the public method that tells all the containers to insert this value. But it can't call itself. So another method, insert(node *n), is used for insertion into individual lists.

Everything works. We're happy and we move on.

Except for one thing. Class Y has no internal data. That's up to the derived classes.

class Y
{
protected:
  vector<Y*> Containers;
  virtual void insert(node *n)=0;
public:
  virtual void insert(int t) 
  {
    node *n = new node(t);
    for(vector<Y*>::iterator i=Containers.begin();i!=Containers.end();++i)
    {
      Y *y = *i;
      y->insert(n);
    }
  }
};

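class Z : public Y
{
protected:
  vector<node*> lst;
  void insert(node *n) {lst.push_back(n);}
public:
  using Y::insert;  // keeps insert(int t) callable on a Z; the override above would otherwise hide it
};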


So far so good. Class Z has its own internal list and everybody is happy. We can use as many objects of class Z or any other class that is derived from Y. This looks pretty good so far. For example, class S could derive from class Y and use a set instead of a vector. You could use any number of instances of S and Z in this setup.

But what if insert(int t) was a pure virtual function? Or what if you decided to override insert(int t) to update some internal data in your derived class?

Take a GOOD look at the code I listed above. Now compare it to the code below. You'll see that the code is nearly identical. The only difference is that I have moved the insert(int t) method from the base class to the derived class.

class Y
{
protected:
  vector<Y*> Containers;
  virtual void insert(node *n)=0;
public:
  virtual void insert(int t)=0;
};

class Z : public Y
{
protected:
  vector<node*> lst;
  void insert(node *n) {lst.push_back(n);}
public:
  void insert(int t) 
  {
    node *n = new node(t);
    for(vector<Y*>::iterator i=Containers.begin();i!=Containers.end();++i)
    {
      Y *y = *i;
      y->insert(n);
    }
  }

};


One may argue about accessing the Containers list directly as it's not an interface. True enough. But the containers' list can be stored anywhere. It can be stored within the derived class upon instantiation or whatever else. That's not the issue. The issue is the protected interface with y->insert(n). It no longer works.

Derived classes are supposed to have access to the protected members and methods of their base class. But that access only applies when the pointer or reference is of the derived class's own type (or a type derived from it).

Here's a list describing access to protected methods:

1. Code is in Y, pointer is Y, method called is in Y (OK)
2. Code is in Z, pointer is Z, method called is in Y (OK)
3. Code is in Z, pointer is Y, method called is in Y (NOT OK)

The third scenario will generate an access error at compile time. The compiler won't let you do it.
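The three cases can be sketched in a few lines. This is only an illustration: the protected method bump(), the counter hits and the three forwarding methods are hypothetical names, not part of the multilist code. The third case is left commented out because it does not compile.

```cpp
#include <cassert>

class Y
{
protected:
  int hits;
  void bump() { ++hits; }              // protected method declared in Y
public:
  Y() : hits(0) {}
  virtual ~Y() {}
  int count() const { return hits; }
  // 1. Code is in Y, pointer is Y*: OK
  void bumpFromY(Y *y) { y->bump(); }
};

class Z : public Y
{
public:
  // 2. Code is in Z, pointer is Z*, method is in Y: OK
  void bumpFromZ(Z *z) { z->bump(); }
  // 3. Code is in Z, pointer is Y*, method is in Y: rejected at compile time
  // void bumpBase(Y *y) { y->bump(); }   // error: bump() is protected
};

// Runs the two legal cases against a second object.
inline int demoProtectedCases()
{
  Z a, b;
  a.bumpFromY(&b);        // case 1 (the Z* converts to Y*)
  a.bumpFromZ(&b);        // case 2
  assert(b.count() == 2);
  return b.count();
}
```

Uncommenting bumpBase() should reproduce the compiler complaint described above, with wording that varies by compiler.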

What this means is that you cannot have a protected interface that can be shared amongst instances of different derived types. The IS-A relationship is effectively broken when dealing with base class references which means a good portion of polymorphism is out the window if you use protected interfaces.

Stated a different way, derived objects cannot invoke protected methods using base class references or pointers. This restriction applies to protected members as well. Polymorphism at the protected level only works if the invoking of virtual methods is done by the base class. Beyond that, polymorphism at the protected level is not available in any form to derived classes.
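Since the base class is the only place where the protected virtual call is allowed, one possible workaround is to have the base class make the call on the derived class's behalf. The sketch below assumes a hypothetical protected static helper insertInto() and a hypothetical link() method added to Y; neither is in the code listed earlier. Note that it weakens the protection, since any derived class can now reach insert(node *n) on any Y.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct node { int value; explicit node(int v) : value(v) {} };

class Y
{
protected:
  std::vector<Y*> Containers;
  virtual void insert(node *n) = 0;
  // Hypothetical helper: the call y.insert(n) is made by code in Y, so the
  // access check passes. Being static, the helper itself can be named from
  // any derived class.
  static void insertInto(Y &y, node *n) { y.insert(n); }
public:
  virtual ~Y() {}
  virtual void insert(int t) = 0;
  void link(Y *y) { Containers.push_back(y); }   // hypothetical setup method
};

class Z : public Y
{
protected:
  std::vector<node*> lst;
  void insert(node *n) { lst.push_back(n); }
public:
  std::size_t size() const { return lst.size(); }
  void insert(int t)
  {
    node *n = new node(t);   // node ownership/cleanup is left out of this sketch
    for (std::vector<Y*>::iterator i = Containers.begin(); i != Containers.end(); ++i)
      insertInto(**i, n);    // instead of (*i)->insert(n), which is rejected
  }
};

// Links two lists to 'a' and inserts once; both lists receive the node.
inline std::size_t demoForwarding()
{
  Z a, b;
  a.link(&a);
  a.link(&b);
  a.insert(42);
  assert(a.size() == 1 && b.size() == 1);
  return a.size() + b.size();
}
```

The virtual dispatch still lands in the derived class's insert(node *n); only the call site has moved into Y.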

And that really puts one in an odd position with respect to protection levels. Protected interfaces are rather strange beasts. As long as you only use a reference with the same type as the derived object in question, you're fine. But use a base pointer and you lose access to the protected methods and members effectively killing polymorphism.

It's like climbing Mount Everest only to have it disappear once you reach the top. So while protected interfaces are useful in some cases, there are serious flaws that make them less than optimal. Protected members suffer from some of the same flaws, but you can't do polymorphism with data alone (or at all in standard C++ [my kingdom for properties!!!]). The issue of disappearing access is still there though. And although it doesn't come up often, it does happen.

So what I'm saying is that protected isn't that good of an option either though it'll do fine in most cases. Here are two examples where even using protected doesn't work.

The first case is my Project V GUI engine. I have a class called UIElement. I have a list of protected members for a wide range of things like position, caching, visible status, enabled status, active region, events, visual hierarchy, ZOrder, etc. If you derive a UIElement to create a custom component, you're all set to go as long as you only modify your own object's behaviour. So far, so good.

But the next step after a UIElement is a compound component called UIContainer. It contains many UIElements that may have been derived. So it can contain other UIContainers as well as things like UIImage, UIEditBox, etc. The problem is that when the container starts looking at the protected members of UIElement (when drawing on screen, for example, to figure out which portions of the screen are obstructed), UIContainer no longer has access to those protected members even though it can access those same protected members within its own instance.

In this case, a protected interface would be of no use since I cannot access protected methods any more than I can access protected members. Anything derived from UIContainer would have the identical problem with UIContainer's protected members when trying to access another container's list of GUI sub-elements. Luckily, I've never had to go beyond the public interface for sub-elements.
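A stripped-down sketch of the situation, with protected data instead of protected methods. The class names follow the description, but the members (visible, children, countVisible() and so on) are made up for the illustration and are not the actual Project V code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class UIElement
{
protected:
  bool visible;                        // protected data (hypothetical member)
public:
  UIElement() : visible(true) {}
  virtual ~UIElement() {}
  void hide() { visible = false; }
  bool isVisible() const { return visible; }   // public accessor
};

class UIContainer : public UIElement
{
  std::vector<UIElement*> children;    // sub-elements, not owned here
public:
  void add(UIElement *e) { children.push_back(e); }
  int countVisible() const
  {
    int n = visible ? 1 : 0;           // own protected member: OK
    for (std::size_t i = 0; i < children.size(); ++i)
    {
      // if (children[i]->visible) ++n;   // error: 'visible' is protected
      if (children[i]->isVisible()) ++n;  // the public interface works
    }
    return n;
  }
};

// One container with two children, one of them hidden.
inline int demoContainer()
{
  UIElement image, editBox;
  UIContainer panel;
  panel.add(&image);
  panel.add(&editBox);
  editBox.hide();
  assert(panel.countVisible() == 2);   // the panel itself plus the image
  return panel.countVisible();
}
```

The container reads its own visible flag directly, but has to fall back on the public accessor for every child held through a UIElement pointer.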

The second example is my compound list that can be ordered multiple ways. The idea for the code examples I gave above stems from this compound list. You cannot create containers that use protected interfaces. And it's not just containers. Any time you want to use two instances together using polymorphism, you cannot do it with a protected interface. It's gotta be public. I was intending to do automatic conversion of iterators between the different lists. I do have a way of accomplishing this, but it won't be with a protected interface. So I'll do it with a public interface. The only thing an iterator needs is the current node. I'm gonna have the iterator have a virtual casting operator to the internal node itself. That way, you can convert iterators by simply constructing them from old iterators without having to know what type they are. You lose security for ease of use. If a protected interface worked, I'd just have the constructor retrieve the internal node from the passed iterator. Oh well.
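A rough sketch of what such a public virtual casting operator could look like. The names IterBase, VecIter and SetIter are hypothetical and the real iterators would carry more state; this only shows the conversion idea under those assumptions.

```cpp
#include <cassert>

struct node { int value; explicit node(int v) : value(v) {} };

// Hypothetical common base for every iterator of the compound list.
class IterBase
{
public:
  virtual ~IterBase() {}
  // Public virtual conversion to the current node. It has to be public:
  // a protected version could not be called through an IterBase reference
  // from another derived iterator.
  virtual operator node*() const = 0;
};

class VecIter : public IterBase
{
  node *cur;
public:
  explicit VecIter(node *n) : cur(n) {}
  operator node*() const { return cur; }
};

class SetIter : public IterBase
{
  node *cur;
public:
  // Converting constructor: accepts any iterator type without knowing
  // which one it is, and picks up its current node.
  explicit SetIter(const IterBase &other) : cur(static_cast<node*>(other)) {}
  operator node*() const { return cur; }
};

// Converts an iterator of one list type into an iterator of another.
inline int demoIterConvert()
{
  node n(7);
  VecIter vi(&n);
  SetIter si(vi);          // conversion goes through the virtual operator
  node *p = si;
  assert(p == &n);
  return p->value;
}
```

The cost is the one already mentioned: anyone holding an iterator can get at the raw node, not just other iterators.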

One can argue about the examples and whatnot all day long, but they're just examples to show the issues at hand. At the end of the day, nothing will alter the fact that private and protected are insufficient to support proper inheritance, polymorphism and encapsulation in a way that is consistent and usable.

After re-reading this post, I find myself surprised that I wrote that last sentence. I originally wrote it as a response that followed from the flaws of the 'private' keyword and protected interfaces. Looking at it now, that statement goes well beyond those two specific things. I stand by it.

If there are some people that don't understand my position and think it's completely out in left field, rest assured that it isn't. Here's a link to the C++ FAQ Lite for the question "I've been told to never use protected data, and instead to always use private data with protected access functions. Is that a good rule?"

Here's a portion of that answer.

Nope.

Whenever someone says to you, "You should always make data private," stop right there — it's an "always" or "never" rule, and those rules are what I call one-size-fits-all rules. The real world isn't that simple.

Here's the way I say it: if I expect derived classes, I should ask this question: who will create them? If the people who will create them will be outside your team, or if there are a huge number of derived classes, then and only then is it worth creating a protected interface and using private data. If I expect the derived classes to be created by my own team and to be reasonable in number, it's just not worth the trouble: use protected data. And hold your head up, don't be ashamed: it's the right thing to do!



I mostly agree with everything in that quote. The difference obviously being with protected interfaces, but even so, I never said they were never useful. Only that there are situations where they are inadequate.

The next question in the FAQ is also good. ROI is what it's about. Is making one member or method public going to make everything easier and move the project forward, or do you want to fiddle with the OO mechanism you were given to find a solution that 100% adheres to the textbook methodologies? Getting it done is often the right way to go when you can go back to fix it.

What I'm trying to say is don't interpret all of this as my using public everywhere. I don't. I use all three protection levels. I just tend to use protected more than others. And I don't frown on using a public method or two at times that solves polymorphism when protected polymorphism is impossible for example.


Comments

Unregistered user Monday, August 2, 2010 2:24:08 PM

gary writes: "If you use private, don't let anyone derive your classes. If you want someone to alter the behaviour of your classes, then they need access to the internal data. You can't have it both ways." Encapsulation is not merely about protecting data. It is about hiding data for simplicity (OOD). I should be able to extend a class without knowing the inner workings of the base class. Encapsulation allows me to extend a class with new information without the risk of my screwing up the implementation of the parent.

Vorlath Monday, August 2, 2010 3:04:41 PM

"I should be able to extend a class without knowing the inner workings of the base class."

Only for simple situations where you wouldn't need to derive a class in the first place. I've already shown how extending a class without knowing the inner working is impossible for anything serious. It's like telling me I can change the engine without opening the hood. These are problems that have already been solved in the real world.

"Encapsulation allows me to extend a class with new information without the risk of my screwing up the implementation of the parent."

Only if the extension is independent and as such, should be implemented in a different class.

I keep hearing this stuff all the time about how you should be able to extend a class without knowing the internals and it's just bunk.

I can even prove it (and have proven it in the article). You say, and I requote,

"Encapsulation allows me to extend a class with new information without the risk of my screwing up the implementation of the parent."

I give you a challenge. Instead of modifying the implementation of the parent, I challenge you to implement the modifications in a derived class without accessing the internals. Can you do it? Sometimes yes. Sometimes no. I'd say MOSTLY no. So if it's mostly not possible, then the argument that you can extend the class without having access to the internals falls apart.

Like I said, you cannot change the engine without opening the hood.

Unregistered user Friday, August 27, 2010 9:01:47 AM

grapkulec writes: my take on the subject is: do whatever gets work done. you think you need private interfaces - use them and move on. you see other way of doing your job - go for it. at the end your boss is not interested in your code but how fast he can throw next task on you and make more money for himself. and about coding: there's no point to make every member data private or protected if you give user of a class public getters and setters for all of that members. it just litters your code and it's work for grunts not software developers. instead just make every data public and if some of them should be readonly make them protected and write public getter for it. or setter if you need something more sophisticated than simple "member = value" statement. and the same applies to interfaces: it should be a class with pure virtual public methods. because main concept behind idea of interface is that it is a set of methods that user of a class can take for sure as implemented in class he uses. clever tricks with forcing derrived classes to implement some internal method is not a core value of interface, at least not in my opinion.

Vorlath Friday, August 27, 2010 2:46:24 PM

I really like the idea of pure virtual classes. It works great not only for what you want published, but also different interfaces that are only for internal use. At the very least, this lets you hide what needs to be hidden from users and lets you have better protection internally as well.

Unregistered user Thursday, June 30, 2011 2:09:28 AM

Mike W writes: Late response here, but it's because I just ran into this problem myself, and found this page by googling "C++ protected interface". I have to say that I completely agree with everything Vorlath says here. I see this as an inexcusable lack of what should be a basic object-oriented design feature. In my case, I wanted to be able to link up different types of objects (not necessarily related) into a doubly-linked list. Yes, I know there are already plenty of other types of containers that can store objects of different types, but for my particular project, I needed finer-grained control over the actual pointers, so a generic container class won't do it for me. My idea was to create an interface class called ILinkable, with all of its methods given protected access. That way, each of the derived classes could choose how to implement the internals, and yet I wouldn't be exposing any of the ILinkable methods in the public interfaces of the derived classes. I want to be able to add the ability to link to other objects that inherit ILinkable, but without cluttering up the public interfaces of the derived classes with ILinkable's methods. After many long hours trying to find a design that would actually work, but without relying on cluttered interfaces and lots of kludgy code, the protected ILinkable interface seemed like the ideal solution. It did exactly what I needed it to do, and kept the derived classes clean. But nooooo, my compiler vehemently disagreed with my plan, and informed me that "Method xyz() is protected." Well, duh, I thought. Of course it's protected. But why the hell is the compiler complaining that it can't access a protected method when I'm calling it from a derived class? Everything I've read says derived classes can access protected members of the base class. But apparently only if you use a derived class pointer??? 
I thought the whole **point** of polymorphism was to be able to invoke the derived-class implementation via a base class pointer. As Vorlath says, that works perfectly fine if the members are public, and/or there is only one derived class. But if you want a protected interface that can be inherited by multiple derived classes, then sorry, you're screwed. And why? I see no good reason that polymorphism should break down at this point. I'm not a C++ guru (*far* from it, in fact), and maybe there are some subtleties that I'm overlooking, but I just don't see any reason why C++ should (not) work this way.

Vorlath Friday, July 1, 2011 2:04:53 AM

Yeah, your example is EXACTLY my situation and the same problems I ran into. It's tricky too because accessing public members and methods works fine where you think "I'll just make them protected and only let derived classes have access". Uh huh. Nope. Won't work. The reason is this... though I don't quite get why they made it this way (then again, I sorta do in a twisted way).

To answer your question about why it's this way, it's simply a design decision. In essence, they only want methods of type B to access members and methods of type B. If your pointer is of type A, then because your 'this' object is of type B, the compiler would need to be certain that your pointer actually points not only to type A, but also to a derived type B to make sure that the 'this' object and the pointer type are of the same type. Of course, that's not the compiler's job. We currently have to do dynamic_cast or typeid() to find out the derived type. Plus, it's not just one level. It can be many levels of inheritance.
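For illustration, a minimal sketch of the dynamic_cast route, reusing Y and Z from the article plus a hypothetical second derived class S and a hypothetical helper insertInto(). It only helps when the other object really is a Z, which is exactly the limitation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct node { int value; explicit node(int v) : value(v) {} };

class Y
{
protected:
  virtual void insert(node *n) = 0;
public:
  virtual ~Y() {}
};

class Z : public Y
{
protected:
  std::vector<node*> lst;
  void insert(node *n) { lst.push_back(n); }
public:
  std::size_t size() const { return lst.size(); }
  // Returns true only when y really points to a Z (or something derived
  // from Z). Once the pointer type is Z*, the protected call is allowed.
  bool insertInto(Y *y, node *n)
  {
    if (Z *z = dynamic_cast<Z*>(y)) { z->insert(n); return true; }
    return false;   // some other derived type: still no access
  }
};

class S : public Y   // a sibling type, unrelated to Z except through Y
{
protected:
  void insert(node *) {}
};

// Tries the downcast against a Z and against an S.
inline bool demoDowncast()
{
  Z a, b;
  S s;
  node n(1);
  bool okZ = a.insertInto(&b, &n);   // same derived type: works
  bool okS = a.insertInto(&s, &n);   // sibling type: cast fails
  assert(okZ && !okS && b.size() == 1);
  return okZ && !okS;
}
```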

Still, they could have simply gone the other way and said that a derived class can have access to ALL objects' protected areas that share the same inherited types as 'this' (for those specific sections). But those in charge of C++ consider this as partial protection and contrary to their design philosophy.

Really, the way it is now, 'protected' effectively means to copy the protected members of the base class and make them private in the derived class. It's a cut&paste mechanism. No more. Sure, base class can access the protected areas regardless of derived types, but it can access its own private members too regardless of derived types, so that point is moot.
