Project V Hack: Flawed Design?
Tuesday, 20. January 2009, 01:51:44
I have an outer class with multiple inner classes. Currently, I'm using a member inside the inner classes to point to the outer class, but I have tons of instances of the outer class and all these internal pointers are taking up too much space.
Is there a way to access the outer class from an inner class without using a member pointer within the inner class? I think I could probably create a function within the outer class that returns 'this', but I'm wondering if there's another way.
I don't care if it's a hack or if it's bad coding practice. I just want those pointers out.
After reading this, I can hardly understand what I said. But the people in the thread seemed to understand it just fine.
Scorpions4ever gave this piece of code.
class outer {
private:
int x;
class inner {
private:
outer *out;
inner(outer *o) { out = o; }
int do_something(void) {
out->x = 10;
return out->x;
}
};
};
This happens to be very close to what I have as it relates to the pointer I want to eliminate. The 'outer *out;' line is what I want to eliminate though. So how would I do that?
Before anyone else answered, I had found the answer. And it's not pretty.
class outer {
private:
int x;
class inner {
private:
int do_something(void) {
outer *ptr = (outer*)((char *)this - (size_t) &(((outer*)0)->in));
// call some func with ptr.
}
} in;
};
The version here is slightly modified from the one in the C thread linked above (uses char* instead of size_t for casting 'this').
Anyhow, I no longer need to store a pointer. I can simply compute the address of the outer class.
The user LittleGrin has three times claimed that my design is flawed.
#1
However, I repeat, that you have a fundamental design problem, and probably need to redesign your complete class structure from scratch.
#2
Originally posted by Vorlath:
I think you misunderstand the problem. My inner classes don't use any information from the outer class at all. I simply want a pointer to the outer class.
Well, no, by this comment you've convinced me that I understood the problem. Your design is broken, but you are assuming your design is correct, and seeking a hack to make things work anyway.
#3 (My personal favourite)
Originally posted by Vorlath:
It's a rather simple concept. I can't really screw it up since it's so simple. I just wanted to reduce useless memory usage. Just because I use a hack to do this doesn't mean my design is flawed.
The fact that you needed to use a hack is evidence your design is flawed. Your ability to delude yourself otherwise does not alter the evidence.
Is LittleGrin right? I tried to explain that the hack is not a necessity, but rather an optimization. Project V will work just fine without it. But if you've followed how the internals of Project V are built, you know it's built as a set of nodes with backlinks. So the outer class is that very node. The internal classes are for handling forward and back links. IOW, they are for handling sets.
Sets are different than what the node itself is meant to do, but those sets are still an integral part of what makes the entire system work. So I thought the best compromise was to have internal classes. I should mention that I usually never use internal classes, so I'm not sure exactly under what other circumstances you would use internal classes.
Here is that class.
class NEntry
{
public:
// Forward named sets.
class ForwardT
{
public:
NEntry *source;
ForwardMap Forward;
__fastcall ForwardT(NEntry *ent);
__fastcall ~ForwardT();
ForwardPair* __fastcall Create(const UnicodeString &index, NEntry *Link, unsigned int flags);
ForwardPair* __fastcall Create(const UnicodeString &index, NEntry *Link, const UnicodeString &reverseIndex, unsigned int flags);
ForwardPair* __fastcall Find(const UnicodeString &index, const NEntry *Link) const;
void __fastcall Clear();
};
// Reverse named sets.
class ReverseT
{
public:
NEntry *dest;
ReverseMap Reverse;
__fastcall ReverseT(NEntry *ent);
__fastcall ~ReverseT();
void __fastcall Clear();
NEntryReverseLinkList* __fastcall Create(const UnicodeString &index);
NEntryReverseLinkList* __fastcall Find(const UnicodeString &index) const;
void __fastcall Delete(const UnicodeString &index);
};
ForwardT Forward; // Forward sets (where we point to other NEntry's).
ReverseT Reverse; // Reverse sets (where other NEntry's point here).
NID *FID; // Unique ID (Interned)
UnicodeString *FName; // Node name (Interned)
void __fastcall SetName(const UnicodeString &name);
UnicodeString& __fastcall GetName() const;
void __fastcall SetID(const NID &ID);
NID& __fastcall GetID() const;
__property UnicodeString Name = {read = GetName, write = SetName};
__property NID ID = {read = GetID, write = SetID};
#define NENTRY_FLAGS_VALUE 1
#define NENTRY_FLAGS_LOCK 2
#define NENTRY_FLAGS_DELETE 4
unsigned int Flags;
NEntryValue *Value;
void __fastcall CreateID();
__fastcall NEntry();
virtual __fastcall ~NEntry();
void __fastcall MassDelete();
bool __fastcall GetIsValue() const;
void __fastcall SetIsValue(bool v);
bool __fastcall GetIsDeleted() const;
void __fastcall SetIsDeleted(bool v);
bool __fastcall GetIsLocked() const;
void __fastcall SetIsLocked(bool v);
__property bool IsValue = {read=GetIsValue, write=SetIsValue};
__property bool IsDeleted = {read=GetIsDeleted, write=SetIsDeleted};
__property bool IsLocked = {read=GetIsLocked, write=SetIsLocked};
};
There's more to it than that. ForwardPair, ForwardMap and ReverseMap are also internal classes or typedefs.
In any case, "NEntry *source;" and "NEntry *dest;" are the lines I want to get rid of. And I did using the hack mentioned above.
Other than ForwardMap and ReverseMap, what do I have that uses memory?
1. NID *FID (4 bytes)
2. UnicodeString *FName (4 bytes)
3. unsigned int Flags (4 bytes)
4. NEntryValue *Value (4 bytes)
5. NEntry *source (4 bytes)
6. NEntry *dest (4 bytes)
Total = 4*6 = 24 bytes.
If I remove #5 and #6, I get 16 bytes. That's a 33% improvement. Sure, there's an STL map for the forward set and reverse set. Not sure how much an empty set takes, but it can't be a whole lot. So 8 bytes less per node is important. It means that for every two nodes, I can allocate an extra one where I could not before. With the sets, maybe it'll take 3 to 6 nodes before I get an extra free node. But that's still significant.
Project V works just fine as far as creating networks, types, hierarchies, sets, instances and components. If my design is flawed, then everything I've written up until now is garbage, yet it works. I'm implementing events as I type (well, minutes ago) and it's going well. These events will allow the GUI to be automatically updated when something changes in any of these nodes. And how do I register events? Through an event node that is stored in another node's event set. That's possible with those pesky internal classes that keep track of all this stuff for you.
I also use internal classes because I don't want the node itself to be cluttered with set functionality. That's not really its job.
But just to build a component, it takes MANY of these nodes. It takes one for the META_TYPE to say it's a type. Then we have one for the type itself of COMPONENT. Then we have another one that is derived from COMPONENT that will define the inputs and outputs. The inputs and outputs are sets using those internal classes. Inside the input and output sets, you drop more nodes that specify the instances, each of which will derive from its own types. Then there are the implementations that derive the entire component definition along with connections, etc.
The user doesn't actually deal with this directly. That's what the GUI is for. Design time components will be able to deal with it directly for certain things like adding extra inputs and outputs, or adding new connections. But even then, you won't be dealing with RAW nodes.
The point is that there are a LOT of these nodes. The more of them I can store in RAM, the better. It's not a necessity. But it is beneficial.
I suppose this is some sort of code review, or a design review. I really like the way my system works. It's extremely flexible and I can build anything on top of it. That's what made the event system possible. See Steve's Universal Design Pattern for more information. Note that I include more functionality than what is found there and that I actually came up with all of this before I knew there was a name for it. When I had read Stevey's article, I wished I had that information a LONG time ago.
Here's my favourite line by Steve.
If you're writing this in C++, then may God have mercy on your soul.
Seems the deck is stacked against me. Other than SAYING my design is flawed, I don't see any specific reason why this would be so. Is using a hack that BAD that it would turn a good design into a bad one? Or is my design flawed from the beginning?
(BTW, I'm confident enough in what I'm doing to keep going forward regardless of what someone says on a C++ forum. But there is enough controversy that I thought this would make for an interesting topic. I'm also looking for suggestions if anyone has any.)


Sean Conner # 20. January 2009, 08:33
Also, there's an offsetof() macro you should use (at least, it exists in C, not sure about C++).
Vladas # 20. January 2009, 11:52
First of all - you need to correct your topic: Hack AND Flowed Design.
Every time someone needs a hack, it is most probably a consequence of flawed design. From my experience I can remember no case when I needed a hack on a regular basis. That was always in order to tackle with bad or flawed design.
The question is now WHERE the design is falwed. Your counterpart told you right - there is a flaw. But he didn't say where... It may be in different places (your code, C++ implementation, OOP). I'd tell it in short, - the flaw is in all of them (more or less)!
On the top level there is a desing flaw with OOP itself. Some time ago I've posted a semi-joke to StackOverflow: (http://stackoverflow.com/questions/86582/singleton-how-should-it-be-used):
Nobody seems understood what I mean. But that was one thing I wanted to tell about OOP - it has many design flaws, and the main one is a misuse of OOP. But that's a side note.
Speaking about your case - there is a classic 'this' and 'that' problem with OOP. By the way some OO languages even implement built-in 'that' property along with 'this' (by the cost of extra memory space used with each object).
C++ implementation saves this space for you, so you need to do it in your way. Some object libraries (like in-built DOM in browser JS) provide 'parent' property, but pure JS objects don't. This is one of the reasons why JS is so slow in browsers and consume so much memory.
Summing thing up - there definitely is a design flaw in OOP (and in OO languages' implementation, which sometimes may be considered as a feature):
I realised that I spend a significant time just fighting with object scopes (this and that pointers) when I do OO programming.
Consider JS examples (I'll provide it in JSON notation). What could be the difference between these two?
If you can see nothing more than these lines of code, there is no logical difference. But the second one provides you with much easier access to cell's x and y properties, without the need of knowing their parent. In database world it is known as 1:NONE relation. Some DB designers create so-called 'extention' tables to the main table (with the same primary key). Yes, it still have some sence when you like to save the disk space, by the cost of an extra DB access time. But it is flawed from the design point of view, because such split-data could't participate in joins, relations or whatever, in other parts of program. Now, I think, you undestand that this is somehow related to your TNode design?
From your code snippet I cannot find a real reason why you should consider a first scenario (from my example above). Maybe that reason inherits from other your code or design spaces... I cannot know. But taking this snippet, I'd better do it in other way. I'll try to provide it in JS, because I don't write in C++:
This code implements a simple node operation dispatcher. Operation lists can be easily extended or modified if needed. Of cource, it is no need for such advanced code, just to show that you can move your methods directly to outer class without the need of having two separate inner classes (remember DB extra extention tables?).
I think it would be quite easy to implement in C++ using method references or maybe even overload.
By the way Linux Kernel code is a very good example of such OO-style dinamic dispatch implementations, although it's written in pure C.
I don't necessarily think this is your design flaw. I think that this is a consequence of more fundamental OOP design flaw, instead. And we must somehow tackle with it all the time.
As for your hack, I too think that it's very dependent on C++ compiler's paraticular implementation (as all other hacks usually are).
EDIT: Please note that my code isn't an equivalent to your code from the problem point of view. It's just an example. For you case it needs to have some more dinamics inside 'create' and 'find' methods.
Vorlath # 20. January 2009, 15:28
Just the one I'm interested in. I'm using properties all over the place as well which will only compile on Borland compilers anyhow. I will eventually port all this to Project V itself. Even if I port the current code to another C++ compiler, it's not that big a deal.
Nothing. It still works. The hack isn't affected by it.
Doesn't matter.
It might, but it still doesn't matter.
vladas:
The opposite has been true from my experience. A hack is usually to get around existing limitations. If I had unlimited memory, I would not need this hack. Most of today's software would not even load up in the machines of yesteryear. Are hacks bad in those cases?
For games, hacks are often necessary to speed up rendering. Look at some of the articles about iD software. Their hacks are certainly not due to a bad design, but rather that machines of the time did not have enough processing power and RAM. Things like texture compression on video cards are a hack. Is that bad design? The list is endless.
However, if you were to have unlimited memory and unlimited processing power and you still can't write proper code without a hack, then I'd agree with you. But I've never seen this be the case for using a hack, other than maybe software incompatibilities.
So I don't see why an optimization such as mine must be declared as bad when the entire software community that pushes the limits of what we can do uses hacks.
Just my thought on the issue. In short, I see a difference between a hack that makes your software more efficient as opposed to a hack that has to be included no matter what machine you have.
Vladas # 20. January 2009, 17:15
Yes. I agree. BTW, I didn't tell that a hack is bad on itself. And design may be flawed in different ways:
1. Bad design (inconsistent paradigm/implementation)
2. Limited design (with a lot of private methods/properties)
3. Closed design (undocumented)
4. Forbidding design (like direct HW access in modern OS'es)
5. Missing design (when glueing up different frameworks)
It's how I understand 'flawed' design.
And I had in mind, that if design is Ok, there would be no place for hacks.
Anonymous # 20. January 2009, 17:21
Maybe this helps.
http://bytes.com/groups/c/61467-access-private-members-inner-classes
I do not know my C++ anymore: But I would assume the fancy pointer math is not needed; the outer class pointer already exists, only the permissions need to be changed.
Vorlath # 20. January 2009, 17:55
Vorlath # 20. January 2009, 22:25
Oh, I see what you're saying. My design is flawed because I did not take the limited quantity of memory into account. Yeah, that makes sense I guess.
Still, I don't like it. This would be difficult for me to accept!
BTW, where would an iterative development process fit in? For me, I rip out what I no longer want and replace it with what works best (and I write my code to allow for this). In fact, that's exactly what I did that in this case. I certainly don't think it's a case where my design is flawed to the point that I have to redesign my data structures from scratch. (No way I'm doing that. Brrr...)
Vladas # 20. January 2009, 22:36
Anonymous # 21. January 2009, 00:22
You are correct in this, you are merely using an optimization that looks so ugly that people can't understand why you could possibly prefer this optimization to simply having those extra pointers around. It doesn't make your design flawed, it's just a curious optimization choice, but perhaps it's justified depending on how many millions of instances of this class, and on what hardware, there will be.
Vorlath # 21. January 2009, 17:37
Anonymous # 6. February 2009, 15:18
Sorry for the late reply, but this is driving me crazy.
Forgive me, I do not know how C++ defines *nested* classes, but looking at your working hack, it is different than C#'s *inner* classes.
Given your line:
outer *ptr = (outer*)((char *)this - (size_t) &(((outer*)0)->in));
and what little I read from the link I gave you, it seems that *nested* classes are 1-to-1 (or 0-to-1) with their parent class. How else could your pointer arithmetic work? In other words, for every nested instance, there is one, and only one, parent instance. For each parent there is up to one nested instance.
If I am right, then getting the parent pointer is only a matter of permissions: But you say I am wrong.
You also mention that C# *inner* classes work properly, then I must assume there can be many inner instances for any one parent. (and I must admit, the link I sent you also seems to agree with this too) Your pointer math above does not make sense, but maybe I am missing the subtly in the pointer arithmetic.
If parent:inner is really one:many, AND you have many parents, then I would assume you MUST have explicit pointers to those parents: C# does this for you implicitly. Maybe C++ does this too implicitly.
I am confused.
In any case, I will assume you are right, and answer your problem: IF there are relatively few parent classes (still many, but relatively few compared to the inner instances), then each parent instance can be a class in it’s own right. In database terms (which I am most familiar with), the inner classes occupy different (but similar) tables based on who their parent is. In regular programming terms, you can markup each inner instance with few bits which can be switched on to find the parent pointer, or you can have the parents maintain list of inner instances, or you can put the inner instances into an array. I really like the latter: If you can keep all the inner instances together in a contiguous piece of memory, then you can deduce the parent quickly (based on memory position) and have all the benefits of smaller child size.
Vorlath # 6. February 2009, 21:37
There are two ways to produce a result. You can ask the object to perform a task, or an external object can act on the first object.
A different way to explain this is:
1. Pushing the gas pedal.
2. Use scissors to cut paper.
In the first case, the entity itself can produce results. But in the second case, you can perform actions that the object itself is unaware.
So in my example, I have nodes that can contain names, IDs, flags and values. But the act of placing them in sets is not the job of the node. Likewise, the act of this node containing sets is not the job of the node either.
I could have created external containers, but searching for a node on every access is ridiculous. I decided to just include set functionality directly into the node itself. But I left it as separate classes because it really is a different set of functionality.
I wish I could have internal classes that could take any kind of items. I really do. But C++ is kinda picky with that.
In C#, the language allows you to get access to the container object. In C++, you can't. If I could, I'd be all set. So I created a hack that allows me to do it. But it has a few drawbacks. It's ugly and only works on a known instance of the inner class relative to a known outer class. So the 1 to 1 relationship which is different from C#'s relationship is a by-product. It's not something I actually want. I'd rather it worked like C#, but C++ doesn't allow it since nested classes are just regular classes with a different namespace.
OTOH, if nested classes in C# do store a pointer to the container instance, then that's no good either. There is no reason to store a pointer, even in many to many relationships. This is a point I doubt many people understand.
BTW, my hack DOES only work with 1 to 1 relationships, but like I said, this isn't something that I actually have a requirement for. So the hack is slightly modified for the second inner class. I don't use it anywhere else.
Oh, and since you mentioned it, I should add that there is no way to get a pointer to the container object in C++ no matter the permissions (unless you store its pointer in the constructor of the inner class or use a hack like I did). In C++, there are no differences between inner classes and regular classes other than namespace.
Anonymous # 8. February 2009, 04:26
Thanks for clearing that up.
Anonymous # 22. May 2009, 12:51
Hey,
I see this is an old topic, but I just wanted to say, as a professional C++ programmer:
I absolutely agree with your choice of method.
It would work without the hack, by passing a pointer to the parent, but your method is more efficient. There is no design flaw as the design merely requires a pointer to the parent. A flaw of design would manifest itself at a higher level. This is a very low-level issue.
I only have one suggestion in order to make the code more portable. Use the offsetof macro instead of your zero pointer. This will increase the portability to most C++ compilers.
Vorlath # 22. May 2009, 18:49