Software Development

Correcting The Future

STL Documentation

As you may or may not know, I've been implementing my own containers that work as closely as possible to existing STL containers and am learning a lot in the process. Difference with my containers is that they have different performance and often have a combination of functionality from various existing STL containers. For example, an AutoSkipList acts like a set and a vector all at the same time. It's a sorted list and when you insert an item, you get an iterator that also has its index within that list. From then on, you can perform actions from both a set and vector container. For example, you can use arbitrary access and you can use find().

Ok, so I'm going through the documentation at SGI. The docs are actually pretty good. But I cannot figure out which came first, the documentation or the code. First off, they have performance requirements mixed in with functionality. So if you have a container that can only erase an element in logarithmic time, then it's not an Associative Container (containers that work with keys). A lot of algorithms that work with keys work in logarithmic time, not constant. So unless you can earase an element in constant time, none of the other derived concepts are applicable. You can't say you have a Pair Associative Container (like a map). You can't say you have a Unique Associative Container (where the key must be unique). On and on. All because of one little difference in performance requirement.

OTOH, if performance requirements were separate from the functionality, that would be something else all together. You could select the functionality you want and add your own concepts if you want. And THEN you could match up the performance requirements for your container for each of the concepts you implemented.

I'm finding that writing docs is taking WAY longer than writing the code. What I find strange is that despite my complaints about the way the STL documentation is organized, it's kind of fun to describe what each part of your code does. In fact, it helps make your code more consistent.

So what did I do? I added some of my own concepts. A few of them are overrides. Basically, you can use the standard STL concepts, but if you use an override, it alters the performance requirements even if the original concept specifies a different performance requirement. This effectively separates the performance requirements from the functionality. The performance requirements are no longer requirements, but defaults. You can alter them by stating override concepts.

One example of this is "Associative Container LOGN", a new concept I created. This overrides the performance requirement for erasing an element in the original "Associative Container" concept so that it can be performed in logarithmic time. I can now have a container that is both an Associative Container LOGN and a Pair Associative Container. A Pair Associative Container must have erase element be in constant time, but Associative Container LOGN overrides that and allows the container to perform erase element in logarithmic time. In short, I can now have a map-like container that erases an element in logarithmic time.

Since all my containers are SkipLists, almost everything is done in logarithmic time. Now that I had a way to deal with container performance requirements, iterators became a problem. There are about five different kinds of iterators and the first two aren't very powerful. Of the more powerful ones, they all require constant time movement. Some of my containers are Bidirectional Iterators, but many aren't because these last ones can only move backward in logarithmic time. You can use the container that best suits your needs. If someone seldomly needs to go backwards, this could be the right container because you save memory in exchange for loss in performance, an age old truism of computing.

Then we get to arbitrary access. Random Access Iterator is the only one that has it defined and it MUST be constant time. All my SkipLists that support indexing can do it in logarithmic time. It may not be as fast as a vector, but it's just as fast as a map or set (except that we search via index and not by a key) and those containers are used all the time. In this case, I didn't use overrides, but defined new iterator concepts with different performance requirements. It's weird though because they still need to be defined as random access iterators (or bidirectional iterators) to get the right functionality at the code level even if the performance is different.

I do understand that the STL documentation is there to document what they have. I get that. I just find it really strange that there is ZERO room for expansion. No thought whatsoever has gone into adding more components later on with different performance requirements. I've seen plenty of docs where they reserve this and that for future use. Or they leave a few loose ends that will be used at a later time, etc. None of this stuff changes what you have. But it does let someone know that the future is not an antithesis to your work.

Should documentation follow some of the same rules as programming? Or is this more a reflection of how the code was designed? It's hard to tell. Perhaps it's the fact that they had such an easy way to make it all extensible, docs and code alike, that I'm perplexed at the whole situation.

Anyways, documention is really fun for me once I get set up with the layout, formatting and design that I want to follow. I wish I could say it's similar to blog writing in that I try to explain things in a more understandable fashion, but it's not like that at all. Maybe once or twice for the general explanation how to use the container. But after that, you have to make sure the same comments appear in all the container documentation that have the same methods.

Documentation really is the finishing touch on your code. It makes everything look better and creates a complete package. It takes time, but totally worth it.

SkipList: Optimizing Memory UsageQuick Problems

Comments

Unregistered user Sunday, June 6, 2010 9:58:54 AM

grapkulec writes: I found your blog during googling around on "future of programming language" and I have to admit I stopped on your page yesterday and try to read your entries from very beginning. now I somewhere in december 2005 stage and keep reading with more and more fascination about your point of view on programming and related stuff. and some of discussion in comments are even more interesting than entries they relate to. and all this made me seriously consider a serious shift in terms of the future of my carrier and growth as an software developer. I write software since 2001 and I think I'm good at my job. I write in C++ and I know what I know and I know what I don't know (the latter I think is more important because it gives you new fields of improving yourself and not repeating the same mistakes all over your software). But very recently I started to feel that all work I do is basically the same although in different names given by higher powers in my company. I always have to show something as a GUI, I always have to check user inputs, I always have to make basic CRUD operation on some objects representing "data". And I feel more and more bored about doing these very similar things because I can't really reuse parts of code I wrote before because... I don't really know why not but somehow I always keep writing the same code all over again. I hope it make some sense and you can feel my pain about it :) Last week I had an job interview and they asked me why I want to leave my current job and work for them. I was honest and probably blew my chances but I said that I'm bored with writing the same boring stuff and I seek something different. What funny is their reaction for my statement because they really tried to convince me that their job and projects are even more boring than my current job :) Your idea to develop environment where I could just state "open this file and display it's content" and not carring about what that file really is and how it have to be (or should be) presented because it's not "core task" of my application sounds great. I've read your entries about that real programmer should know how hardware works and how today's languages make things complicated rather than solve real problems and how thay restrict us from doing something we want because someone thinks that we don't know what we are doing and his duty is to keep us in shackles for our sake. And it all got me thinking about my programming skills and carrier. I don't know how hardware works. Closest I was to the hardware is shifting bytes when I needed convert integer to four bytes. I always thought that assembler is hard and that I don't need to know anything about it. I have my C++ and I'm really hardcore programmer. But know I think that instead of learning C# and .NET I will get back to the roots and try to learn about assembly and machine code. Sorry for comment that is not related to this blog entry but I felt like writing it here not at some beginning of your blog. Also sorry that is rather chaotic but I felt like I have to give you a sign that I'm here :)

Vorlath Sunday, June 6, 2010 12:29:43 PM

Thank you for your comment. Rewriting the same code is a common topic, not just here, but in all of programming. Reuse never turned out to be what was promised.

About learning assembly instead of C#, I'm not sure that's a wise decision depending on what you want to do. Don't learn machine code. There's no point. Just know that each opcode will be converted to a number of sequence of bytes that the machine understands.

Also, assembly is easy. It's the setup that can be complicated. On Windows, NASM is free and you can use it with the linker that comes with your C++ compiler. Assembly is important to know so that you get a better understanding of how compilers generate their code and what they do. But don't learn this at the detriment of learning another language that could be a source of revenue. I haven't done much C#, but I looked at it before it even came out and liked a lot of stuff that was included. Had you said Java though, my response might have been a little different wink

Thanks for reading my blog. There are some good articles in 2006. After that, I'd only read what interests you because I start talking more about my dataflow environment that I'm building and different ways to look at the foundation of what makes computing possible.

Again, thank you for your comment. It's nice to know the thoughts of my readers and that they are indeed reading my posts. I'm currently busy finishing another project, but more articles will be on their way in a few weeks.

Unregistered user Saturday, June 12, 2010 1:30:01 AM

Anonymous writes: Forgive the editor war spam :), but something that might make programming more interesting and allow for more code reuse (by better understanding the underlying structure of programs) is the Leo text editor: http://webpages.charter.net/edreamleo/front.html The testimonials seem cheesy, yes. As well, it takes a while to get used to it - at first the editor seems to get in the way and be a total waste of time (it took a second time being exposed to it for me to not dismiss it before I figured it out). Eventually I started to "get it" and it has helped me enormously in learning how to program better. It isn't just an editor, it provides a way to arrange the structure of a program - which makes it easier to see what can be made simpler.

Write a comment

New comments have been disabled for this post.

June 2012
S M T W T F S
May 2012July 2012
1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30