Notes to self

Whatever I feel like writing

Posts tagged with "Programming"

Video links

The "ClassThat" Naming Convention

I have noticed a pattern in my unit-testing recently. Ah, the word "pattern" is possibly a tad loaded for this purpose, so you might also call it a trend, a style or a convention. It's a recurring thing anyways, and it has to do with how I build my mock objects in my unit tests. I write primarily in Java, by the way, so this is where I noticed this pattern but it may just as well apply, with or without modification, to other languages.

Onwards.

The gist of it is that if I have some class called a "Peg" that is used in all sorts of situations throughout my code and I need to build mocks of it for testing, then I will create a companion class called "PegThat" in the test sources tree.

This "PegThat" class will consist of nothing but static methods that all return various forms and instances of Pegs. These static methods will have names telling of certain properties of the Pegs they return. For instance, it might have an "isSquare" method that returns a square Peg.

The beauty of this naming convention is the readability that ensues in my test code. Behold:

@Test public void
squarePegsMustFitIntoSquareHoles() {
  Peg peg = PegThat.isSquare();
  Hole hole = HoleThat.hasEdges(4);
  
  assertThat(peg, fitsInto(hole));
}


That is arguably a very contrived example, but when you are building more complex mocks or instances, and these static methods need parameters of various types, then the benefit increases. Your tests can become a lot less messy and a good deal more readable.
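For illustration, here is a minimal sketch of what such a companion class might look like. The post never shows Peg itself, so its constructor and its edges and width properties are assumptions made up for this example, as are the isRound and hasWidth factories. Only the isSquare name comes from the test above.

```java
// Hypothetical production class. The real Peg is not shown in the post,
// so these two properties are assumptions made for the sake of the sketch.
final class Peg {
    private final int edges;
    private final double width;

    Peg(int edges, double width) {
        this.edges = edges;
        this.width = width;
    }

    int edges() { return edges; }

    double width() { return width; }
}

// Companion class that lives in the test sources tree.
// It holds nothing but static factory methods named after a property
// of the Peg they return, so call sites read like "Peg that is square".
final class PegThat {
    private PegThat() {
        // Not meant to be instantiated.
    }

    static Peg isSquare() {
        return new Peg(4, 1.0);
    }

    // Assumed example: a peg with no edges stands in for a round one.
    static Peg isRound() {
        return new Peg(0, 1.0);
    }

    // Assumed example of a factory that takes a parameter.
    static Peg hasWidth(double width) {
        return new Peg(4, width);
    }
}
```

A test can then say `Peg peg = PegThat.hasWidth(2.5);` and the intent stays visible on the same line as the setup.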

Form Follows Function

If form follows function then why do we do up-front design when we develop software?

My work place has, for some time now, repeatedly asked me for estimates and designs, and this has gotten me to think about how I feel about those artifacts. In this blog post I am going to try and put my feelings about up-front design into words.

Design is the detailed description of how something is supposed to be built, but software development is plagued by uncertainty to such a degree that any design conceived before the first line of code has been written is bound to be wrong.

Design is a model and a description of a destination. A place we want to end up. A model and a description of what we want to end up with.

When one embarks on a new software development project, I would argue that such a destination is not a terribly valuable artifact to have. Kent Beck uses the metaphor of driving to describe software development; constantly adjusting the direction as we go, making sure that we don't fall off the road.

Kent Beck is spot on, in my opinion. We don't need a destination when we start a project. What we need is a starting point and a direction. And the people that need those things are the team members, not management. The team members all need to be on the same page with regard to the starting point and the direction. This is even more important with a distributed team, which is what I currently have at hand at work.

When the starting point and the direction are agreed upon, then the destination, the design, will emerge as we go along, and become self-evident when we at some point decide to stop. This is the "emergent design" idea from extreme programming.

But before I get ahead of myself here, I have already described how design is the destination in this metaphor, so what are the starting point and the direction? Briefly, the starting point is the most pressing and important unimplemented user story. The direction is knowledge of the business problem and a vague conceptual idea of how a system might solve this. I suppose part of what the direction is, is described in the current set of unimplemented user stories, but the direction as a whole is more than just user stories. The direction is the sense that all team members are "on the same page," so to speak.

So when I title this blog post "Form Follows Function," what I really mean is that "Design Follows Implementation" and that this idea is emergent design. When we write code, we are also implicitly designing, and this design that emerges from our code is the real design, so to speak. It is the truth, as opposed to some up-front class diagram describing classes that don't exist and incorporating engineering trade-offs to problems that also don't exist (unless they are designed into the system).

The up-front design describes either something that we don't want, or something that we won't get, or both. And therefore, I think that working to the level of detail that is implied by it is wasted effort. And I don't like wasted effort.

Code Need

The following is something I intend to sleep on:

Introduce abstractions out of need. Abstractions that are introduced through design rather than need are inherently over-designed.

Programming is about satisfying need.

Agility is about reacting to need rather than anticipating it.

TDD is a process of formulating need at a low level.

Breaking the Law (of TDD)

The Three Laws of TDD are an excellent and productive way of raising the quality bar of your code.

This "proper" form of TDD ensures a number of things. First, your code is testable. If it is not, your tests will become too hard to write, but once you have good coverage, you can then safely refactor your code such that it becomes testable. This is a good thing because testability is a property of design that implies modularity, composability and flexibility.

Second, the closed think-test-code-refactor loop makes your programming intentional and deliberate, as opposed to programming by coincidence. It forces you to think about your code and your design before you get typing. I believe that these extra brain cycles aren't just wasted on trying to figure out how to write a test of non-existing code, but also partially translate into code that is better thought through and, ultimately, better written.

Lastly, the most direct effect of this process is the test coverage gained. If followed strictly, then you are in theory guaranteed that all lines in your code are covered by tests. In reality, however, the dreaded 100% coverage is not only overrated, but can lull you into a false sense of security. But proper test coverage enables refactoring and thus makes it possible to continuously improve the code base.

All nice and well. I try really hard to adhere to the laws of TDD, but there are things that you just can't test. And not just the usability, but code-things too. This is especially true for highly concurrent code where failures are probabilistic and sometimes even impossible under certain conditions (hardware, VMs or compiler flags). I have sort of accepted this property of concurrent code, because I find that I can often encapsulate the concurrent parts and make them simple enough to be verified analytically.

However, the other day, I ran into a problem. I needed to write a function. It did not modify any state, was idempotent, did not involve any concurrency (the whole program was single-threaded) and it did not even have anything to do with usability. But I could not write a test for it.

The purpose of this function was to return the absolute path to the current user's home directory. That was all. How do I test that? I could hard-code the path to my own home directory in the test, but then it would fail for everyone else. I could try the function manually and check the result myself, but I don't have a Windows machine and this program was supposed to be platform independent. Still, that was what I ended up doing. Then I just copy-pasted some code from somewhere to fill in the Windows special case and hoped for the best.
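To make the problem concrete, here is a rough Java sketch of such a function. The post does not say which language the program was written in, so the HomeDirectory class, its method names and the use of the JVM's "user.home" system property are all assumptions of this sketch, not the code from the story. Passing the property lookup in as a parameter lets a test feed in a fake value, but that only exercises the plumbing. Whether the real platform lookup returns the right directory, on Windows or anywhere else, is exactly the part that resists a unit test.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

// Hypothetical helper, not the function from the post.
final class HomeDirectory {
    private HomeDirectory() {
        // Static helpers only.
    }

    // Production entry point: asks the JVM for the "user.home" system property.
    static Path current() {
        return resolve(System::getProperty);
    }

    // Test seam (an assumption of this sketch): the lookup is passed in,
    // so a test can substitute a fake one. This verifies the handling of
    // the value, not that the platform reports the correct directory.
    static Path resolve(UnaryOperator<String> propertyLookup) {
        String home = propertyLookup.apply("user.home");
        if (home == null || home.isEmpty()) {
            throw new IllegalStateException("user.home is not set");
        }
        return Paths.get(home);
    }
}
```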

It felt like a dirty thing to do, because this program had been written with TDD from the start, and I had just written the first function without any unit tests. Indeed, parts of it wouldn't even run on my machine because it was Windows-specific.

I won't make it a habit, though. I need to keep telling myself that.

Best programmers 28 times better than the worst? I'm not sure.

In his book Facts and Fallacies of Software Engineering, Robert L. Glass cites a research paper from 1968 which found that the best programmers can be up to 28 times better than the worst.

That paper is Sackman, H., W. I. Erikson, and E. E. Grant. "Exploratory Experimental Studies Comparing Online and Offline Programming Performance." A brief, but publicly available, treatment of it can be found here and most likely elsewhere too if you ask Google.

Now, I know it's bad form to criticize a paper I have not read, but you need an ACM account to get at the real thing, so I will instead base my opinion on the linked blog post, and otherwise be brief about it.

I have two points of critique, and the first and most obvious one is regarding the age of the paper. It was published 41 years ago and its primary purpose was to figure out whether time-sharing or batch processing systems were the most productive - which it did. And then, "almost in passing," apparently, they present numbers showing that the best programmers in their study were up to 28 times better than the worst.

I have to wonder whether the numbers have changed. Have the worst programmers gotten better, or worse? And are we even able to distinguish such a number from a difference among the best programmers? I have absolutely no idea. Most people I know, myself included, weren't even born 41 years ago.

The second peeve I have with the unknown content of that paper, or perhaps rather the 28:1 quote in the context of that paper, is the dataset. While I don't know how many programmers participated, I do know that they were tasked with solving two programming problems.

While such a method is able to show peaks such as "up to 28 times better," it fails to show whether this grade-28 master is able to keep ahead in the long run. The two programming problems they were presented with were pretty small in size, especially compared to many of the projects that are going on in the real world right now.

Is the master able to keep his times-28 advantage throughout a project that takes a year to complete - or at least is "normally" estimated at a year's worth of work? Assuming they will finish at some point, how long will that same project take the worst programmer to complete?

It is possible that operating under the assumption that the worst programmer will, eventually, finish the project is a pretty far stretch. But so does the "28 times better" seem when considered in the long run. Given the numbers are based on a mere two programming problems, the 28 times may have been a fluke. Maybe the guy was just lucky - you can argue that it takes skill to be that lucky, but still.

I once spent one and a half months hunting a nasty memory leak (and I don't ever want to do that again), but once I found it I just had to change two lines and it was fixed. I could have been lucky and found those lines after a week, or never introduced the leak in the first place, but unfortunately it didn't happen like that.

In conclusion, I don't think that paper has the material to justify "up to 28 times better" as fact. The difference is there, for sure. And many people seem to feel in their gut that it may be around 10 times better. That's also a nice round number, but I won't claim it's a fact though.

My Estimates Have Gotten Worse

I'm now into the third chapter of Clean Code, and I am already starting to use what I learn in practice.

The book is showing me a new level to reach; it's pretty high but looks doable. Using the words of the software craftsmen, it is raising the bar.

But, as with all newly gained knowledge, its use is not automatic. I am expending a noticeably greater amount of brain cycles to keep the quality of my code up at this new level.

This is noticeably affecting my productivity, understood as the rate at which I get stuff done. I don't have any hard numbers on this because of my failure to formalize my estimates (I know, I know... many bad excuses go here).

So, while I'd like to think that the quality of the code I'm producing is higher, I am also taking longer to produce it. This means that previous experiences about how long things take have grown increasingly inaccurate. And due to lack of retrospectives and formalism in my estimation process, this growth has been without control.

I am not saying that I'm now suddenly catastrophically bad at estimating how long things take, just that the growth has been uncontrolled. I will assert that as this new knowledge moves from the frontal lobe to the spine, my estimates will steadily return to normal (unless I start to actively improve them, in which case they will become better).

This movement of knowledge is also known by the name "experience," and the only way to attain it is through practice.

Considering this, all I can do is to keep at it and improve, but remember to keep a close eye on the schedule, and try to add in the extra overhead whenever I estimate.

Reading spree

I read Java Concurrency in Practice (Brian Goetz) and it taught me something important. It showed me a whole new "dimension" (I can't explain it any further than that - words fail me) of reasoning about code. This was not the intent of the book, but rather a side-effect of the examples in it and its approach and focus on correctness.

I now write all my code with this same focus and attention to correctness, and I always reason about my code through this new dimension that I have discovered, and I am always conscious about the thread-safety of my code.

Then I read Release It! (Michael T. Nygard) and it showed me that my newfound attention to correctness could, and should, be extrapolated to a much higher inter-system scale. The approach is now less about fundamental reasoning, and instead more about making conscious design decisions.

The situations where I need to apply this knowledge are fewer and farther between, but their effects are equally profound and, I dare say, even more noticeable and visible to the people who surround the systems I work on, and my fellow programmers.

Then I read Java Management Extensions (J. Steven Perry) and it sucked. Moving on.

Then I read Facts and Fallacies of Software Engineering (Robert L. Glass) and while my code is unaffected, I still learned something important. Actually, I learned many things from this book. But the book itself is like a window through which, if you care to look, you can sense something called experience.

Here's a guy who at the time of writing (2002) had been a software practitioner for 45 years. He has seen things, tried things, evaluated things, researched things and written a ton of code. He does not use the word "fact" because he is pompous. When he says "fact," he implies anecdotal evidence and research papers (that he has actually read and understood).

Where was I going with this? Oh, yes. The facts and fallacies are worth remembering. And that's about it. This stuff is good to know, because, it just is. Okay?

Now I'm reading Clean Code (Robert C. Martin) and I have only just made it past the introduction, but I already sense something. I get this weird feeling. You'd think that any reasonably experienced programmer knows what clean code is, knows it when he sees it, knows how to write it and knows how to change bad code into clean code.

Do you know what clean code is? Can you tell me? If you think "yes" to these questions, then try to imagine yourself in the classical elevator-pitch situation and tell me, right to my face, what clean code is. Do you still know what clean code is? Can you even point to an example that you are certain is clean code?

I have just made it through the introduction and just realized that I can't do any of those things. And what's more: clean code is more important than I thought before I started on this book.

I can see my horizon expanding but I have no idea how far it will go - that's the feeling I have about that book, right now.

I wonder what I'll read next :smile:

Offline Issue Tracking

I don't code offline that often, but I do sometimes and I really like the ability to do it.

Git, my favorite DVCS, is a huge help in this regard. I can commit, branch, merge, tag and practically do everything offline, except push, pull, fetch and clone.

With Git, I consider my offline SCM problem solved! When you come from a CVS/Subversion background, this is such a big deal that you might be tempted to consider the whole offline problem solved.

But after a little while, you realize that this is not the case.

A Maslow's Hierarchy of Needs for Programmers might look roughly something like this:

  1. Text editor, interpreter/compiler
  2. Coffee & Snacks (oh, you know it's this high, don't you?)
  3. Documentation in some form (built-in, API spec, etc.)
  4. SCM
  5. Work tracking (TODO list, bug tracker, Post-its, etc.)


Looks like the next thing we need to get working in offline mode is an issue tracker of sorts. And in going offline, it also needs to somehow deal with the distributed nature of the project.

I like the idea of having this data under version control alongside the source code. That way, a commit that resolves an issue can also mark it as such in the tracker. But things in version control are subject to merge conflicts, so the data needs to be in plain text unless the tracker comes with its own merge-tool, and a good one at that.

I use a simple TODO text file for Fabric, but experience suggests that the difficulty and inconvenience of this idea increases exponentially with the number of people on the project. Or something like that. Lack of features and extremely poor scalability make this approach a bad idea in pretty much any case.

So, what other options are there? There's stuff like git-issues and ticgit - both married to Git, and both seem to keep the issues in a separate branch.

I'm not sure I like those options. I like the hard link between a commit and the closing of an issue, and I don't see why I should keep my issues in a branch of their own. It's one more branch to push and one more branch to pull. Maybe it keeps my history cleaner, and maybe the tool can better control how issues are committed. But I'm still not sold.

I heard someone mention Ditz on IRC yesterday. It keeps the issues in YAML files in a configurable directory which can easily be part of your SCM repository. It looks like a CLI version of Jira but without all the crap and complexity.

So although I have only just started to play with it, my first overall impression is positive. It seems to be pretty much what I want, and I have decided to use it for textjure. Then I guess we'll see how the ship fares.

But a part of me still wonders what other options are out there, and what people might be using.

Textjure

What have I been up to, code-wise, in my spare time? I felt like talking about this because I think it has gotten mildly interesting lately.

It hasn't been Fabric. I started Fabric, not because I thought it would be a fun thing to do, but because I absolutely needed a tool that did exactly what Fabric does. And ever since Fabric got to the point where it more or less completely scratches my deployment itches, my time-investment in it has pretty much retreated to maintenance mode.

No, my spare time is now primarily spent on the Clojure programming language, in some form or another.

When it comes to programming languages, I learn by doing and I think most people feel the same way; you can't learn how to write code if you don't write code. I learned Python by writing a blogging engine (how original) in it using Django, and then Fabric.

Now, I'm learning Clojure. I started out with the Euler problems but ran out of steam; the math problems just weren't my thing. Then I was mostly idling around the clojure.core source code and the Java code, looking for a cool idea to try out but nothing really jumped out at me. I proposed a number of what I thought would be language improvements - with patches'n'all, but each and every one of them was turned down. Oh well.

Then, drifting aimlessly around Github, I stubbed my toe on a tiny little thing that Chris Houser had done. It was a text editor, but hardly even complete enough to be called a prototype.

I downloaded it and fired it up. I was mildly surprised the thing actually started and showed a little window with two parts: an editor panel and a built-in Clojure REPL that actually worked.

The interface wasn't terribly exciting, though. The window was tiny and placed in the top-left corner, and the program could not be started without naming a file to open.

I dug into the code - what little there was of it - and fixed those things. Made the window chrome-less and full-screen, and pivoted the editor/repl-split to a left-right configuration to better utilize my wider-than-tall screen resolution. It actually worked and wasn't terribly difficult.

Then I found out that the thing actually had no functionality in the way of opening and saving files. Also, the key-binding looked messy. I turned to these issues next.

One thing leads to the next, and textjure is now slowly shaping up to be a genuinely useful text editor, though it is still woefully underfeatured.

The code is in my fork of Chris Houser's repository: http://github.com/karmazilla/textjure/tree/master

The feature that I'm currently working on is syntax coloring. After that comes stuff like indenting, incremental search and REPL-history. Then I guess I'll properly announce my work on the Clojure Google Group.

I always knew that I would eventually find something fun to hack on with Clojure, but I would never have imagined it would be a text editor. :smile: