
Notes to self

Whatever I feel like writing

Fabric 0.0.2 Released.


Here's a sorry-the-previous-release-didn't-work release: Fabric 0.0.2.

Hopefully it won't crash and burn as much...

Here's what's new:
  • lots of minor tweaks to the innards to make it a little less hacky (still plenty to do in that regard).
  • added a new download() operation; like put, but the other way around.
  • local() and local_per_host() operations now halt fabric if whatever they happen to be executing terminates with an exit_code != 0.
  • load() should now print a nicer message if no fabfile is found.
Apart from that, I've done work on a new and improved (!) shell command, but it's not ready for this version (maybe next). I also worked a little on unit testing this baby, but I haven't got much to show for that, either.

Real World Haskell's in public beta.


This means that the book site actually has some content now. Or supposedly, at least - I haven't seen it myself because the server is completely floored with load, but when it gets back on its feet, you can check it out here: http://book.realworldhaskell.org/.

The jQuery JavaScript Implementation Quality Metric


The jQJSIQM is, besides a horrible acronym, something that I've invented in honor of the just recently released jQuery 1.2.2 library.

It is a metric for the quality of a JavaScript implementation, and works on three assumptions:
  • jQuery is a sizable and representative body of cross-browser JavaScript.
  • Whenever something will trip up or surprise a JavaScript developer - a quirk of a particular implementation - it is documented in the jQuery source.
  • These quirks are bad, and therefore, the fewer the better.


In short: the fewer times a particular browser's name is mentioned in the jQuery source code, the better its JavaScript implementation is expected to be.

Bold claim? Perhaps. Like all metrics, it should be taken with a grain of salt, and two grains of pepper.

Onwards! To the Numbers!

First contender is my favorite browser of them all: Opera. Let's see how it fares:
$ grep "Opera" jquery-1.2.2.js -c
7

Seven remarks. Well, eh. I suppose it could be worse. Let's keep an open mind for now, and remember that the upcoming version 9.5 has a completely revamped JavaScript engine that is rumored to be quite good.

Looking at market share, I'd say that Safari is probably Opera's closest contender. Safari also had its JavaScript engine revised in its version 3. Let's see if the boat floats.
$ grep "Safari" jquery-1.2.2.js -c
6

Whoops! Looks like Safari is one-upping Opera on this test. Let's just hope that the Mac fanboys at work don't get hold of this data - it'd wreak havoc on my browser pride.

Scaling up on market share yet again, I think the Gecko browsers come next. Mozilla and Firefox both have the same SpiderMonkey JavaScript engine, so I'll consider them as one.
$ grep "Mozilla\|Firefox" jquery-1.2.2.js -c
8

Considering that Mozilla, and Firefox in particular, have several truckloads more market share than Opera and Safari combined, I'd say that this result is quite encouraging.

Last on our list is the king of market share: Internet Explorer. It's been lambasted for almost a decade as the number one perpetrator in strangling innovation and holding the Internet back, but do these harsh claims stand a beatin'?
$ grep "IE" jquery-1.2.2.js -c
40

Whoops. I suppose market share isn't the only area where IE is in the lead; it's pretty far ahead of the other browsers in quirk-count as well.

Since Opera, Safari and Mozilla/Firefox are so close to each other in this test, I don't think I'll declare a winner - if one of these browsers is your favorite, then feel free to think of it as a winner.
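For those without grep at hand, the counting can be sketched in Python. This is only a rough equivalent of "grep -c", assuming the same patterns as above; the sample text is a hypothetical stand-in for jquery-1.2.2.js so the snippet runs on its own, and count_matching_lines is a name made up for the occasion.

```python
import re

def count_matching_lines(text, pattern):
    """Count the lines in text that match the regex pattern (like grep -c)."""
    regex = re.compile(pattern)
    return sum(1 for line in text.splitlines() if regex.search(line))

# Hypothetical stand-in for the jQuery source; swap in the real file contents
# with open("jquery-1.2.2.js").read() to reproduce the numbers in the post.
sample = "\n".join([
    "// Handle the case where IE and Opera return items",
    "// A made-up comment that mentions Safari",
    "var x = 1;",
    "// Nullify elem to prevent memory leaks in IE",
])

# Same patterns as the grep commands; Mozilla and Firefox count as one.
browsers = {
    "Opera": "Opera",
    "Safari": "Safari",
    "Mozilla/Firefox": "Mozilla|Firefox",
    "IE": "IE",
}

for name, pattern in browsers.items():
    print(name, count_matching_lines(sample, pattern))
```

Like grep -c, this counts matching lines rather than individual mentions, so a line that names a browser twice still counts once.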

So now that we don't have any outstanding winners, it's time to ridicule the loser... by changing the "-c" to a "-n" in the grep command. Notice there's a couple of memory leaks mentioned in the output as well:
$ grep "IE" jquery-1.2.2.js -n|perl -p -e "s/\\s+/ /"
65: // Handle the case where IE and Opera return items
298: // IE copies events bound via attachEvent when
303: // attributes in IE that are actually only stored 
317: // removeData doesn't work here, IE removes it from the original as well
318: // this is primarily for IE but the data expando shouldn't be copied over in any browser
697: // IE has trouble directly removing the expando
823: // We need to handle opacity special in IE
924: // !context.createElement fails in IE with an error but returns typeof 'object'
968: // IE can't serialize <link> and <script> tags normally
981: // Remove IE's autoinserted <tbody> from table fragments
997: // IE completely kills leading whitespace when innerHTML is used
1047: // IE elem.getAttribute passes even for style
1051: // We can't allow the type property to be changed (since it causes problems in IE)
1055: // convert the value to a string (all browsers do this but IE) see #1070
1066: // IE actually uses filters for opacity
1069: // IE has trouble with opacity if it does not have layout
1120: // We have to loop this way because IE & Opera overwrite the length
1124: // (IE returns comment nodes in a '*' query)
1566: // to avoid selecting by the name attribute in IE
1580: // Handle IE7 being really dumb about <object>s
1808: // For whatever reason, IE has trouble passing the window object
1852: // event in IE.
1889: // Nullify elem to prevent memory leaks in IE
2015: // prevent IE from throwing an error for some hidden elements
2060: // Clean up added properties in IE to prevent memory leak
2080: // otherwise set the returnValue property of the original event to false (IE)
2087: // otherwise set the cancelBubble property of the original event to true (IE)
2293: // If IE is used and is not in a frame
2298: // If IE is used, use the trick by Diego Perini
2299: // http://javascript.nwbox.com/IEContentLoaded/
2365:// Prevent memory leaks in IE
2417: // to avoid any 'Permission Denied' errors in IE
2595: // IE likes to send both get and post data, prevent this
2637: // implement the XMLHttpRequest in IE7, so we use the ActiveXObject when it is available
2795: // IE error sometimes returns 1223 when it should be 204 so treat it as success, see #1450
3307: // IE adds the HTML element's border, by default it is medium which is 2px
3308: // IE 6 and 7 quirks mode the border width is overwritable by the following css html { border: 0; }
3309: // IE 7 standards mode, the border is always 2px
3311: // However, in IE6 and 7 quirks mode the clientLeft and clientTop properties are not updated when overwriting it via CSS
3312: // Therefore this method will be off by 2px in IE while in quirksmode

Fabric 0.0.1 Released!


Last Sunday, after little more than two weeks of work, Fabric version 0.0.1 was released. The main goal of this very first release was to just get something out the door, and I must admit that some pretty horrible bugs slipped through the cracks.

Fabric had most of its core functionality in place after just three days. Then came a period of polishing and bug squashing, and finally a lot of work on the help system and the distribution tool-chain these past four days.

I'll be celebrating this first release by transitioning Fabric from Pre-Alpha to Alpha. Also, I'd like to state my intent on the release cycle: if anything at all has happened on Fabric since the last release, then a new version gets pushed out the door every two weeks. The big idea is that if I can just maintain this rhythm, then the project is bound to move forward and not die out, regardless of how fast or slow the work progresses.

I'd like to make clear what version-numbering scheme I intend to use:
Before 1.0:
0.0.1 = First release
0.-.X = Backwards compatible update
0.X.0 = Backwards incompatible update

After 1.0:
-.-.X = Update without interface changes
-.X.0 = Update with backwards compatible interface additions
X.0.0 = Backwards incompatible update
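To make the scheme concrete, here is a minimal sketch of it as a Python function. The function name and the three change labels ('patch', 'compatible', 'incompatible') are my own hypothetical shorthand for the update kinds listed above, not anything Fabric ships.

```python
def next_version(version, change):
    """Return the next version string under the numbering scheme above.

    change is one of 'patch', 'compatible' or 'incompatible'
    (hypothetical labels for the update kinds in the scheme).
    """
    if change not in ("patch", "compatible", "incompatible"):
        raise ValueError("unknown kind of change: %r" % (change,))
    major, minor, patch = (int(part) for part in version.split("."))
    if major == 0:
        # Before 1.0: 0.X.0 for incompatible updates, 0.-.X for everything else.
        if change == "incompatible":
            return "0.%d.0" % (minor + 1)
        return "0.%d.%d" % (minor, patch + 1)
    # After 1.0: X.0.0, -.X.0 and -.-.X respectively.
    if change == "incompatible":
        return "%d.0.0" % (major + 1)
    if change == "compatible":
        return "%d.%d.0" % (major, minor + 1)
    return "%d.%d.%d" % (major, minor, patch + 1)

print(next_version("0.0.1", "compatible"))
print(next_version("1.2.3", "incompatible"))
```

The scheme doesn't say how the jump from 0.x to 1.0 is decided, so the sketch leaves that step out.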

And that's it for now. See you in two weeks, if not sooner.

Monads in Python


This is noteworthy: Peter Thatcher has implemented Monads in pure Python and gives examples of how to use them here: http://www.valuedlessons.com/2008/01/monads-in-python-with-nice-syntax.html

I don't quite get what's going on in his code - it needs a more thorough reading - but it sure looks interesting.

ANN: Fabric - Simple pythonic remote deployment tool.


Deployment woes.

I code in Java at work, and all of my projects can be divided into two camps:
  1. Those that build to .jar files and deploy to our maven2 repo.
  2. And those that build to .war files and are deployed to some application server.


Once configured, maven itself is pretty good at making case #1 run smoothly - "mvn deploy" is all it really takes. But do I have anything similar for case #2?

Well, not quite. I tried banging something together with Capistrano, but that didn't work. For one thing, Capistrano expects you to be deploying a Rails application, and these are typically all deployed in the same way, which is very different from deploying a .war file. Secondly, I had a build/compile step in my process that I needed to do locally (because compiling on the server is troublesome and a bad idea). This meant that I had to use the put function to upload my .war file. Put worked nicely for small text files (i.e. my project.properties file), but failed miserably when the payload was a nearly 20 MB .war file.
Googling and tweaking the flags on the File object didn't work. It simply refused to upload that damn .war file.

Itch-scratchers mentality.

I suppose I could have written a shell script and leveraged the existing scp and ssh tool chain. And I did throw a pebble down this route to see how it felt, but eventually rejected the idea. The reason is that I'm deploying to multiple hosts simultaneously. I tried looking at clusterssh to see if it would help me in this regard, but it turned out to be odd, clunky and basically not designed for such a task. I also took a look at Dancer's Shell, dsh, which was a lot closer to home, but my username contains a backslash and dsh meticulously stripped it out prior to logging in, foiling any and all of my attempts at tricking it into taking it at face value.

What was I supposed to do? I had pretty much given up on automating the process and started to accept defeat, when I happened upon the paramiko module for python. Paramiko is a pure python implementation of the SSH 2 protocol, and lets you log into servers, execute commands and upload files. It gave me the idea to write my own pythonic version of a Capistrano-like tool, and then use that for deployment. It could be made open source, and it would be a perfect opportunity to try out Git on a real project.

Birth of Fabric.

So I swiftly went to work. In two days, I had written the first prototype. It supported the most basic operations such as running remote shell commands, sudo'ing and uploading files. Then I registered my project on nongnu.org (chosen because of their Git support). And ba-da-bim, Fabric became an open source project: https://savannah.nongnu.org/projects/fab/

Fabric looks, on the surface at least (or for those who've only spent a short while with either), a lot like Capistrano. You have a fabfile (as opposed to a capfile) in your project directory, and that file describes all of your deployment tasks (or commands, in Fabric speak).

Commands are really just regular python functions (and you're allowed to call the file fabfile.py if you like) that simply make calls to some other, more magical, functions called operations. Operations are magical because they just sort of exist; you don't import them from a module and don't find them in an object - you just call them.
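If the "they just sort of exist" part sounds like black magic, here is one way such behavior could be achieved in plain Python. This is a sketch under my own assumptions, not a description of Fabric's actual internals: load_fabfile and the stand-in run() below are hypothetical names for illustration.

```python
def run(command):
    """Hypothetical stand-in for an operation; a real one would talk SSH."""
    return "run: " + command

def load_fabfile(source, operations):
    """Execute fabfile source with the operations pre-seeded as globals."""
    namespace = dict(operations)
    exec(source, namespace)
    # Every public callable that the fabfile defined itself is a command.
    return {
        name: value
        for name, value in namespace.items()
        if callable(value)
        and name not in operations
        and not name.startswith("_")
    }

# A tiny fabfile: note that it never imports run(), it just calls it.
fabfile_source = '''
def uptime():
    "Show how long the host has been up."
    return run("uptime")
'''

commands = load_fabfile(fabfile_source, {"run": run})
print(commands["uptime"]())
```

The trick is simply that the file is executed in a namespace that already contains the operations, so the functions defined in it can see them as globals.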

So, to get the feet wet, here's a hello-world'ish simple fabfile:
set(
    fab_user = 'joe.shmoe',
    fab_mode = 'rolling', # run stuff on one host at a time
    fab_hosts = ['node1.servers.com', 'node2.servers.com'],
)

def deploy():
    "Build and deploy a war file to our app. servers."
    local("mvn clean package")
    put("target/myapp.war", "myapp.war")
    run("mkdir /rollback/$(fab_timestamp)")
    run("cp /$(fab_host)/deploy/myapp.war /rollback/$(fab_timestamp)/myapp.war")
    sudo("cp myapp.war /$(fab_host)/deploy/myapp.war")
    sudo("$(fab_host) restart")


Then, all it takes to deploy is a "fab deploy" in the directory where the above file is found. It's still pretty basic, low-level and imperative, but it's a good start.

Fabric also has a built-in help system that is powered by your doc-strings; try typing "fab help:deploy" for instance. If you want to see what other commands are available to you, then simply type "fab list". You can also get a list of operations (the local(), put(), run(), sudo() kind of things) by typing "fab help:ops" and get more details about an individual operation with, for instance, "fab help:put".
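A doc-string powered help system needs surprisingly little machinery. The following is a minimal sketch of the idea, not Fabric's real implementation: describe() and list_commands() are hypothetical helpers, and rollback() is a made-up second command.

```python
def deploy():
    "Build and deploy a war file to our app. servers."

def rollback():
    "Restore the previously deployed war file."  # hypothetical command

def describe(commands, name):
    """Return the doc-string of a command, roughly what 'fab help:NAME' shows."""
    doc = commands[name].__doc__
    return doc.strip() if doc else "No help available for %s." % name

def list_commands(commands):
    """Return one line per command: its name and the first doc-string line."""
    lines = []
    for name in sorted(commands):
        doc = (commands[name].__doc__ or "").strip()
        summary = doc.splitlines()[0] if doc else ""
        lines.append("%s : %s" % (name, summary))
    return lines

commands = {"deploy": deploy, "rollback": rollback}
print(describe(commands, "deploy"))
print("\n".join(list_commands(commands)))
```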

And that's basically it for today.

What I've learned this year.

Reflection time. I've been in the IT industry for real, for almost a year. What have I learned?

Python
I've learned the Python programming language; it has taught me list comprehensions and higher-order functions such as map, reduce and filter. Python has also taught me what lambda expressions and first-class functions are. Python also has a powerful API for regular expressions, and combined with the very dynamic nature of Python, I learned to do some very neat stuff - like writing a wiki-markup parser, with table support, in 50 lines of code.

Django
I learned Django and wrote a blogging application in it - blogging applications are the new Hello World. Django taught me what a templating engine is like to work with when it's done right, and Django taught me to love beautiful URLs. Django also introduced me to the meta-class feature of Python, and I wrote some really crazy (and very ugly) code using it - so I learned that meta-classes should only be used when they really make sense (like in Django's ORM).

Haskell
I didn't manage to learn Haskell, but I did study it and poke around. I learned the roots of the higher-order functions that I used in my Python code. I also learned about the underlying foldl and foldr functions that are powering many of these functions, and I learned that we have functional patterns, just as we have object-oriented patterns. Haskell also taught me the importance of managing side-effects. I also learned about currying, pattern matching and the coolness of Hindley-Milner type inference.
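To show what I mean by folds powering the other higher-order functions, here is a small Python sketch. The names foldl, foldr, map_via_foldr and filter_via_foldr are my own; Python's closest built-in is functools.reduce, which behaves like a left fold.

```python
from functools import reduce

def foldl(func, initial, items):
    """Left fold: func(func(func(initial, a), b), c)."""
    return reduce(func, items, initial)

def foldr(func, initial, items):
    """Right fold: func(a, func(b, func(c, initial)))."""
    result = initial
    for item in reversed(list(items)):
        result = func(item, result)
    return result

def map_via_foldr(func, items):
    """map expressed as a right fold that rebuilds the list front to back."""
    return foldr(lambda item, acc: [func(item)] + acc, [], items)

def filter_via_foldr(predicate, items):
    """filter expressed as a right fold that keeps only matching items."""
    return foldr(
        lambda item, acc: [item] + acc if predicate(item) else acc, [], items
    )

print(map_via_foldr(lambda x: x * 2, [1, 2, 3]))
print(filter_via_foldr(lambda x: x % 2 == 0, [1, 2, 3, 4]))
```

Unlike Haskell's lazy foldr, this version walks the whole list eagerly, so it is an illustration of the pattern rather than a faithful port.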

Spring Framework
Many of the Java projects I work on are using Spring in one way or another. Spring taught me the usefulness of dependency injection and gave me the power to introduce that loose coupling that I've always struggled to achieve in my Java projects.

AspectJ
Caching, logging and access-checks - these are all cross-cutting concerns and primary use cases for aspect-oriented programming. AspectJ gave me the power to factor these things into their own units and I was happy to learn how clean your code can get when you don't have to think about these things.

Maven2
My relationship with Maven is bitter-sweet; it is a blessing when it works, and a curse when it doesn't. But the fact of the matter is that I wouldn't start a Java project without it - a dependency-managing build tool is something every language should have.

Web Services
I've been a key player in a move at my workplace towards more web services in our systems, and a web-service-powered single sign-on solution. I've learned about WSDLs, XSD Schemas, WS-Security and WS-I Basic Profile. Since we're using a lot of Java, I've also learned a lot about a web service framework called CXF, and I have written components for it, and extended it in several ways to make it satisfy our needs.

Common Lisp
Haven't spent that much time with it, but I learned that macros are pretty cool. I also learned that networking libraries sadly aren't part of the CL standard, so they are unfortunately implementation-specific. I also learned that the Emacs/SLIME combo completely redeems the otherwise liberal use of parentheses you see in Lisp code.

Ruby
I only started playing with it this month, but blocks are pretty neat. Plus, I think I can see why people might like TextMate. Oh, and Rubinius has colored stack-traces (and a cute micro-vm architecture).

Bazaar
This SCM tool showed me the wonders of offline commit and perfect-rename features. Sadly, the tool support isn't up to par with Subversion.

Capistrano
This is the very reason I started learning Ruby in the first place. Capistrano taught me why you should also spend time on the deployment tooling in a project, and not just on building the application itself. I haven't quite gotten there myself, but I'm looking forward to being able to build, test, deploy and configure an application for multiple servers, in multiple environments, with just a single command in a console.

SQL
You wouldn't believe what I've been through with regards to SQL. I inherited a reporting application which consists of about 60% pure, hardcore SQL. I hardly knew any SQL when I started in January, and now I think it's safe to say that I'm an SQL super-user. Joins, subselects, optimization, schema-design and DDL - all tools in the belt.

So that's about it. Off the top of my head, these are the most noteworthy things I've learned this year. You can't be disappointed in that, and if I keep up this pace, 2008 is going to be an interesting year.

Perfect Type System


Whenever you venture into the realm of language design, you will eventually touch on type systems. Type systems are an inherent part of any language, and many languages are famous specifically for their type systems - Haskell and Ruby come to mind.

Boiled down, this means that if you design a language, you will also find yourself designing a type system.

This brings me to an old quote:

A design is perfect not when there is nothing more to add,
but when there is nothing more to take away.



How does this relate to type systems? A perfect type system by this definition would be more than minimal - almost non-existent.

Take assembler: Everything is bits and bytes, and bytes can be grouped in words whose size depends on the target processor architecture.

This might be minimal but it is hardly practical; the whole purpose of a programming language is to provide abstractions, and that can't really be said about assembler.

Assembler has primitives, numbers of varying bit-width, but in order to provide higher levels of abstraction, another meta-type is needed: aggregates - a collection or grouping of primitives. Without both aggregates and primitives, it'll be pretty hard to define a practical and proper high-level language.

Now, the challenge is to find the minimum set of members for these two meta-types. First up is primitives, and I propose that we can do with just a single member and some syntactic sugar.

Consider how the notion of a number can be the only primitive in a language; characters are really just a special case of numbers, and integrals and reals can be unified into a single type. This could be simulated with coercion, as it is in many languages, or by not considering integrals and reals to be two separate types at all.

With a sufficiently high-level language, we can also unify finite precision and infinite precision numbers into a single type. The Strongtalk implementation of the Smalltalk language is able to automatically detect overflow in finite precision numeric types, and convert them on the fly to their infinite counterparts.

If we're willing to defer performance considerations to an implementation detail, we could simplify things even further by making all numbers infinite precision decimals. Characters would be nothing more than a unicode ordinal and a bit of syntactic sugar.
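Python already gets part of the way there, which makes for a handy illustration. In the sketch below, Python's built-in int silently grows past the machine word size, fractions.Fraction stands in for an exact "real" (it only covers rationals, so it is an approximation of the idea, not the single Number type itself), and characters are mapped to and from their unicode ordinals. The helper names are hypothetical.

```python
from fractions import Fraction

def char_to_number(char):
    """A character is just its unicode ordinal."""
    return ord(char)

def number_to_char(number):
    """The syntactic sugar in reverse: an ordinal shown as a character."""
    return chr(number)

# Integers never overflow; they simply grow beyond the 64-bit word size.
big = 2 ** 64 + 1

# Exact rational arithmetic, with none of the usual 0.1 + 0.2 rounding trouble.
exact = Fraction(1, 10) + Fraction(2, 10)

# Integrals and exact "reals" mix freely in one expression.
mixed = 3 + Fraction(1, 2)

print(big, exact, mixed, char_to_number("A"), number_to_char(955))
```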

This way, we'd end up with a single primitive type: the Number.

So what of the aggregates? How many members of this meta-type are required for a proper type system? Once again, I propose that the answer is just one; the Function.

Let's contemplate that idea a little, and see how it will compare to the type system of an existing functional language, such as Haskell.

In Haskell, I would count functions, lists, tuples and `data` types to be among the aggregates. The challenge for my alleged functions-as-only-aggregate is now to figure out how to represent each of these meta-types as functions.

Functions themselves are a no-brainer in this regard, so let us jump straight to lists. A list is an ordered sequence of values, and these values are most often iterated over and re-arranged when worked with in programs. For the purpose of iteration, a special kind of function exists in the Python language called a generator: a function that can return multiple times during a single invocation - a poor man's continuation, if you will.

So using continuations, it becomes quite easy to create ordered sequences of values - every time you need the next value in an iteration, you just 'continue' to the next return value of the continuation. Then add some syntactic sugar so the complexity of continuations doesn't bleed into code where it is not needed, and we have a very powerful list representation.
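As a rough illustration of functions standing in for lists, here is what that looks like with Python generators. naturals(), squares() and take() are hypothetical names of my own; the for-loop and islice are the "syntactic sugar" that hides the resume-and-continue mechanics.

```python
from itertools import islice

def naturals():
    """An endless ordered sequence: each resume yields the next value."""
    n = 0
    while True:
        yield n
        n += 1

def squares(sequence):
    """Re-arranging a sequence is just another generator wrapped around it."""
    for value in sequence:
        yield value * value

def take(count, sequence):
    """Materialize the first count values of a (possibly endless) sequence."""
    return list(islice(sequence, count))

print(take(5, squares(naturals())))
```

Nothing is computed until a value is asked for, which is why an endless sequence like naturals() is perfectly usable here.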

If we make our type system dynamic (non-static, to be precise), then tuples would be in the same exact ballpark as lists. I don't think we even need to distinguish between tuples and lists at all, if our type system is dynamic, and I'm not sure it's needed in a static type system either - Java, for instance, seems to be doing just fine without tuples.

Lists were arguably the most important meta-type to get straight, but object or struct-like types, types with named attributes or parts, are extremely useful as well - many languages allow for some interesting magic by facilitating introspection of the names and values of parts of these types.

Let's return to the generators in Python. These functions are able to stop in the middle of their execution to return values. Then they wait to resume execution when the next value is requested - their complete state is preserved while they wait, and they resume execution from the exact point they left off.

Now, Python is a procedural (and object-oriented) language and as such has the ability to assign values to variables. Then imagine a continuation that is waiting to be resumed, and that we were able to access the variables inside the scope of the continuation.
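As it happens, CPython lets you peek at exactly that: a suspended generator exposes its frame as gi_frame, and the frame's f_locals holds the named variables. The sketch below leans on that CPython-specific detail; account() and attributes() are hypothetical names for illustration.

```python
def account(owner, balance):
    """A generator whose local variables play the role of named attributes."""
    while True:
        # Execution pauses here; owner and balance stay alive in the frame.
        deposit = yield balance
        if deposit is not None:
            balance += deposit

def attributes(generator):
    """Snapshot the local variables of a suspended generator (CPython only)."""
    return dict(generator.gi_frame.f_locals)

acct = account("joe.shmoe", 100)
next(acct)                  # run up to the first yield
before = attributes(acct)
acct.send(50)               # resume with a value, pausing at the next yield
after = attributes(acct)

print(before["balance"], after["balance"])
```

Once a generator has finished, its gi_frame is None, so the "attributes" only exist while the function is waiting to be resumed.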

This solution is arguably less elegant than the list solution, and also raises a number of questions; specifically, in a functional language, functions tend to not really have any state at all, much less a state where the values have names.

Functional languages like Lisp and Haskell are declarative. This makes stateful functions obtuse and very hard, if not impossible, to create. Haskell has monads for representing state, but they are not quite the same thing as a function, and as such would be a separate member of the aggregate meta-type, distinct from the functions.

We could instead allow for assigning aliases to the expressions that make up a function, and the values that these expressions generate would be accessible through these aliases when the continuation halts.

This still wouldn't be entirely elegant, and there's also the question of what happens when the continuation is continued - I don't see an obviously correct answer to this question, so a language would be forced to define a convention in this regard, and such a convention will certainly surprise and bite a few people, as it is destined to be misunderstood or assumed to have a different behavior.

Regardless, I think it is doable, albeit not pretty. So a language would probably have to take that fact into account, and either provide some heavy syntactic sugar to make this bearable to work with, or be designed in such a way that you will need these kinds of types less, or both.

At the bottom of it all, I think it is possible to design a programming language around a minimalist type system like this. It would probably be the kind of language that is easy to learn but difficult to master - a very interesting language, to be sure.

Would such a type system be perfect? I'm not sure, 'perfect' is tainted with subjectivism, but I do think that it would be rather elegant if we disregard the way that we implemented struct-like types - I suppose you can't make a language without compromise.

Changing the Eclipse workspace dir.

New back-up policies mean that I had to change my eclipse workspace directory today.

Anybody who has tried this has probably given up or will testify that it's a pain to figure out (try googling for it and you'll know). It's not that making the actual change is hard, but rather knowing where and what.

Well, I did a grep in my eclipse install dir, searching for my previous (because I had already moved it) workspace dir, and it turns out that this little piece of info is kept in a file at this path:

ECLIPSE_INSTALL_DIR/configuration/.settings/org.eclipse.ui.ide.prefs


In that file there's a property called RECENT_WORKSPACES - that's the sweet spot you're looking for.
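If you'd rather inspect the value than eyeball the file, a few lines of Python will do. This is a minimal sketch that assumes the file is made up of plain key=value lines; it does not handle the escape sequences a real Java properties file may contain, and the sample contents below (including OTHER_SETTING) are hypothetical.

```python
import os

def prefs_path(eclipse_install_dir):
    """Build the path to the prefs file mentioned above."""
    return os.path.join(
        eclipse_install_dir, "configuration", ".settings",
        "org.eclipse.ui.ide.prefs",
    )

def read_prefs(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    prefs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        prefs[key.strip()] = value.strip()
    return prefs

# Hypothetical file contents, so the sketch runs without an Eclipse install.
sample = "#a comment line\nRECENT_WORKSPACES=/home/me/old-workspace\nOTHER_SETTING=5\n"

print(read_prefs(sample)["RECENT_WORKSPACES"])
```

To try it on a real install, read the file at prefs_path("ECLIPSE_INSTALL_DIR") with open(...).read() and pass the text to read_prefs().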

Return of the PermGen


This appears to be the third installment in the saga of my PermGen problems.

For good measure, here are the previous posts:


The newsflash this time is that I stumbled upon a blog post by some guy named Thompson about this very problem. By itself it is nothing new; it concludes by recommending a JVM with infinite perm-space (BEA or IBM).

The interesting part, however, is stoney's comment where he links to two posts on a blog by Frank Kieviet.

The first of Frank's posts is concerned with what the PermGen error is really all about, namely how these leaks in the reloading classloaders in our application servers happen. His second post is a bit more involved and shows how you can use new profiling tools in Java 6 to track down classloader leaks.

So the good news is that we now have the knowledge and tools required to kill this bug.
The bad news is that we've also learned that walking the walk will probably be a lot tougher than talking the talk.

We have to scrutinize our dependency tree, and think really, really, REALLY hard before we put a jar in our server's endorsed directory.

In fact, I think I'll try an experiment tomorrow (today, actually, after I've slept) where I strip down a Tomcat instance (JBoss is probably too intertwined with itself to really strip down) and create some web applications whose only runtime dependencies are the servlet-api and the JRE!

I should really think that classloader leaks are really hard to create with something as simple as the servlet-api, and if you use your JRE with care, then it should damn well be possible to avoid classloader leaks with that dependency as well.

Man, I'm really excited about the idea that there's actually hope for us lay-coders to actually rid ourselves of this nasty ghost-bug-thing/issue.

I'll be sure to report back when I've done some experiments with this new knowledge!