Fun with GNU Parallel
Tuesday, January 18, 2011 11:50:00 PM
Recently I have been playing a lot with GNU Parallel, which is kind of like xargs on steroids. I really think this is one of the coolest utilities I have used in a long time, so on the off chance anyone regularly reads this blog (and isn't already aware of it) I thought I'd give it a bit of free advertising.
However, it is late and I should really get to bed, so rather than give you a lengthy explanation of how it works and why it is so great I thought I'd give you a very quick taste, showing off just a very small amount of its power!
It probably helps to provide an example Opera users can relate to, so let's take the recent blog post I did on the Opera Desktop team blog about Snapshot 11.00-1170. In it, my fairly short change log included:
Originally posted by ruario:
Some skin fixes and tweaks
Now some of the changes are obvious, but some people may have been left thinking, "What exactly does that mean?" or "How can I get more details of the specifics?" Well, if you were one of those people, you have come to the right place!

A neat tool for comparing files on Unix-like OSes is diff, but the problem with diff is that it doesn't do much with binary files other than tell you they are different. Given the skin files are zipped and hence binary, you hit exactly this problem. Sure, you can unpack the zips first, but it starts to become a hassle and fairly time-consuming. So what to do? Harness the power of GNU Parallel, of course!
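To see the limitation described above for yourself, here is a minimal sketch. The two files are made up for the demo (they are not real skin files), and the exact wording of the message depends on your diff implementation; GNU diff reports that the binary files differ and nothing more.

```shell
# Build two small files that differ only after a NUL byte, which is
# enough for diff to treat them as binary (much like a zipped skin file).
workdir=$(mktemp -d)
printf 'skin\0v1' > "$workdir/old.bin"
printf 'skin\0v2' > "$workdir/new.bin"

# diff exits with status 1 when the files differ, so "|| true" keeps a
# shell running under "set -e" from stopping here.
binary_report=$(diff "$workdir/old.bin" "$workdir/new.bin" || true)

# Only a one-line "these differ" notice, with no detail about what changed.
echo "$binary_report"

# Tidy up the scratch directory.
rm -rf "$workdir"
```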
Assuming you saved copies of opera-11.01-1164.i386.linux.tar.xz and opera-11.01-1170.i386.linux.tar.xz from the last couple of snapshots into the same directory, the following command unpacks both tar packages and every compressed file inside them, creating an appropriately named subdirectory for each uncompressed component:
parallel -I A --basenamereplace B 'mkdir A_; cd A_; tar xf ../B --strip-components 1; find share | grep -E "\.(ua|zip)$" | parallel "mkdir {}_; cd {}_; unzip -q ../{/}; rm -f ../{/}"; find share -name "*.gz" -exec gunzip {} \;' ::: opera-11.01-1164.i386.linux.tar.xz opera-11.01-1170.i386.linux.tar.xz
Remember I said "all"? I actually uncompressed the man pages and Unite files for good measure!
Note: I am using a beta-level Parallel option here, '--basenamereplace', which means that for this to work you will need GNU Parallel version 20101222 or later. If your distro does not include such a recent version of GNU Parallel in its software repository, you don't need to worry, as you can get packages for a range of distros (or the source code) from the GNU Parallel website.
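If the placeholders in that command look cryptic: in the inner parallel, {} stands for the whole input line and {/} for just its basename. In the outer one, '-I A' and '--basenamereplace B' simply rename those two placeholders so they cannot clash with the inner ones. As a rough sketch of what each expands to, here is the same idea in plain shell parameter expansion. The helper name show_replacements and the sample path are my own illustrations, not part of Parallel.

```shell
# Print what {} and {/} would stand for, given one input path,
# using nothing but POSIX parameter expansion.
show_replacements() {
    input_path=$1
    # {} is the input exactly as it was handed over.
    printf '{}  -> %s\n' "$input_path"
    # {/} is the input with everything up to the last slash removed.
    printf '{/} -> %s\n' "${input_path##*/}"
}

# Hypothetical path of the kind "find share" would emit.
show_replacements share/opera/skin/standard_skin.zip
```

This is why the inner command can 'mkdir {}_' next to the zip, 'cd' into it and then reach the archive as '../{/}'.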
Once the above is done, you can then recursively diff the skin directories to see exactly what changes were made:
diff -r opera-11.01-1164.i386.linux.tar.xz_/share/opera/skin opera-11.01-1170.i386.linux.tar.xz_/share/opera/skin
Pretty cool, huh?
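If you have not used a recursive diff before, here is a small sketch on two throwaway directory trees. The file names and contents are invented for the demo and have nothing to do with the real skin files.

```shell
# Two toy trees: one file differs between them, one is identical.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/old/skin" "$demo_root/new/skin"
printf 'color=grey\n' > "$demo_root/old/skin/skin.ini"
printf 'color=blue\n' > "$demo_root/new/skin/skin.ini"
printf 'unchanged\n'  > "$demo_root/old/skin/readme.txt"
printf 'unchanged\n'  > "$demo_root/new/skin/readme.txt"

# -r walks both trees and shows the changed lines of every differing file;
# identical files are passed over silently. diff exits non-zero when it
# finds differences, hence the "|| true".
full_diff=$(cd "$demo_root" && diff -r old new || true)
echo "$full_diff"

# Adding -q only names the files that differ, which is handy for a
# quick overview before digging into the details.
changed_files=$(cd "$demo_root" && diff -rq old new || true)
echo "$changed_files"

# Tidy up the scratch directory.
rm -rf "$demo_root"
```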

To be honest, the Parallel man page has plenty of much more impressive examples (like running some of your jobs remotely on multiple machines to speed things up, and making better use of available CPU cores), but I have resisted the urge to copy all these great examples out here. If, however, I have whetted your appetite and you want to learn more, I recommend you check out the following two introductory videos that the author himself provides. In fact, I got the links directly from the Parallel man page. (Yeah, a man page with YouTube links to instructional videos. Surely this is a first!)
Part 1: GNU Parallel script processing and execution
Part 2: GNU Parallel script processing and execution
I hope you all enjoy Parallel as much as I do!
P.S. Because I uncompressed every compressed file earlier, you can check all the files that changed throughout the packages, i.e.:
diff -r opera-11.01-1164.i386.linux.tar.xz_ opera-11.01-1170.i386.linux.tar.xz_
This may be handy in the future.

Ruarí Ødegaard (ruario) # Tuesday, January 18, 2011 11:55:26 PM
to: This would cause them all to completely unpack recursively into named directories. Ripe for diffing!
In fact you could remove that off the end entirely, stick a find command on the front and pipe this into the main parallel command and recursively unpack every copy of Opera on your hard disk!
Kyle Baker (kyleabaker) # Wednesday, January 19, 2011 12:41:54 AM
Ruarí Ødegaard (ruario) # Wednesday, January 19, 2011 8:32:42 PM
I do!
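The one-liner that follows leans on a sed expression to pull the build numbers out of each blog page. Here is a minimal sketch of just that step, run against a made-up link rather than the live blog. The helper name extract_build and the example.invalid URL are mine, purely for illustration, and '-r' assumes GNU sed.

```shell
# Read lines on stdin and print only the version-build part of any
# .../unix/<name>_NN.NN-NNNN/... path. Lines without such a path
# print nothing, thanks to -n combined with the trailing p flag.
extract_build() {
    sed -rn "s,.*unix/[a-zA-Z0-9]+_([0-9][0-9]\.[0-9][0-9]-[0-9][0-9][0-9][0-9])/.*,\1,p"
}

# Hypothetical link, shaped like the ones the expression looks for.
sample_link='<a href="http://example.invalid/unix/snapshot_11.01-1170/">download</a>'
printf '%s\n' "$sample_link" | extract_build

# A line with no build number in it produces no output at all.
printf 'no build number here\n' | extract_build
```

In the full command, 'sort -u' then removes the duplicates that turn up when the same build is linked more than once.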
seq 0 10 60 | parallel 'wget "http://my.opera.com/desktopteam/blog/?startidx={}" -qO- | sed -rn "s,.*unix/[a-zA-Z0-9]+_([0-9][0-9]\.[0-9][0-9]-[0-9][0-9][0-9][0-9])/.*,\1,p"' | sort -u | tail -n 50
Ruarí Ødegaard (ruario) # Friday, January 21, 2011 10:32:06 AM
Building and installing parallel from source is pretty easy. Indeed most packages are pretty easy to build and install from source, though it can seem daunting if it is all new to you.
If you have sudo installed and configured you can install Parallel from source as follows:
If you don't have sudo installed or configured:
Files will be installed in the following locations:
If you ever want or need to uninstall Parallel:
With sudo:
Without sudo:
Edit: Updated instructions to GNU Parallel version 20110122.
Ruarí Ødegaard (ruario) # Sunday, January 23, 2011 2:55:08 PM
That has just made my whole week!
http://lists.gnu.org/archive/html/parallel/2011-01/msg00018.html
Thanks to Ole Tange, both for the excellent Parallel and the mention.