Skip navigation.

Software Development

Correcting The Future

Parallel Dataflow QuickSort

I've been asked in the past how I would do a parallel quicksort in Project V by a few of my readers. Now that Project V is nearing completion as far as the construction parts are concerned, I'm in a better position to show how this would work. I'm going to show two diagrams that represent quicksort. The second diagram is one of the components used by quicksort. It's what splits the input into two parts by using a pivot.

Note that this quicksort network is not optimized in any way. There's no stream control at all meaning that certain operations that should work on an entire stream, or substream, is instead acted upon on each and every item. This is why there are generate components to duplicate results for use with each item. This makes some parts of the network much longer than it should be (like checking if a stream is longer than 1 item). So these diagrams use all primitive components except for Select Pivot. You can just pick the first one or something for that component. In this case, you don't even need the Select Pivot component. You can short circuit that component by linking its input directly to its output. While the entire stream is passed along, only the first item will be used by the following component.

The images used here is NOT Project V. I will use different graphics with lots more functionality for manipulating and creating components.





So the top part where it checks the length of the stream decides if there's anything to do. If the stream only has one item (or none where the end of stream marker is the only data), then the stream is sent directly to the output of quicksort. The bottom part of quicksort should be easy to follow. As for the second diagram Split Stream, all it does is split the stream according to the pivot. Value less than or equal to the pivot are sent to Out1 and the rest on Out2. This diagram is actually the proper size. I may decide to not use Dup components and just enable multiple connections from outputs. I haven't made up my mind yet.

Note that there is nothing in these diagrams about parallelism. It's naturally parallel. Each component can execute as soon as it receives all its input. One drawback in this version is the count component. It needs the entire stream before giving an answer. That will stall everything for that stream. There's no need to count every single item just to find out if there's more than one item. So that part can easily be optimized. But I just wanted to show something that does work and is parallel even with the noted drawbacks. If I use arrays instead of streams, then count gives a result immediately. But this requires more tweaking where I need to access individual items rather than the entire stream. The version above has other advantages too. It will start producing output BEFORE the entire input is sorted. This means if there are other components that use the output of quicksort, it need not wait until quicksort is completely done. The data can be pipelined and even more parallelism is achieved.

For those who think this is verbose, it actually isn't. Languages that have terse representations have a LOT of built in functionality. I use only very basic components here. I could do to other components what I did for Split Stream. That way, there'd only be 5 components. There'd be one to check if the stream is longer than 1 item. Then there's Split Stream, the two internal quicksorts and the final component would be the sequencer to put the lists back together. But I didn't want to do that. I wanted to show what was going on down to the most basic operations.

Oh, in case you're wondering, the FCFS component is First Come First Serve. It takes items as they arrive on any of its inputs and sends them out to its output. In our diagram, only one input will ever get used (ie. never both at once), but it serves its purpose. I just wanted to have whatever input that does have data to be sent through.

Note too that all these components will have generic versions. So you can create any data structure you like and define a custom component for GreaterThan. You may also define one that isn't global and tell QuickSort to use that for the comparison. It'll then go and replace the internal GreaterThan component. This is what I mean by using editors. That's a feature I hope to display soon.

I'm thinking of adding some more components that combine comparisons with mux and demux. That would eliminate a LOT of needless components like dup. Yeah, that would convert three components into one. Well, I might update these diagrams later. I'm going to leave them as is for now.

There's also another way to create duplicates of components, but that's an advanced feature. Both methods are implemented in the exact same way. Some may see this as recursion, but it isn’t. The internal quicksort components and the container are exactly the same component. Instead of feeding data into the main one from inside itself, you just use internal components of itself. At runtime, it may duplicate the component so that it can run in parallel, but it'll do this with any component that has multiple streams going into it. However, there is no problem in thinking of it as recursive. Even if that's not what is happening, I don't see anything that would make this view be a problem.

I'd also like to point out that if one is using primitive components, then it should be expected that it'll look busy with components. That's not what I'm writing this for. I'll have editors that will produce low level networks for you. While this possibly has a slower setup time for building the initial set of components, you won't need to ever build them again. And applications can be used as components, so everything that is created is a library. Over time, I hope that more and more components will be available until the time where very little custom component creation will happen.

Anyways, I hope this acted as a little preview of things to come. I'll be working on the runtime engine soon, so stay tuned. I'm finishing up some runtime versions of the primitive components right now.

SetsUpdate on Parallel QuickSort

Comments

avatar
Acutally, I've changed my mind. It is recursive. The execution is not recursive though. Instead, it's the internal data structure that grows and shrinks accordingly (and in parallel).

By Vorlath, # 26. October 2007, 03:59:25

Write a comment

Comment
(BBcode and HTML is turned off for anonymous user comments.)

Please type this security code : 367777

Smilies

November 2008
S M T W T F S
October 2008December 2008
1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29