Jez Higgins

Freelance software grandad
software created
extended or repaired


Follow me on Twitter
Applications, Libraries, Code
Talks & Presentations

Hire me
Contact

Older posts are available in the archive or through tags.

Feed

Thursday 31 December 2020 The Forest Road Reader, No 2.60 : STinC++ - quick sort, part 2

STinC++ rewrites the programs in Software Tools in Pascal using C++

When I headed off to bed last night, our quick sort implementation looked pretty good.

template<class Iterator>
void quick_sort(Iterator begin, Iterator end) {
  if(std::distance(begin, end) <= 1)
    return;

  auto pivot = *begin;
  auto left = std::next(begin);
  auto right = std::prev(end);

  while (left != right) {
    while ((left != right) && (*left < pivot))
      left = std::next(left);

    while ((left != right) && (*right >= pivot))
      right = std::prev(right);

    if (left != right)
      std::iter_swap(left, right);
  }

  if (*left >= pivot)
    left = std::prev(left);
  std::iter_swap(left, begin);

  quick_sort(begin, left);
  quick_sort(right, end);
} // quick_sort

Today I want to fill that out so we can pass in a custom comparator, extending the interface to match std::sort [1].

template<class RandomIt, class Compare>
void sort(RandomIt first, RandomIt last, Compare comp);

I’m still on my laptop, but at the dining table this morning. I have a cup of coffee on my right, and a teenager who might ask me to help with his A-level physics on my left. I’m all set.

Let’s start by getting the default comparator, std::less, into place. If the tests all still run - and, you know, they should - that’s a happy start.

Add it to the function definition first.

template<class Iterator, class Comparator = std::less<>>
void quick_sort(Iterator begin, Iterator end, Comparator comparator = Comparator()) {
  ...

And then hook it up in the body of the function, where

  while ((left != right) && (*left < pivot))
    left = std::next(left);

  while ((left != right) && (*right >= pivot))
    right = std::prev(right);

becomes

  while ((left != right) && comparator(*left, pivot))
    left = std::next(left);

  while ((left != right) && !comparator(*right, pivot))
    right = std::prev(right);

Tests pass. No A-level questions. Looking good so far.

The tests I dropped in yesterday included a set with a custom comparator. I’ll enable those and see what happens.

Sorts up, but not down

Sigh

Oh wait. I’m a twit.

  if (*left >= pivot)
    left = std::prev(left);
  std::iter_swap(left, begin);

The pivot point shuffling is no longer correct now we’ve got a comparator.

  if (!comparator(*left, pivot))
    left = std::prev(left);
  std::iter_swap(left, begin);

That helps a bit, but it wasn’t a miracle cure.

Sorts a bit more down. 10 of 17 tests now passing."
Look, Then Look Again

When I added the comparator parameter, I defaulted the value. I wanted to make the smallest possible change I could, because if I’ve learned anything over the past thirty years it’s the less I type the fewer bugs I put in the code.

Sometimes though, not typing enough will also put bugs in. Here’s one of the test failures.

---------------------------------------------------------------------------
Chapter 4 - quick sort
  reverse sort { 3, 10, 9, 12, 8, 4, 1, 6, 5, 13, 2, 7, 11 }
---------------------------------------------------------------------------
/home/jez/work/jezuk/stiX/c++/chapter_4/0_sorts/quick_sort.cpp:81
...........................................................................

/home/jez/work/jezuk/stiX/c++/chapter_4/0_sorts/quick_sort.cpp:88: FAILED:
  REQUIRE( under_test == expected )
with expansion:
  { 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 3, 1, 2 }
  ==
  { 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 }

If we momentarily set aside the last three values, this sequence is beautifully sorted. In ascending order. Where’s that less-than comparator coming from? It’s coming from exactly where I asked the compiler to add it for me.

  quick_sort(begin, left);
  quick_sort(right, end);

Quick sort is a recursive algorithm. If there’s a custom comparator passed in at the top, I need to pass that on down as we recurse.

  quick_sort(begin, left, comparator);
  quick_sort(right, end, comparator);
/home/jez/work/jezuk/stiX/c++/cmake-build-debug/0_quick_sort
===========================================================================
All tests passed (28 assertions in 1 test case)

Process finished with exit code 0

Hurrah. Tests - they save you from yourself.

Quick Sort

template<class Iterator, class Comparator = std::less<>>
void quick_sort(Iterator begin, Iterator end, Comparator comparator = Comparator()) {
  if(std::distance(begin, end) <= 1)
    return;

  auto pivot = *begin;
  auto left = std::next(begin);
  auto right = std::prev(end);

  while (left != right) {
    while ((left != right) && comparator(*left, pivot))
      left = std::next(left);

    while ((left != right) && !comparator(*right, pivot))
      right = std::prev(right);

    if (left != right)
      std::iter_swap(left, right);
  }

  if (!comparator(*left, pivot))
    left = std::prev(left);
  std::iter_swap(left, begin);

  quick_sort(begin, left, comparator);
  quick_sort(right, end, comparator);
} // quick_sort

This is, as far as my tests are telling me, a complete quick sort implementation. There’s more that could be done, especially around selecting a good pivot value. Just grabbing the first value works, but for sequences which are already sorted it’s the worst case, and it’s not great for partially sorted sequences. Robert Sedgewick, who wrote his PhD on quick sort and whose Algorithms is pretty much the standard text, proposes the median of the first, middle, and last element.

I am leaving all that good stuff to the library implementers. This is an exercise, and I am exercised. See you in the New Year, friends.

Kernighan and Plauger’s code

{ quick -- quicksort for lines }
procedure quick (var linepos : posbuf; nlines : pos; var linebuf : charbuf);
{ rquick -- recursive quicksort }
procedure rquick (lo, hi: integer);
var
  i, j : integer;
  pivline : charpos;
begin
  if (lo < hi) then begin
    i := lo;
    j := hi;
    pivline := linepos[j];  { pivot line }
    repeat
      while (i < j)
        and (cmp(linepos[i],pivline,linebuf) <= 0) do
          i := i + 1;
      while (j > i)
        and (cmp(linepos[j],pivline,linebuf) >= 0) do
          j := j - 1
      if (i < j) then     { out of order pair }
        exchange(linepos[i], linepos[j])
    until (i >= j)
    exchange(linepos[i], linepos[hi]); { move pivot to i }
    if (i - lo < hi -i) then begin
      rquick(lo, i-1);
      rquick(i+1, hi)
    end
    else begin
      rquick(i+1, hi);
      rquick(lo, i-1)
    end
  end
end;
begin
  rquick(1, nlines)
end;

Now I’ve written my own implementation, this is reassuring familiar.

I think the reason Karnighan and Plauger can swap the pivot into place without adjustment is because they use the rightmost value as the pivot [2].

They don’t explain the little if at the bottom selecting which order to recurse into the two subsequences, but I believe it’s to minimise the call stack. By recursing into the smaller subsequence first, the depth of the call tree is kept to its minimum. This isn’t something we would think about now, except in exceptionally constrained environments. However, as I was reminded by John Cowan about this time last year, by modern standards the machines Kernighan and Plauger used were constrained environments. There wasn’t a lot of memory to chuck around, so code size was important, stack space was limited. Working within those boundaries was a natural and normal thing to do.

Source Code

Source code for this little routine, indeed the whole project, is available in the stiX GitHub repository.

Endnotes

This whole endeavour relies on Software Tools in Pascal and Software Tools, both by Brian W Kernighan and PJ Plauger. I love these books and commend them to you. They are both still in print, but new copies are, frankly, just ridiculously expensive. Happily, there are plenty of second-hand copies floating round, or you can borrow them from The Internet Archive using the links above.

For this project I’ve been using JetBrain’s CLion, which I liked enough to buy and then renew a license for.

The test harness I’m using is Catch. I’ve been aware of Catch pretty much since it was first released, but this project is the first time really getting in and using it. I like it, and it’s what I use by default now.


2. Worked a couple of examples by hand, and this is the reason. Index i ends up pointing to a value equal to or greater than the pivot. The pivot in on the right, so the value being swapped ends up in the proper subsequence

Tagged code, and software-tools-in-c++

Wednesday 30 December 2020 The Forest Road Reader, No 2.59 : STinC++ - quick sort

STinC++ rewrites the programs in Software Tools in Pascal using C++

We’re near the midpoint of Chapter 4 now. Having started by looking at bubble sort and Shell sort before addressing the practicalities of a text sort program, we step firmly back into the realms of Computer Science to look at quick sort.

To be honest, up until I prepped for this article, I didn’t understand how quick sort worked. I knew it had been invented by Tony Hoare, one of the great computer scientists. I knew too it was the algorithm you would expect to find at the heart any language’s library sort function [1]. In terms of how it worked, I knew quick sort involved chosing a pivot around which the sequence was divided into ever smaller subsequences, at the end of which it was somehow sorted, but I couldn’t have told you how it actually worked.

Kernighan and Plauger’s description is correct, but it didn’t really get me closer to be able to implement the algorithm.

Quick sort is best described as a recursive procedure. The essential idea is to partition the original set to be sorted by rearranging it into two groups - all those elements less than some arbitrary value chosen from the set, and all those greater than or equal to the value. Then the same partitioning process is applied to the two subsets in turn until each subset contains only one element. When all subsets have been partitioned, the original set is sorted.

They may, on this occasion, have fallen into the trap of being so familiar with the subject that their explanation is clear to them but not to this novice. I might just not be getting it. This Computerphile video got me over the hump.

Quick Sort

After watching it I did, at last, understand the procedure, although I wouldn’t implement a sort in the way described. I have some meta-knowledge here too, in that I’m aware you can implement quick sort in place, without additional memory allocation. Working in C++ means you can never pretend you haven’t got performance considerations lurking somewhere in the back of your mind 😃.

Fuelled up and ready to go

I’m on the sofa with my Dell laptop on one knee, a bottle of Bishops Finger on the coffee table next to me, and a half formed idea in my head of how to quick sort a sequence. Let’s see what goes wrong.

The best code is no code

I’m going to start with the simplest possible thing I can - a quick sort implementation that sorts an empty sequence.

template<class Iterator>
void quick_sort(Iterator& begin, Iterator& end) {
} // quick_sort

Obviously this is a bit silly, but you’ve got to start somewhere. There is something useful we can do next though. We know, from what we’ve been told, that quick sort is a recursive procedure, working on progressively smaller subsequences. Once a subsequence contains one or zero items, we’re at the limit and should stop the recursion.

template<class Iterator>
void quick_sort(Iterator begin, Iterator end) {
  if(std::distance(begin, end) <= 1)
    return;

  throw std::runtime_error("Oops");
} // quick_sort

Perfect.

Engage!

This is the place where I need to start thinking, and make my first moves to partition the sequence.

First, I need to choose what to use as the pivot value that will be used to partition the sequence into two. The obvious choices are the first value, pointed to by begin, or the last value, which is one before end. Of these, the first value is the easiest, so I’m choosing that.

  auto pivot = *begin;
  auto from = std::next(begin);

Let’s think, if we have a sequence

4 9 1 8 2 7

then the pivot value will be 4, and after partitioning we should end up with something like

1 2 4 9 8 7

If we have an iterator pointing to the left hand end of the sequence (from above, which I think I might rename) working up, and another pointing to the right hand end of the sequence working down then eventually they’ll meet in the middle.

4 9 1 8 2 7
  ^
          ^

While the left iterator points to a value lower than the pivot, it can advance. Here, it will stay put. While the right iterator points to a value higher than the pivot, it can move down. It is able to so let’s do it.

4 9 1 8 2 7
  ^
        ^

Now, neither can move, but they haven’t yet met. If we swap the two values they point to, they’ll both be freed up to move again.

4 2 1 8 9 7
  ^
        ^

Left can now advance to 1 and then to 8.

4 2 1 8 9 7
      ^
        ^

Right can move down to 8, where it meets left.

4 2 1 8 9 7
      ^
      ^

Blimey, this is working, more or less.

Steady on there, big man

Maybe left shouldn’t have gone all the way to 8. I’ll rewind slightly.

4 2 1 8 9 7
    ^
      ^

Perhaps this is the boundary condition, when the two iterators adjacent to each other. The left iterator points to where pivot fits in the sequence, so we can swap it into place.

1 2 4 8 9 7
    ^
      ^

Let me cast that into code.

<a few minute later>

Ok, that’s super fiddly. It’s much easier to let the two iterators meet each other, then wind left back a step afterwards.

  auto pivot = *begin;
  auto left = std::next(begin);
  auto right = std::prev(end);

  while (left != right) {
    while ((left != right) && (*left < pivot))
      left = std::next(left);

    while ((left != right) && (*right >= pivot))
      right = std::prev(right);

    if (left != right)
      std::iter_swap(left, right);
  }

  left = std::prev(left);
  std::iter_swap(left, begin);
Driven by test
  auto sample = std::vector { 4, 9, 1, 8, 2, 7 };
  stiX::quick_sort(sample);

  REQUIRE(sample == std::vector { 1, 2, 4, 8, 9, 7 });

Hmm. I do not love this code, especially the repeated left != right, but it works so far. I’m going to plug in the recursion and see what happens.

Fingers crossed
  quick_sort(begin, left);
  quick_sort(right, end);

Reader, I confess I let out a small exclamation of surprise. The sequence { 4, 9, 1, 8, 2, 7 } popped out all nicely sorted.

Moar tests

One swallow does not a summer make, nor does one test properly exercise a sorting algorithm. I have a pile of tests already hanging around, so I’m going to whammo those in and see what happens.

i should have called it quick not sort
Figure 1. 9 out of 13 tests failing. I should have called it quick-not-sort

Looks like I just got lucky with that first try.

Some of them are almost there,

---------------------------------------------------------------------------
Chapter 4 - quick sort
  sort { 3, 10, 9, 12, 8, 4, 1, 6, 5, 13, 2, 7, 11 }
---------------------------------------------------------------------------
/home/jez/work/jezuk/stiX/c++/chapter_4/0_sorts/quick_sort.cpp:70
...........................................................................

/home/jez/work/jezuk/stiX/c++/chapter_4/0_sorts/quick_sort.cpp:77: FAILED:
  REQUIRE( under_test == expected )
with expansion:
  { 1, 2, 3, 5, 6, 4, 7, 8, 9, 11, 10, 12, 13 }
  ==
  { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 }

while others aren’t even close.

---------------------------------------------------------------------------
Chapter 4 - quick sort
  sort { 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 }
---------------------------------------------------------------------------
/home/jez/work/jezuk/stiX/c++/chapter_4/0_sorts/quick_sort.cpp:70
...........................................................................

/home/jez/work/jezuk/stiX/c++/chapter_4/0_sorts/quick_sort.cpp:77: FAILED:
  REQUIRE( under_test == expected )
with expansion:
  { 2, 4, 6, 8, 9, 10, 7, 11, 5, 12, 3, 13, 1 }
  ==
  { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 }
Where did it all go wrong?

We’re a long way down the page. Here’s the current state of the code.

  template<class Iterator>
  void quick_sort(Iterator begin, Iterator end) {
    if(std::distance(begin, end) <= 1)
      return;

    auto pivot = *begin;
    auto left = std::next(begin);
    auto right = std::prev(end);

    while (left != right) {
      while ((left != right) && (*left < pivot))
        left = std::next(left);

      while ((left != right) && (*right >= pivot)
        right = std::prev(right);

      if (left != right)
        std::iter_swap(left, right);
    }

    left = std::prev(left);
    std::iter_swap(left, begin);

    quick_sort(begin, left);
    quick_sort(right, end);
  } // quick_sort

There’s no obvious error in the code, at least to me. It’s not crashing, for example, nor are the results entirely wacky. There’s a certain regularity to the errors. Many have groups of three elements almost but not quite there, as in the first example above. I’m especially intrigued by the second example though. What is going on there?

We’re starting with a sequence in descending order.

13 12 11 10 9 8 7 6 5 4 3 2 1
    ^
                            ^

Since all the values are lower than the pivot value of 13, the left hand iterator is going to advance all the way to 1, while the right hand iterator isn’t going to move at all.

13 12 11 10 9 8 7 6 5 4 3 2 1
                            ^
                            ^

So far so good - the while loop is ok - so next, I swap the pivot into place [2].

  left = std::prev(left);
13 12 11 10 9 8 7 6 5 4 3 2 1
                          ^
                            ^

You can see where this is going …​

  std::iter_swap(begin, left);
2 12 11 10 9 8 7 6 5 4 3 13 1
                          ^
                            ^

Oh dear. It’s going to be all downhill from here. The two subsequences are { 2, 12, 11 …​ 5, 4, 3 } and { 1 }. The second, shorter, sequence is already 'sorted', but I guess we’re now going to see a similar effect when sorting the first sequence.

2 12 11 10 9 8 7 6 5 4 3
   ^
                       ^

Since all the values are greater than the pivot value of 2, the left iterator won’t move, while the right will whizz down to meet it.

2 12 11 10 9 8 7 6 5 4 3
   ^
   ^

Decrementing left leaves it pointing at the pivot.

2 12 11 10 9 8 7 6 5 4 3
^
   ^

so the subsequent swap is a no-op. The two subsequences are the no-op { 2 } and { 12, 11, 10, 9, 8, 7, 6, 5, 4, 3 }, which is essentially identical to where we started. And so on and so on and so on.

What a balls up.

What we do, we update our formulas

In this case, left has advanced all the way to the right hand end of the sequence but is still pointing at a value which should be in the left hand partition. Rather than pointing one beyond where the pivot should be placed, it’s pointing to exactly where it should be placed. This presumably, is also happening in other sequences, perhaps further down in the recursion. The effect is particularly pronounced here because the elements are in reverse sort order.

Rather than always decrementing left before placing the pivot, only decrement if left points to a 'right-hand' value. That should fix it.

  if (*left >= pivot)
    left = std::prev(left);
  std::iter_swap(left, begin);

<deep breath> And …​

All tests pass. Now that’s a quick sort.

That’ll do for now

There’s a little more to do extending the interface to match std::sort but between this, loading the dishwasher, putting the dogs to bed, and all that other real-life malarkey, it’s rather later now than I had anticipated it being when I set off. That said, I’m pretty pleased with how it’s gone. I thought my way into the code pretty well, and then when it wasn’t right I was able to think my way back out again. I confess I don’t often do quite so much thinking, or at least not so much thinking before taking to code. Sorting algorithms aren’t really the kind of work that lends themselves to iterative development. You have to do a bit of upfront design work.

Source Code

Source code for this little routine, indeed the whole project, is available in the stiX GitHub repository.

Endnotes

This whole endeavour relies on Software Tools in Pascal and Software Tools, both by Brian W Kernighan and PJ Plauger. I love these books and commend them to you. They are both still in print, but new copies are, frankly, just ridiculously expensive. Happily, there are plenty of second-hand copies floating round, or you can borrow them from The Internet Archive using the links above.

For this project I’ve been using JetBrain’s CLion, which I liked enough to buy and then renew a license for.

The test harness I’m using is Catch. I’ve been aware of Catch pretty much since it was first released, but this project is the first time really getting in and using it. I like it, and it’s what I use by default now.


1. This isn’t quite true. C++ standard libraries typically use Introsort, a hybrid of quick sort, heapsort, and insertion sort
2. I’ve fallen into Ron Jeffries habit of using 'we' for the good things and 'I' when it goes wrong. As he says, who else is there to blame?

Tagged code, and software-tools-in-c++

Monday 28 December 2020 The Forest Road Reader, No 2.58 : STinC++ - in-memory text sort

STinC++ rewrites the programs in Software Tools in Pascal using C++

After a couple of warm-ups (from which, if we’re honest, I’ve probably cooled back down from), the first proper program in Chapter 4 of Software Tools In Pascal is sort, a program that sorts a text file line by line into increasing lexicographic order. Kernighan and Plauger present this as a filter even though, they point out, it isn’t really a filter because it can’t output anything until its input is complete. (That little note, along with phrases like increasing lexicographic order, is the kind of precise language that just fills me with reading pleasure.)

The first iteration of sort assumes the data fits in memory. We’ll read the whole of the input in one pass, sort it, then spit it out again.

Data Representation

Pascal’s lack of a proper string type has hampered Kernighan and Plauger throughout, and here it presents a real issue. Using fixed sized character arrays as a substitute string type just isn’t going to cut it. For short strings they’ll waste memory, which is a scarce resource, and for longer strings present a further problem they would need to resolve somehow.

For them, then, their first task is come up with a suitable data representation. They choose to read the input into a single large array, then keep a second array of indexes pointing the start of each line.

Here in the future, I can just write:

auto lines = std::vector<std::string> { };
Sort implementation

Having already written a Shell sort earlier in the chapter, Kernighan and Plauger now have to specialise it for their new data structure. It needs to be extended to extract each line under consideration, then further to compare the lines, and finally how to swap lines. This touches almost every part of the sort routine. They do have one small win - swapping two lines only involves switching the indexes rather than fiddling inside the large array.

I need to specialise my sort too …​

std::sort(lines.begin(), lines.end());
Input and Output

Having considered their data structure and sort, Kernighan and Plauger need to write procedures to read standard in and populate their data structures, and also to print it back out again.

Let’s have a go at that too…​

// read
while (std::cin)
  lines.emplace_back(stiX::getline(std::cin));

// write
std::copy(
  lines.begin(),
  lines.end(),
  std::ostream_iterator(std::cout, "\n");
);
  • stiX::getline is a tiny convenience wrapper around std::getline

  • I’ve went back and forth on emplace_back and push_back here. I went with emplace_back seems to better express intention, but I’d be hard pressed to properly explain why.

In memory sort filter

Arranging the snippets above in the proper order gives

std::vector<std::string> read_lines();
void write_lines(std::vector<std::string> const &lines);

int main() {
  auto lines = read_lines();

  std::sort(lines.begin(), lines.end());

  write_lines(lines);
} // main

std::vector<std::string> read_lines() {
  auto lines = std::vector<std::string> { };

  while (std::cin)
    lines.emplace_back(stiX::getline(std::cin));

  return lines;
} // read_lines

void write_lines(std::vector<std::string> const &lines) {
  std::copy(
    lines.begin(),
    lines.end(),
    std::ostream_iterator<std::string>(std::cout, "\n")
  );
} // write_lines

If we ignore function declarations and pretty printing, this entire program is five lines of code.

The main body of Kernighan and Plauger’s code is equally concise

begin
    if (gtext(linpos, nlines, linbuf, STDIN) = ENDFILE) then begin
        shell(linpos, nlines, linbuf);
        ptext(linpos, nlines, linbuf, STDOUT)
    end
    else
        error('sort: input too big to sort')
end;

Unfortunately, they had to supply all the code behind the gtext, quick, and ptext calls too, while I leaned on standard library components. Expanded out, their full program is somewhat heftier.

{ Copyright (C) 1981 by Bell Laboratories, Inc., and Whitesmiths Ltd. }
{ sort -- sort text lines in memory }
procedure inmemsort;
const
    MAXCHARS = 10000;    { maximum # of text characters }
    MAXLINES = 300;        { maximum # of lines }
type
    charbuf = array [1..MAXCHARS] of character;
    charpos = 1..MAXCHARS;
    posbuf = array [1..MAXLINES] of charpos;
    pos = 0..MAXLINES;
var
    linebuf : charbuf;
    linepos : posbuf;
    nlines : pos;

{ gtext -- get text lines into linebuf }
function gtext (var linepos : posbuf; var nlines : pos;
        var linebuf : charbuf; infile : filedesc) : boolean;
var
    i, len, nextpos : integer;
    temp : string;
    done : boolean;
begin
    nlines := 0;
    nextpos := 1;
    repeat
        done := (getline(temp, infile, MAXSTR) = false);
        if (not done) then begin
            nlines := nlines + 1;
            linepos[nlines] := nextpos;
            len := length(temp);
            for i := 1 to len do
                linebuf[nextpos+i-1] := temp[i];
            linebuf[nextpos+len] := ENDSTR;
            nextpos := nextpos + len + 1  { 1 for ENDSTR }
        end
    until (done) or (nextpos >= MAXCHARS-MAXSTR)
            or (nlines >= MAXLINES);
    gtext := done
end;

{ shell -- ascending Shell sort for lines }
procedure shell (var linepos : posbuf; nlines : integer;
        var linebuf : charbuf);
var
    gap, i, j, jg : integer;
{ cmp -- compare linebuf[i] with linebuf[j] }
function cmp (i, j : charpos; var linebuf : charbuf)
        : integer;
begin
    while (linebuf[i] = linebuf[j])
      and (linebuf[i] <> ENDSTR) do begin
        i := i + 1;
        j := j + 1
    end;
    if (linebuf[i] = linebuf[j]) then
        cmp := 0
    else if (linebuf[i] = ENDSTR) then    { 1st is shorter }
        cmp := -1
    else if (linebuf[j] = ENDSTR) then    { 2nd is shorter }
        cmp := +1
    else if (linebuf[i] < linebuf[j]) then
        cmp := -1
    else
        cmp := +1
end;

{ exchange -- exchange linebuf[lp1] with linebuf[lp2] }
procedure exchange (var lp1, lp2 : charpos);
var
    temp : charpos;
begin
    temp := lp1;
    lp1 := lp2;
    lp2 := temp
end;
begin
    gap := nlines div 2;
    while (gap > 0) do begin
        for i := gap+1 to nlines do begin
            j := i - gap;
            while (j > 0) do begin
                jg := j + gap;
                if (cmp(linepos[j],linepos[jg],linebuf)<=0) then
                    j := 0    { force loop termination }
                else
                    exchange(linepos[j], linepos[jg]);
                j := j - gap
            end
        end;
        gap := gap div 2
    end
end;

{ ptext -- output text lines from linebuf }
procedure ptext (var linepos : posbuf; nlines : integer;
        var linebuf : charbuf; outfile : filedesc);
var
    i, j : integer;
begin
    for i := 1 to nlines do begin
        j := linepos[i];
        while (linebuf[j] <> ENDSTR) do begin
            putcf(linebuf[j], outfile);
            j := j + 1
        end
    end
end;

begin
    if (gtext(linepos, nlines, linebuf, STDIN)) then begin
        shell(linepos, nlines, linebuf);
        ptext(linepos, nlines, linebuf, STDOUT)
    end
    else
        error('sort: input too big to sort')
end;

At around 100 lines it’s not a massive program, but it’s still an undertaking. I’d be feeling pretty pleased with myself to have written something of this size and be confident in it inside an intense day.

Joy To The World

What I actually did was write my program directly into the text of this article. I then cut and pasted it across to my IDE, and it turned out I was correct pretty much first go. The only change I needed to make was to add the newline separator into the ostream_iterator.

Taking pieces you have, which you know to be correct and efficient, and plugging them together is almost the platonic ideal of programming. Join me now in sending good thoughts to the language designers, compiler writers, library implementors, and all those involved in providing the fantastic programming tools we have at our fingertips. Thanks friends!

Endnotes

This whole endeavour relies on Software Tools in Pascal and Software Tools, both by Brian W Kernighan and PJ Plauger. I love these books and commend them to you. They are both still in print, but new copies are, frankly, just ridiculously expensive. Happily, there are plenty of second-hand copies floating round, or you can borrow them from The Internet Archive using the links above.

For this project I’ve been using JetBrain’s CLion, which I liked enough to buy and renew a license for, because it’s gone a year already.

The test harness I’m using is Catch. I’ve been aware of Catch pretty much since it was first released, but this project is the first time really getting in and using it. I like it, and it’ll be my first choice in the future.


Tagged code, and software-tools-in-c++
Older posts are available in the archive or through tags.


Jez Higgins

Freelance software grandad
software created
extended or repaired

Follow me on Twitter
Applications, Libraries, Code
Talks & Presentations

Hire me
Contact

Older posts are available in the archive or through tags.

Feed