STinC++ rewrites the programs in Software Tools in Pascal using C++
After counting characters and then lines, Kernighan and Plauger take us next to counting words.
wordcount.pas
procedure wordcount;
var
    nw : integer;
    c : character;
    inword : boolean;
begin
    nw := 0;
    inword := false;
    while (getc(c) <> ENDFILE) do
        if (c = BLANK) or (c = NEWLINE) or (c = TAB) then
            inword := false
        else if (not inword) then begin
            inword := true;
            nw := nw + 1
        end;
    putdec(nw, 1);
    putc(NEWLINE);
end;
You can see the little path Kernighan and Plauger are on quite clearly now. Start with a simple loop, then extend that loop with a simple counter. Next add a simple conditional, and now extend that conditional into a little state machine. Each little step has added something extra, changing the functionality of each program to provide a new and useful result.
My path has been to look through the list of functions provided in the algorithm header and pick the one that does what’s needed for this task.
Kernighan and Plauger have a straightforward, and entirely reasonable, definition of a word: a maximal sequence of characters not containing a blank, a tab, or a newline.
The additional complexity of Kernighan and Plauger’s wordcount over linecount is to keep track of whitespace delimiters between the words. They are, in effect, splitting up the sequence of characters into a sequence of words.
Previously, I blithely said that C++'s istream and ostream provide a number of different iterators. To get down and deal with the raw character stream we use istreambuf_iterator<char>. For formatted input, that is anything that needs a bit of work to process those raw characters in some way, we want some sort of istream_iterator.
Gathering characters up into whitespace delimited words qualifies as a bit of processing work. It’s the kind of thing people do all the time, and consequently is precisely what the istream_iterator<std::string> provides.
I rather loosely described std::distance(InputIt first, InputIt last) as counting the hops between first and last. More formally, it returns the number of iterator increments needed to go from first to last. Incrementing an istream_iterator<std::string> reads the next word, so plugging that into std::distance counts the number of words in the input.
wordcount.cpp
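#include "wordcount.h"
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>

namespace stiX {
    // Each increment of the iterator reads one whitespace-delimited word,
    // so the distance to the end-of-stream iterator is the word count.
    size_t wordcount(std::istream& in) {
        return std::distance(
            std::istream_iterator<std::string>(in),
            std::istream_iterator<std::string>()
        );
    }
}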