Freelance software grandad
software created
extended or repaired
Follow me on Mastodon
Applications, Libraries, Code
Talks & Presentations
STinC++ rewrites the programs in Software Tools in Pascal using C++
Software Tools in Pascal starts off pretty gently. The first program Kernighan and Plauger present is copy, which copies its input to its output. The seemingly trivial nature of this task allows them to introduce how they write and present their code - without getting into detail they mention how they cope with different Pascal implementations on different platforms, the general structure they adopt, and so on. In particular they introduce primitives - functions that interface to the outside world. The two primitives they introduce first are getc and putc which, respectively, read a character from and write a character to somewhere, such as an interactive terminal or some secondary storage device like a disk.
These primitives, of which they build up a fair library over the course of the book, are how they make their Pascal portable across platforms. These days, we take that kind of thing for granted as part of our language's standard library.
After this preamble they present their program, or at least part of it:
procedure copy;
var
  c : character;
begin
  while (getc(c) <> ENDFILE) do
    putc(c)
end;
They explain:
First, and most obvious to people who have used Pascal before, is that this is not a complete program - it is just a procedure. So it needs some surrounding context before it can actually do anything for us. We intend to present all of our programs this way … so we can better focus on the essential ideas.
They do then present the whole program with the procedure in context, and talk it through. They conclude this initial section with a reason why this isn’t as trivial a program as it might appear.
When you encounter a new language, a new operating environment, or just a new way of doing business on a computer, the first hurdle to clear is learning how to run a program. You must master, perhaps: logging on to the computer, creating files with the editor, running the compiler and/or linker, modifying files with the editor, and invoking the program you’ve finally built! With all these potential problem areas, the last thing you need is a complex program to contribute troubles of its own.
The primitives getc and putc come, of course, more or less directly from C’s getc/getchar and putc/putchar. It would be tempting to cast this program directly into C++ by swapping begin and end for a pair of curly brackets and having done with it.
void copy() {
  int c;
  while ((c = getchar()) != EOF)
    putchar(c);
}
Tempting, but lazy.
Does this actually work? Well, by inspection it looks like it should, but because getchar and putchar are tied to our standard input and output (as we call our interactive terminal or some secondary storage device these days) we’d have to run the program to actually test it. Running a whole program to test it is tedious at best, even for a tiny program like this. Perhaps I can raise the level of abstraction a bit, and make it a little easier to test.
void copy(std::istream& in, std::ostream& out) {
  int c;
  while ((c = in.get()) != EOF)
    out.put(c);
}
Now I can copy from any istream (which could be standard input, a file, something in memory, almost anything) to any ostream (which could be the console, another file, somewhere in memory, almost anything). Nice. We can throw a little test wrapper round that, poke some known inputs through it, and check the right thing comes out.
This is where I actually started. I wrote the test first, using the splendid Catch test framework.
void verifyCopyString(std::string input) {
  std::istringstream is(input);
  std::ostringstream os;
  stiX::copy(is, os);
  REQUIRE( os.str() == input );
}
/* ...
some strings
... */
TEST_CASE( "Chapter 1 - copy" ) {
  verifyCopyString(empty);
  verifyCopyString(zero_length);
  verifyCopyString(very_short);
  verifyCopyString(longer);
  verifyCopyString(longer_with_line_breaks);
}
I wrote the least amount of code I could to get this to compile.
namespace stiX {
  void copy(std::istream& in, std::ostream& out) {
  }
}
I then ran the tests, which naturally nearly all failed, but now I had something to work with.
The function signature - void copy(std::istream& in, std::ostream& out) - is pretty perfect, but what to fill it with? Is a while loop really still the state of the art here?
It is not.
The C++ Standard Library provides a function copy(InputIt first, InputIt last, OutputIt d_first) which copies the elements in the range, defined by [first, last), to another range beginning at d_first.
copy is a generic function. It takes a pair of iterators delimiting some input range and copies what it finds to an output iterator. Classically, we always used to describe iterators as pointer-like, i.e. they pointed to something, we could advance them, and we could compare them. Equally classically, we generally thought about iterators as operating over some known and bounded region - a block of memory, or a container of some sort like a vector or a list. We don’t usually think of IO in these terms - a file is a file, writing to the console sends things to the screen, that kind of thing. However, if we start to think more broadly about iterators as moving over a sequence, and consider our input as a source of characters and our output as somewhere to put a sequence of characters, it becomes quite natural to want to iterate over console input and output.
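To make the classical, bounded view concrete, here is a minimal sketch of std::copy working over a block of memory and over a container. The function names copy_from_array and copy_from_list are hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Pointer-like iterators: plain pointers delimit a block of memory.
std::string copy_from_array() {
    const char source[] = {'a', 'b', 'c'};
    std::string destination;
    // [source, source + 3) is the input range; back_inserter appends each element.
    std::copy(source, source + 3, std::back_inserter(destination));
    return destination;
}

// Container iterators: the same algorithm, unchanged, over a list.
std::vector<int> copy_from_list(const std::list<int>& source) {
    std::vector<int> destination;
    std::copy(source.begin(), source.end(), std::back_inserter(destination));
    return destination;
}
```

In both cases std::copy knows nothing about where the elements live; it only advances, dereferences, and compares iterators, which is what makes it plausible to point it at a stream instead.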
So, if I can connect my istream and ostream up to copy, it’ll do the work for me. Perfect!
Happily, C++'s istream and ostream can provide the iterators I’m after. In fact, there are quite a number to choose from. In this case, I want to pull raw characters from the input and poke them straight down the output, so I want istreambuf_iterator<char> and ostreambuf_iterator<char>.
#include "copy.h"
#include <algorithm>
#include <iostream>
#include <iterator>

namespace stiX {
  void copy(std::istream &in, std::ostream &out) {
    std::copy(
      std::istreambuf_iterator<char>(in),   // start of input
      std::istreambuf_iterator<char>(),     // default-constructed: end of input
      std::ostreambuf_iterator<char>(out)
    );
  }
}
I like this a lot. There’s no loop, no comparison, no worrying about special end-of-input sentinel values. When you read it there’s not even the slightest mental gymnastics involved in understanding it, because there’s nothing to comprehend. It does exactly what it says - copy this here to that there.
Underneath each C++ istream or ostream is a streambuf as its source of input or output target. The streambuf does all the work regarding the actual I/O, and the stream is only concerned with formatting and transformation or conversion from characters to other types such as strings.
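As a rough illustration of that division of labour, the sketch below bypasses the stream's formatting layer and pulls characters straight from the underlying streambuf. The function name drain_via_streambuf is hypothetical, and the loop only approximates the kind of work a streambuf-level iterator does.

```cpp
#include <sstream>
#include <streambuf>
#include <string>

// Read every character directly from the stream's buffer, skipping the
// formatting layer entirely, so whitespace and newlines are preserved.
std::string drain_via_streambuf(const std::string& text) {
    std::istringstream in(text);
    std::streambuf* buffer = in.rdbuf();  // the streambuf underneath the istream

    std::string result;
    using traits = std::char_traits<char>;
    // sbumpc() returns the next character and advances, or eof() when exhausted.
    for (auto c = buffer->sbumpc(); c != traits::eof(); c = buffer->sbumpc())
        result.push_back(traits::to_char_type(c));
    return result;
}
```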
An istream_iterator takes a template argument that says what the unadorned character sequence from the streambuf should be formatted as. An istream_iterator<int>, for instance, will interpret the whitespace-delimited incoming text as a series of int values.
An istreambuf_iterator, however, is only concerned with raw characters and reads directly from the associated streambuf of the istream that it gets passed.
Generally, if we’re interested in the raw characters we want an istreambuf_iterator. If we’re after formatted input of any kind, we need an istream_iterator.
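The difference is easiest to see side by side. The following is a small sketch, with hypothetical helper names, that reads the same kind of text through each iterator type. Note in particular that istream_iterator<char> still counts as formatted input, so it skips whitespace by default.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Formatted input: whitespace-delimited text is parsed into int values.
std::vector<int> read_ints(const std::string& text) {
    std::istringstream in(text);
    std::istream_iterator<int> first(in), last;
    return std::vector<int>(first, last);
}

// Raw input: every character arrives untouched, whitespace included.
std::string read_raw(const std::string& text) {
    std::istringstream in(text);
    std::istreambuf_iterator<char> first(in), last;
    return std::string(first, last);
}

// Formatted input of char: goes through operator>>, so whitespace is skipped.
std::string read_formatted_chars(const std::string& text) {
    std::istringstream in(text);
    std::istream_iterator<char> first(in), last;
    return std::string(first, last);
}
```

Given "a b\n", read_raw hands back all four characters while read_formatted_chars returns just "ab", which is why a faithful copy needs the streambuf flavour.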
As for istream, so for ostream. For unformatted output we want an ostreambuf_iterator, while ostream_iterator provides formatted output.
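The output side can be sketched the same way. The helper names below are hypothetical; the point is only that ostream_iterator formats each value through operator<< (optionally followed by a delimiter), while ostreambuf_iterator writes characters as they are.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Formatted output: each int is converted to text and followed by the delimiter.
std::string write_formatted(const std::vector<int>& values) {
    std::ostringstream out;
    std::copy(values.begin(), values.end(), std::ostream_iterator<int>(out, ","));
    return out.str();
}

// Unformatted output: characters go straight to the underlying streambuf.
std::string write_raw(const std::string& text) {
    std::ostringstream out;
    std::copy(text.begin(), text.end(), std::ostreambuf_iterator<char>(out));
    return out.str();
}
```

Note that ostream_iterator writes its delimiter after every element, including the last, so write_formatted({1, 2, 3}) yields "1,2,3," with a trailing comma.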
Source code for this program is on GitHub, with the test harness and build files elsewhere in the same repository.
Obviously, I’m leaning hugely on Software Tools in Pascal and Software Tools, both by Brian W Kernighan and PJ Plauger. I love these books, and commend them to you. They’re both still in print, although pretty pricy new. Second hand will serve you just as well - the words are still the same.
For this project I’ve been trying out JetBrains’ CLion, which so far has been pretty great. CLion uses CMake to build projects. My previous flirtations with CMake, admittedly many years ago, weren’t a huge success. Not so this time - it’s easy to use and works a treat.
The test harness I’m using is Catch. I’ve been aware of Catch pretty much since it was first released, but this is my first time really using it. I like it and will use it again.