Jez Higgins

Freelance software grandad
software created
extended or repaired


Follow me on Mastodon
Applications, Libraries, Code
Talks & Presentations

Hire me
Contact

Older posts are available in the archive or through tags.

Feed

Sunday 12 April 2020 The Forest Road Reader, No 2.44 : STinC++ - include

STinC++ rewrites the programs in Software Tools in Pascal using C++

After the "let’s just make sure we can read a file" introduction of compare, we get into some proper tool business with Chapter 3’s second program. This program, include, copies its input to its output, except that any line which begins #include "filename" is replaced by the entire contents of the file filename. This is, Kernighan and Plauger note, similar to a facility provided by the PL/I preprocessor. They decline to mention the C preprocessor, although whether for reasons of modesty or because C was too obscure at the time I cannot guess.

This is the point in the book where Kernighan and Plauger start to open the kimono a little. While they’ve talked about the various primitive functions they’ve had to provide, they’ve been rather coy about how those have been incorporated into the programs they present. The answer, they now reveal, is include.

We have used include to assemble most of the programs in this book. This enables us to keep the source files of the program separate for easy editing … yet collect them together for compilation … to compile a Pascal program we actually run include on a small control file that looks like this:

program outer (input, output):
    #include "global definitions"
    #include "io primitives"
    #include "utility functions"
begin
    #include "initialize the i/o system"
    #include "main program"
end.

The structure of include is as you’ve probably already imagined:

  while (getline(line, file))
      if (line starts with "#include")
          include new file
      else
          output line

Naturally, an included file can itself contain further include directives, so the whole thing is recursive. The original version of include presented in Software Tools had to use a stack to manage nested includes because Fortran procedures were not recursive. Happily Pascal procedures are, so the Software Tools in Pascal version is a little simpler than the Fortran version. The bulk of the program is taken up with continuing to work around Pascal’s lack of a string type. K&P are modelling strings using fixed length arrays. As a consequence, checking for the #include token and parsing out the filename are somewhat laborious.
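For contrast, here is a rough sketch of those two checks written against std::string. The helper names is_include_line and extract_filename are hypothetical, chosen for illustration rather than taken from the book or the stiX source, and the sketch assumes the directive starts in the first column of the line.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if the line begins with the #include token.
bool is_include_line(std::string const& line) {
    static std::string const token = "#include";
    return line.compare(0, token.size(), token) == 0;
}

// Hypothetical helper: the text between the first pair of double quotes,
// or an empty string if the line has no quoted filename.
std::string extract_filename(std::string const& line) {
    auto const open = line.find('"');
    if (open == std::string::npos)
        return std::string();
    auto const close = line.find('"', open + 1);
    if (close == std::string::npos)
        return std::string();
    return line.substr(open + 1, close - open - 1);
}

// Usage, written as assertions.
void string_helper_examples() {
    assert(is_include_line("#include \"io primitives\""));
    assert(!is_include_line("begin"));
    assert(extract_filename("#include \"io primitives\"") == "io primitives");
}
```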

In these modern times, files and strings present no such difficulties. I’ve flipped the sense of the test, but otherwise the core of my implementation is almost exactly as above.

include.hpp
template<typename IncludeOpener>
void expand_include(
    std::istream& input,
    std::ostream& output,
    IncludeOpener openFn
) {
    while(input) { // (1)
        auto line = getline(input);

        if (!is_include(line)) {
            output << line
                   << (!input.eof() ? "\n" : ""); // (2)
        } else {
            auto included = open_include(
                line,
                output,
                openFn
            );
            expand_include(
                included,
                output,
                openFn
            );
        }
    } // while
} // expand_include
  1. As previously mentioned, streams are good (and hence convertible to true) if they are readable.

  2. If I wasn’t concerned about exactly reproducing the presence or absence of a trailing newline, I could get rid of the conditional here.

Yes, I said straightforward and I meant it, despite that exciting-looking template declaration at the top there. As I’ve mentioned before, I test-drive these programs and I wanted to be able to write tests that exercised the expand_include function without going anywhere near the filesystem.

There are all kinds of discussions one can have about the purity or otherwise of unit tests, whether a unit test that goes to the file system is really a unit test, and all that, but I was primarily concerned with practicality. It’s simply harder to put together a test that uses the file system, and it’s even harder to make sure it runs reliably in different environments. (And different environments can occur on the same machine using the same source tree - consider running cmake from the command line against your IDE invoking cmake for you.)

Rather than let expand_include just go ahead and open files, I needed to pass in some thing - some function or object - that would "open" the included "file" and hand back an input stream. The actual program would use the real file opener, while for my tests I could pass in a mock. You might recognise this as an example of parameterizing from above.

The real function’s declaration looks like this

std::ifstream file_opener(std::string const& filename);

while the test mock looks like this

std::istringstream test_opener(std::string const& filename);

The two declarations are similar, but not substitutable. Both functions return std::istream-derived objects, but the specific types are different. I can’t finesse around that by returning a std::istream reference, because the stream it referred to would be a local that no longer exists once the function returns; I need to return the actual object by value. In another context, I might have considered making the file opening function look something like

std::unique_ptr<std::istream> opener(std::string const& filename);

but that feels excessive in a small program like this. Consequently, I decided to template expand_include on the opener function.
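To make the idea concrete, here is a self-contained sketch of the templated function driven by an in-memory opener. It follows the shape of the listing above, but read_line, is_include, quoted_name and the body of test_opener are hypothetical stand-ins written for this illustration, and the map of fake files is an assumption rather than a description of the real test suite.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in: read one line, leaving the stream state for the caller.
std::string read_line(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line;
}

// Hypothetical stand-in: does the line begin with #include?
bool is_include(std::string const& line) {
    return line.compare(0, 8, "#include") == 0;
}

// Hypothetical stand-in: the text between the first pair of double quotes.
std::string quoted_name(std::string const& line) {
    auto const open = line.find('"');
    if (open == std::string::npos)
        return std::string();
    auto const close = line.find('"', open + 1);
    if (close == std::string::npos)
        return std::string();
    return line.substr(open + 1, close - open - 1);
}

// Templated on the opener, so any callable that returns an input stream will do.
template<typename IncludeOpener>
void expand_include(std::istream& input, std::ostream& output, IncludeOpener openFn) {
    while (input) {
        auto line = read_line(input);
        if (!is_include(line)) {
            output << line << (!input.eof() ? "\n" : "");
        } else {
            auto included = openFn(quoted_name(line));
            expand_include(included, output, openFn);
        }
    }
}

// Mock opener: "files" live in a map, so no test touches the filesystem.
// Unknown names yield an empty stream.
std::istringstream test_opener(std::string const& filename) {
    static std::map<std::string, std::string> const files = {
        { "greeting", "hello\nworld\n" },
        { "nested", "start\n#include \"greeting\"\nfinish\n" }
    };
    auto const found = files.find(filename);
    return std::istringstream(found != files.end() ? found->second : std::string());
}

// Test convenience: expand a string and return the result as a string.
std::string expand(std::string const& text) {
    std::istringstream input(text);
    std::ostringstream output;
    expand_include(input, output, test_opener);
    return output.str();
}
```

Because the opener's type is deduced, a function returning std::ifstream can be passed in the same way without either opener needing to know about the other.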

Bugs

In line with the original, this implementation retains a documented bug:

A file that includes itself will not be diagnosed, but will eventually cause something to break.
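For the curious, one way such a loop might be diagnosed is to remember which files are currently being expanded and refuse to open one that is already in progress. The sketch below is not part of the stiX code; include_guard is a hypothetical name, and the idea of constructing one just before each recursive call is an assumption about where it would slot in.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical guard: registers a filename as "being expanded" for as long as
// the guard is alive, and throws if that filename is already registered.
class include_guard {
public:
    include_guard(std::set<std::string>& active, std::string const& filename)
        : active_(active), filename_(filename) {
        if (!active_.insert(filename_).second)
            throw std::runtime_error("recursive include of " + filename_);
    }
    // Only runs for guards that were fully constructed, so a rejected
    // filename never removes the entry owned by the outer guard.
    ~include_guard() { active_.erase(filename_); }

    include_guard(include_guard const&) = delete;
    include_guard& operator=(include_guard const&) = delete;

private:
    std::set<std::string>& active_;
    std::string filename_;
};

// Returns true if registering the same file twice is diagnosed.
bool self_include_is_diagnosed() {
    std::set<std::string> active;
    include_guard outer(active, "a.txt");
    try {
        include_guard again(active, "a.txt");
    } catch (std::runtime_error const&) {
        return true;
    }
    return false;
}
```

Including the same file twice in sequence stays legal under this scheme, because each guard removes its entry once its expansion has finished.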

Source code

Source code for this program, indeed the whole project, is available in the stiX GitHub repository. include is the second program of Chapter 3.

Library Reference

Raw strings have actually been available since C++11, but I don’t think I’ve ever encountered one in the wild. I did use them in my tests though, and it felt quite exciting.

Raw strings don’t need escape characters and consequently can be multiline. The syntax is

R"delimiter( raw characters )delimiter"

where the delimiter is some arbitrary word, and the raw characters can be anything except, obviously, the closing )delimiter combination.
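As a small illustration, the sketch below builds the same text once with escapes and once as a raw string, then uses a delimiter to hold text that itself contains the )" sequence. The variable names and sample text are made up for the example.

```cpp
#include <cassert>
#include <string>

// The same three lines, first with escapes...
std::string const escaped = "line one\n#include \"header\"\nline three\n";

// ...then as a raw string with an empty delimiter. Newlines and quotes
// are taken literally.
std::string const raw = R"(line one
#include "header"
line three
)";

// A delimiter (here xx) is needed when the text contains )" itself.
std::string const tricky = R"xx(say "(hi)" twice)xx";

// Usage, written as assertions.
void raw_string_examples() {
    assert(raw == escaped);
    assert(tricky == "say \"(hi)\" twice");
}
```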


Tagged code, and software-tools-in-c++

