Jez Higgins

Freelance software grandad
software created
extended or repaired


Follow me on Mastodon
Applications, Libraries, Code
Talks & Presentations

Hire me
Contact

Older posts are available in the archive or through tags.

Feed

Tuesday 11 January 2022 The Forest Road Reader, No 2.67 : STinC++ - Chapter Five Wrap Up

STinC++ rewrites the programs in Software Tools in Pascal using C++

  • find - searching text using regular expressions

  • change - and then replacing the matches with something else

The bulk of chapter five was introducing and describing regular expressions. Regexes are ubiquitous these days, but you’d still struggle to find an introduction that’s as clear as that in Software Tools in Pascal. Kernighan and Plauger are both good writers individually, but together they’ve got some kind of superpower. Introducing regexes is all very well, but the point of chapter five is implementing a regular expression engine, and that was really good fun.

I don’t have a strong computer science background. I trained in electronic engineering and came into software sort of by accident. I have been doing this for getting on for 30 years, so I’ve picked up a few things along the way, but I have a generally utilitarian approach. I probably could invert a tree, but I’d look for a library function first.[1] Anyway, I learned a whole lot building my little regex library and, like last chapter’s sort algorithms, unsophisticated as my implementation is, it’s something I would never otherwise have written. It’s given me an appreciation of regexes I didn’t have before. In my code, for instance, I can see a couple of pretty obvious optimisations, mainly in handling the no-match case. The Kleene star implementation is greedy, but I know how a non-greedy version would drop right in. I can see how the code could be extended to support capture groups. It’s a whole new little arena that’s opened up for me. Thanks Brian, thanks Bill.[2]
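To make the greedy versus non-greedy distinction concrete, here is a minimal sketch. It is not the code from the STinC++ library: the function star, the no_match constant, and the idea of passing the rest of the pattern in as a callable are all assumptions made for illustration. Both variants consider the same candidate end points for the repeated match; the only difference is the order in which they try them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string_view>

// Hypothetical sentinel meaning "the pattern did not match".
constexpr std::size_t no_match = static_cast<std::size_t>(-1);

// Matches zero or more characters satisfying `repeated`, starting at
// text[start], followed by whatever `rest` accepts. Returns the position
// where the starred part ends, or no_match.
std::size_t star(std::string_view text, std::size_t start,
                 std::function<bool(char)> const& repeated,
                 std::function<bool(std::string_view, std::size_t)> const& rest,
                 bool greedy) {
  // Find the furthest position the repeated matcher can reach.
  std::size_t limit = start;
  while (limit != text.size() && repeated(text[limit]))
    ++limit;

  if (greedy) {
    // Longest run first, backing off one character at a time.
    for (std::size_t end = limit;; --end) {
      if (rest(text, end)) return end;
      if (end == start) break;
    }
  } else {
    // Shortest run first, growing one character at a time.
    for (std::size_t end = start; end <= limit; ++end)
      if (rest(text, end)) return end;
  }
  return no_match;
}

void star_example() {
  auto any_char = [](char) { return true; };
  auto closing = [](std::string_view t, std::size_t pos) {
    return pos < t.size() && t[pos] == '>';
  };

  // Start just after the opening '<' of "<b><c>".
  assert(star("<b><c>", 1, any_char, closing, true) == 5);   // runs to the last '>'
  assert(star("<b><c>", 1, any_char, closing, false) == 2);  // stops at the first '>'
  assert(star("<bc", 1, any_char, closing, true) == no_match);
}
```

Under these assumptions, switching from greedy to non-greedy is only a change in loop direction, which is one way a non-greedy version could "drop right in".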

It helped cement some C++ for me too. There are always things you know, but perhaps not consciously or with full understanding. I knew, in an abstract kind of way, that C++ lambda functions have value semantics. However, since I’d pretty much always used lambda functions as arguments to other functions, declared right at the call site, I hadn’t appreciated that they enabled, for example, runtime polymorphic behaviour in value objects. As the code compiles a regular expression into a pattern_matcher object, it’s building a std::vector of matcher objects. Each of those matcher objects can have different behaviour - matching the start of a line, a single character, one of a set of characters, or whatever. "Classically", if I’d written this code 10 years ago say, at the very least I’d have a base class, possibly abstract, and then a series of leaf classes to describe the different behaviours. Inside the pattern_matcher we’d have a list of pointers, there’d be a whole lot of memory management, and we’d all be a bit nervous. Today, thanks to lambda functions and std::function, we can do all that with value objects. There’s no explicit memory handling, no worries about lifetime management, and the code is compact and to the point. It’s fantastic.
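A rough sketch of that idea follows. It is an illustration only, not the actual pattern_matcher code: the matcher alias, the factory functions is_char, is_one_of and is_any, and the deliberately simplistic matches function are all hypothetical names invented here. The point is that each element of the vector carries its own behaviour, yet the vector holds plain values with no base class and no pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical: a matcher is any callable that tests a single character.
using matcher = std::function<bool(char)>;

// Each factory returns a lambda wrapped in std::function. The captured
// state lives inside the returned value.
matcher is_char(char expected) {
  return [expected](char c) { return c == expected; };
}

matcher is_one_of(std::string set) {
  return [set = std::move(set)](char c) {
    return set.find(c) != std::string::npos;
  };
}

matcher is_any() {
  return [](char) { return true; };
}

// Simplified for illustration: one matcher per character, and the whole
// text must match. A real engine would also handle anchors and closures.
bool matches(std::vector<matcher> const& pattern, std::string const& text) {
  if (pattern.size() != text.size()) return false;
  for (std::size_t i = 0; i != pattern.size(); ++i)
    if (!pattern[i](text[i])) return false;
  return true;
}

void matcher_example() {
  // 'c' or 'h', then 'a', then any character.
  std::vector<matcher> pattern{is_one_of("ch"), is_char('a'), is_any()};
  assert(matches(pattern, "cat"));
  assert(matches(pattern, "hay"));
  assert(!matches(pattern, "bat"));

  // Copying the vector copies the behaviour; the original is untouched.
  std::vector<matcher> copy = pattern;
  copy[0] = is_char('b');
  assert(matches(copy, "bat"));
  assert(!matches(pattern, "bat"));
}
```

The copy at the end is the part the classical base-class design makes awkward: with a vector of pointers you would need a clone function or shared ownership, whereas here ordinary copy and assignment do the right thing.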

In my write-ups of find and compare I quoted only a little bit of Kernighan and Plauger’s Pascal. As the problems we’re working on get that bit bigger, the C++ I’m writing is further and further away from the Pascal they wrote and comparisons between the two are less and less useful. We’ve established, for instance, that Kernighan and Plauger are hampered by Pascal’s lack of a proper string type, and there’s not much point in saying it again. The bigger problems they now present naturally have a bigger solution space, and C++ (or your current language of choice) lets us explore that solution space much more freely.

Two programs for chapter five, but only one coming up in chapter six, Editing. It’s going to be a bit of a job, and I think I’m going to have to find a new way to write about it too. I hope you’ll follow along with me.


1. Assuming there are any, because I can’t think of a reason why you’d want to invert a tree.
2. PJ Plauger is widely known as Bill.

Tagged code, and software-tools-in-c++

