Jez Higgins

Freelance software grandad
software created
extended or repaired


Follow me on Mastodon
Applications, Libraries, Code
Talks & Presentations

Hire me
Contact

Older posts are available in the archive or through tags.

Feed

Thursday 06 August 2020 The Forest Road Reader, No 2.51 : STinC++ - archive contents, delete, print

STinC++ rewrites the programs in Software Tools in Pascal using C++

At the end of my last installment, I had archive creation going, albeit with mocked up input. Putting together inputs and expected outputs as inline strings quickly became rather tedious, and I extended out my tests to pick up test cases from the filesystem. Subsequently, I was able to reuse the same test cases to drive archive from the very top, passing in the command line arguments. I was even able to reuse code from compare to verify the program output.

With archive creation out of the way, the next step seemed to be listing the contents. I followed that with removing files from the archive, and then printing a file to standard out. These are the -t, -d, and -p command line options respectively. Recall that the archive format is a header line

-h- name size

followed by the file contents, the next header line and contents, and so on.
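
To make the header format concrete, here is a minimal sketch of how a line like that might be parsed. The archive_header struct and this parse_header are hypothetical stand-ins, not the code from the stiX repository. The sketch assumes that names contain no whitespace and that size is a count of characters.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the header type used in the post.
struct archive_header {
  std::string name;
  std::size_t filesize = 0;
};

// Parse a line of the form "-h- name size".
// Throws if the marker is missing or a field cannot be read.
archive_header parse_header(std::string const& line) {
  std::istringstream fields(line);
  std::string marker;
  archive_header header;

  if (!(fields >> marker >> header.name >> header.filesize) || marker != "-h-")
    throw std::runtime_error("not an archive header: " + line);

  return header;
}
```

Given the line "-h- notes.txt 42", this yields a header whose name is notes.txt and whose filesize is 42. Throwing on a malformed line is a design choice for the sketch. The real program may well report errors differently.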

These three operations, which operate on an existing archive file, all have a similar shape

void table_archive(
  std::istream& archive_in,
  std::ostream& out
) {
  archive_in.peek();

  while (archive_in && !archive_in.eof()) {
    auto header_line = getline(archive_in);
    auto header = parse_header(header_line);

    out << header.name << '\t' << header.filesize << '\n';

    skip_entry(archive_in, header);

    archive_in.peek();
  }
} // table_archive

void delete_from_archive(
  std::istream& archive_in,
  std::vector<std::string> const& files_to_remove,
  std::ostream& archive_out
) {
  archive_in.peek();

  while (archive_in && !archive_in.eof()) {
    auto header_line = getline(archive_in);
    auto header = parse_header(header_line);

    if (of_interest(files_to_remove, header))
      skip_entry(archive_in, header);
    else {
      archive_out << header;
      copy_contents(archive_in, header, archive_out);
    }

    archive_in.peek();
  } // while ...
} // delete_from_archive

void print_files(
  std::istream& archive_in,
  std::vector<std::string> const& files,
  std::ostream& out
) {
  archive_in.peek();

  while (archive_in && !archive_in.eof()) {
    auto header_line = getline(archive_in);
    auto header = parse_header(header_line);

    if (of_interest(files, header))
      copy_contents(archive_in, header, out);
    else
      skip_entry(archive_in, header);

    archive_in.peek();
  } // while ...
} // print_files

Before I lined them up like this, I hadn’t realised just how similar they are. There’s a fairly obvious refactoring to do next time I touch the code.
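
One possible shape for that refactoring, offered as a sketch rather than as what eventually landed in the repository, is to pull the loop out into a single function and pass in what to do with each entry. The name for_each_entry is invented for illustration. The helpers parse_header, skip_entry, copy_contents, and of_interest are simplified guesses at the ones used above, with error handling left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for the post's helpers, which are not shown there.
struct archive_header {
  std::string name;
  std::size_t filesize = 0;
};

archive_header parse_header(std::string const& line) {
  std::istringstream fields(line);
  std::string marker;
  archive_header header;
  fields >> marker >> header.name >> header.filesize;  // "-h- name size"
  return header;
}

void skip_entry(std::istream& in, archive_header const& header) {
  in.ignore(static_cast<std::streamsize>(header.filesize));
}

void copy_contents(std::istream& in, archive_header const& header, std::ostream& out) {
  for (std::size_t i = 0; i != header.filesize; ++i) {
    char c;
    if (!in.get(c)) break;  // archive is shorter than the header claims
    out.put(c);
  }
}

bool of_interest(std::vector<std::string> const& names, archive_header const& header) {
  return std::find(names.begin(), names.end(), header.name) != names.end();
}

using entry_handler = std::function<void(std::istream&, archive_header const&)>;

// The shared loop: read a header, hand the entry to the handler, look ahead.
// The handler must consume exactly the entry's contents, by copying or skipping.
void for_each_entry(std::istream& archive_in, entry_handler const& handle) {
  archive_in.peek();
  while (archive_in && !archive_in.eof()) {
    std::string header_line;
    std::getline(archive_in, header_line);
    handle(archive_in, parse_header(header_line));
    archive_in.peek();
  }
}

void table_archive(std::istream& archive_in, std::ostream& out) {
  for_each_entry(archive_in, [&](std::istream& in, archive_header const& header) {
    out << header.name << '\t' << header.filesize << '\n';
    skip_entry(in, header);
  });
}

void delete_from_archive(std::istream& archive_in,
                         std::vector<std::string> const& files_to_remove,
                         std::ostream& archive_out) {
  for_each_entry(archive_in, [&](std::istream& in, archive_header const& header) {
    if (of_interest(files_to_remove, header)) {
      skip_entry(in, header);
      return;
    }
    archive_out << "-h- " << header.name << ' ' << header.filesize << '\n';
    copy_contents(in, header, archive_out);
  });
}

void print_files(std::istream& archive_in,
                 std::vector<std::string> const& files,
                 std::ostream& out) {
  for_each_entry(archive_in, [&](std::istream& in, archive_header const& header) {
    if (of_interest(files, header))
      copy_contents(in, header, out);
    else
      skip_entry(in, header);
  });
}
```

Each operation shrinks to the one decision that makes it different, and the peek() dance lives in exactly one place. std::function keeps the sketch simple. A template parameter would do the same job without the indirection.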

archive_in.peek() is a cheeky lookahead that I’m using to set the end-of-file (technically end-of-input) flag before I try to read anything, rather than afterwards. Let’s imagine we’ve been churning through an archive file, and have just read the last header and contents. We now call archive_in.peek() which goes away and tries to find the next character of the input. There isn’t one, so the stream sets its end of file flag. Looping back up to the top of the while loop, we call archive_in.eof(), which now returns true causing us to break out of the loop. Without the peek(), that call to archive_in.eof() would return false because we hadn’t reached the end of the input yet. The subsequent call to getline() would hit end of input, so would return an empty string. We would then need to handle that inside the loop, and the whole thing starts to get a bit messy. I don’t think I’ve used peek() in this way before, but I wish I’d worked it out years ago.
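
The difference is easy to see in isolation. The following sketch reads lines from any std::istream using the same loop condition, once without the lookahead and once with it. The function names are made up for the illustration, and std::getline stands in for the getline helper used above.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Loop on eof() alone, with no lookahead.
// The final getline hits end of input and contributes an empty string.
std::vector<std::string> read_without_peek(std::istream& in) {
  std::vector<std::string> lines;
  while (in && !in.eof()) {
    std::string line;
    std::getline(in, line);
    lines.push_back(line);
  }
  return lines;
}

// Same loop, but peek() sets the end-of-input flag before each test,
// so the loop stops before attempting a read that cannot succeed.
std::vector<std::string> read_with_peek(std::istream& in) {
  std::vector<std::string> lines;
  in.peek();
  while (in && !in.eof()) {
    std::string line;
    std::getline(in, line);
    lines.push_back(line);
    in.peek();
  }
  return lines;
}
```

Fed a std::istringstream holding "one\ntwo\n", read_without_peek returns three strings, the last of them empty, while read_with_peek returns just "one" and "two".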

Almost There

With the create, list contents, delete, and print operations done, the archive program is almost complete, with only the extract and update operations to go. Extracting a file is essentially the same as printing, except we write to a file rather than to standard out. Updating can be achieved, essentially, by combining the delete and create operations. If we first remove the files to be updated from the archive, then append the new contents to the archive, we’ll have successfully updated it. That’s what I reckon anyway. I’m off to find out.
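
As a rough sketch of that delete-then-append idea, and very much a guess rather than the eventual implementation, the following works on in-memory streams. The update_archive function and the new_files map are hypothetical. The map stands in for reading the replacement files from disk.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical: replacement contents keyed by file name.
using new_files = std::map<std::string, std::string>;

void update_archive(std::istream& archive_in,
                    new_files const& updates,
                    std::ostream& archive_out) {
  // Step 1, the delete pass: copy across every entry that is not being updated.
  archive_in.peek();
  while (archive_in && !archive_in.eof()) {
    std::string header_line;
    std::getline(archive_in, header_line);

    std::istringstream fields(header_line);
    std::string marker, name;
    std::size_t size = 0;
    if (!(fields >> marker >> name >> size) || marker != "-h-")
      break;  // not a header line; stop rather than guess

    std::string contents(size, '\0');
    archive_in.read(&contents[0], static_cast<std::streamsize>(size));
    contents.resize(static_cast<std::size_t>(archive_in.gcount()));

    if (updates.count(name) == 0)
      archive_out << "-h- " << name << ' ' << contents.size() << '\n' << contents;

    archive_in.peek();
  }

  // Step 2, the create pass: append the new contents to the end of the archive.
  for (auto const& entry : updates)
    archive_out << "-h- " << entry.first << ' ' << entry.second.size() << '\n'
                << entry.second;
}
```

One consequence of doing it this way is that an updated file moves to the end of the archive rather than staying in its original position.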

Source Code

Source code for this program, indeed the whole project, is available in the stiX GitHub repository. archive is the sixth program of Chapter 3.

Endnotes

This whole endeavour relies on Software Tools in Pascal and Software Tools, both by Brian W Kernighan and PJ Plauger. I love these books and commend them to you. They’re both still in print, but new copies are, frankly, just ridiculously expensive. Happily, there are plenty of second-hand copies floating round, or you can borrow them from The Internet Archive using the links above.

For this project I’ve been using JetBrains’ CLion, which I liked enough to buy a license.

The test harness I’m using is Catch. I’ve been aware of Catch pretty much since it was first released, but this project is the first time I’ve really got in and used it. I like it and will use it again.


Tagged code, and software-tools-in-c++

