ARCHANGEL is a joint project involving the University of Surrey, The National Archives, and the Open Data Institute, investigating how we might help ensure the long-term integrity of digital documents stored in public archives.
When an archive produces a physical artefact, it's relatively easy to establish that it is indeed the original document, preserved unaltered since it was first deposited.
But a digital artefact?
A digital document can be copied endlessly without degradation, but it can also be undetectably altered, inadvertently or deliberately, with benign or malign intent.
How can we be confident that what we're presented with is, in fact, identical to the document that was first stored in the archive?
ARCHANGEL is trying to address this problem, and in this talk I'll describe some of the approaches and technologies we're using.
Spoilers: Yes, it includes blockchains, but it's about the only blockchain application you'll hear of that doesn't immediately make you feel dirty. It might also include machine learning, but it's machine learning for justice.
I presented this session, which features a primer on archival practice, a bluffer's guide to blockchains, and a brief introduction to machine learning, last night at Nor(Dev), and I think it went pretty well. I certainly enjoyed presenting it. It was a pleasure to be there as the opening act for my friend Russel, who spoke about Me-TV, an honest-to-god broadcast TV client, and his journey taking it from what was a dead project, written in C and using the Xine video player, to its current incarnation, written in Rust using GStreamer. It was really good, and even though Russel only showed a small amount of Rust code, it made me think about the language very differently to the various introductions to it that I've read. Real code always wins over toy examples.
ARCHANGEL: Trusted Archives of Digital Public Documents - Paper presented at ACM Document Engineering 2018.
Looking for Life on a Flat Earth - The astonishing New Yorker article I mention at the start of the talk.