Freelance software grandad
software created
extended or repaired
Follow me on Mastodon
Applications, Libraries, Code
Talks & Presentations
Since 2001 I have been working sporadically on a new parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML.I would love to have the time to port this to C++ and give Arabica HTML eating capabilities.
Freelance software grandad
software created
extended or repaired
Follow me on Mastodon
Applications, Libraries, Code
Talks & Presentations