| JezUK Ltd - The Coffee Grounds - October 2008 |
| << September 2008 | November 2008 >> |
Made an offering to the goddess Sulis Minerva today. Bang go my atheist credentials.
In a similar vein, Happy Diwali.
At the junction of Hurst Street and Sherlock Street, the traffic lights include a special little crossing just for cyclists. You can see the little bike and arrow painted on the pavement right at the top of the picture. There's a contraflow cycle lane running down Hurst Street, and the crossing lets cyclists cross to join Sherlock Street and pedal their way out of the city centre.
At least I assume that's what it's for, because that's how I've used it in the past. I've occasionally felt it made me wait a bit, but its little light turned green and away I went. Last night, however, it really, really didn't want me to cycle home from the pub.
I arrived at the crossing with the button pressed and the WAIT light already lit up, although there was no one else waiting. The traffic lights went through one full cycle, without the cycle light going green. The traffic lights went through another full cycle. Things were starting to get ridiculous, but I carried on waiting. The lights went through another full cycle. I waited through one further full cycle of the lights just to be sure, but the little green bike still refused to light up for me. I waited until there was no traffic and then, like the socialist-vegetarian-atheist-political-enthusiast-and-others potential terrorist I am, ran the red light.
As I said, I've used this crossing before. While I've sometimes felt it made me wait a bit, it's never blatantly refused to change for me. Perhaps it's broken? Perhaps there are different timings at different times of the day, to make us post-pub cyclists wait? The only other thing I can think of is there was traffic coming north-west up Hurst Street from Bishop Street. That's the road coming diagonally in from the right. It's got one of the magic induction loops, so the light to let traffic through only goes green when there's actually someone there*. Perhaps that light changing trumps the cycle crossing, so everything resets and we all go round again. If the traffic keeps coming, like the seemingly unending trail of minicabs last night, the cyclists just have to wait and wait and wait. If that's the case, then obviously my indignation is entirely unjustified. Can't hold up the car traffic, can we?
* Unless you're on a bike, in which case the loop doesn't pick you up, and you stare at a red light forever.
When I didn't hear the delivery chap knocking, my decorator answered the door and signed for the package.
Nothing unusual there except he is profoundly deaf and I am not.
To tell the truth, I have no idea. Development of Mangle, Arabica's XSLT engine, is ongoing, although progress varies according to the vagaries of how busy I am, how energetic I'm feeling, whether the kids have a swimming gala, and so on and so forth.
Although it's not done yet, it might well be done enough. I'm using the OASIS XSLT test suite to help drive development, and so it also provides a measure of how much has been done, what's working and what isn't. The results are published here, but all the code and test data is included in the download. The executive summary is the core stuff that you use every day works, but some of the bits round the edges (edges defined by my experience, anyway) are missing.
To my knowledge there's nothing that causes Mangle to crash, and anything that I haven't yet implemented generates a warning when the stylesheet is compiled.
Give it a go. It might do what you need.
If you run the tests, the final testsuite exercises the XSLT engine and it will list a number of failures. Quite a large number. XSLT development is ongoing, and I'm using the OASIS XSLT test suite to guide that. Consequently, the tests that fail generally indicate something I haven't done yet, rather than an actual bug. The XSLT tests are, therefore, ignored by make check (should you be lucky enough to be working on a Unixy platform).
Failures in any other tests are, however, indicative of a problem that needs investigating.
The "Probably long overdue release" bringing a big chunk of new functionality.
Source tar.bz2
http://downloads.sourceforge.net/arabica/arabica-2008-october.tar.bz2
Source tar.gz
http://downloads.sourceforge.net/arabica/arabica-2008-october.tar.gz
Source zip
http://downloads.sourceforge.net/arabica/arabica-2008-october.zip
The exciting new stuff is Taggle, a port of John Cowan's rather super TagSoup package.
TagSoup, if you're not familiar with it, is
a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML.
Obviously, if you have a SAX parser you can apply all your standard XML techniques - not only SAX filters, but building a DOM, applying XPaths, or XSLT transformations as well.
Cowan describes what TagSoup does as
TagSoup is designed as a parser, not a whole application; it isn't intended to permanently clean up bad HTML, as HTML Tidy does, only to parse it on the fly. Therefore, it does not convert presentation HTML to CSS or anything similar. It does guarantee well-structured results: tags will wind up properly nested, default attributes will appear appropriately, and so on.
The semantics of TagSoup are as far as practical those of actual HTML browsers. In particular, never, never will it throw any sort of syntax error: the TagSoup motto is "Just Keep On Truckin'". But there's much, much more. For example, if the first tag is LI, it will supply the application with enclosing HTML, BODY, and UL tags. Why UL? Because that's what browsers assume in this situation. For the same reason, overlapping tags are correctly restarted whenever possible: text like:
This is <B>bold, <I>bold italic, </b>italic, </i>normal text
gets correctly rewritten as:
This is <b>bold, <i>bold italic, </i></b><i>italic, </i>normal text.
Looks straightforward, doesn't it? Well, that's a simple example and it's still tricky and awkward to get right in practice. Cowan's patience in pursuing this, and what looks like a rather elegant solution, are to be applauded. Porting his code to C++ was quick and painless, and Taggle is a useful addition to Arabica. Thanks, John.
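To make the restart behaviour in the bold/italic example a little more concrete, here is a minimal sketch in plain C++. It is not TagSoup's or Taggle's code, and restart_overlapping_tags is a made-up name for illustration. It assumes the input contains only simple, attribute-free tags, keeps a stack of open tags, and when a close tag arrives out of order it closes everything above the match and then reopens those tags.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of TagSoup or Arabica.
// Assumes simple tags only, e.g. <b> and </b>, with no attributes.
std::string restart_overlapping_tags(const std::string& html)
{
  std::vector<std::string> open;  // currently open tags, outermost first
  std::string out;
  std::size_t pos = 0;

  while (pos < html.size())
  {
    const std::size_t lt = html.find('<', pos);
    const std::size_t gt = (lt == std::string::npos) ? lt : html.find('>', lt);
    if (gt == std::string::npos)
    {
      out.append(html, pos, std::string::npos);  // no more complete tags
      break;
    }
    out.append(html, pos, lt - pos);             // text before the tag

    std::string name = html.substr(lt + 1, gt - lt - 1);
    pos = gt + 1;
    const bool closing = !name.empty() && name[0] == '/';
    if (closing)
      name.erase(0, 1);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (!closing)
    {
      open.push_back(name);
      out += "<" + name + ">";
      continue;
    }

    // Find the innermost open tag with this name.
    const auto match = std::find(open.rbegin(), open.rend(), name);
    if (match == open.rend())
      continue;                                  // stray close tag: drop it

    const std::size_t index = static_cast<std::size_t>(open.rend() - match) - 1;
    // Tags opened after the match must be closed first, then restarted.
    const std::vector<std::string> restart(
        open.begin() + static_cast<std::ptrdiff_t>(index) + 1, open.end());
    for (std::size_t i = open.size(); i > index; --i)
      out += "</" + open[i - 1] + ">";
    open.resize(index);
    for (const std::string& tag : restart)
    {
      open.push_back(tag);
      out += "<" + tag + ">";
    }
  }

  // Close anything still open at the end of the input.
  for (auto it = open.rbegin(); it != open.rend(); ++it)
    out += "</" + *it + ">";
  return out;
}
```

Fed the overlapping fragment quoted above, this produces the rewritten form Cowan gives (minus the trailing full stop). Real HTML needs far more than this, which is rather the point of TagSoup.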
Arabica Taggle chews through HTML, providing the same SAX XMLReader interface as the XML parser, and can be used in exactly the same way. HTML source can be fed through SAX filter stacks, used to build DOM trees, queried with XPath, or transformed using XSLT.
There are, of course, many other fixes and changes. Most are relatively minor, and if you haven't been bitten by them you won't notice. The most significant changes are in Arabica's XSLT engine, Mangle. While still not feature complete and under development, it takes, in this release, a fairly big step forward.
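One of the XSLT changes listed below is the switch from std::sort to std::stable_sort for xsl:sort. As a rough illustration of why that matters, here is a small standalone sketch. It is not Mangle's code: to_number, numeric_less and numeric_sort are hypothetical names, and putting the non-numeric strings ahead of the numbers is an assumption made for the example. The point is only that all non-numeric values compare as equivalent, so a stable sort is what keeps them in their original relative order.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: true if the whole of s parses as a number.
bool to_number(const std::string& s, double& value)
{
  char* end = nullptr;
  value = std::strtod(s.c_str(), &end);
  return !s.empty() && end == s.c_str() + s.size() && !std::isnan(value);
}

// Numbers compare by value. Non-numeric strings are all equivalent to each
// other and (an assumption for this sketch) sort ahead of every number.
bool numeric_less(const std::string& lhs, const std::string& rhs)
{
  double l = 0.0;
  double r = 0.0;
  const bool l_ok = to_number(lhs, l);
  const bool r_ok = to_number(rhs, r);
  if (l_ok && r_ok)
    return l < r;
  return !l_ok && r_ok;
}

// std::stable_sort guarantees equivalent elements keep their input order;
// std::sort makes no such promise.
std::vector<std::string> numeric_sort(std::vector<std::string> values)
{
  std::stable_sort(values.begin(), values.end(), numeric_less);
  return values;
}
```

Sorting 10, pear, 2, apple, 1 this way leaves pear ahead of apple, as they were in the input, followed by 1, 2, 10. With std::sort the order of pear and apple would be unspecified.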
SAX
AttributesImpl.getIndex. Thanks to Isak Johnsson for that, and what on earth was I thinking to me
XMLReaderInterface. It only needs the string_type and string_adaptor. Any additional parameters are only of interest to the implementing parser class
DOM
TextCoalescer filter into the DOM builder, so that consecutive bits of text get applied to a single Text or CDATA node, rather than as a series of nodes. (A series of nodes is perfectly legal, it's just slightly unexpected. Even to me, and I work with DOMs a lot :)
XPath
XPathValuePtr and XPathExpressionPtr both exposed implementation details and provided an interface that was inconsistent with the DOM classes, because you accessed the member functions via -> rather than . At the time, I was just pleased to have got the XPath stuff done and wasn't really fussed, so I left it. Since then though, it's niggled and niggled away at the back of my mind and now I've done something about it. XPathValuePtr has become XPathValue and XPathExpressionPtr has become XPathExpression, with the member functions accessed through the . operator. The XPathValuePtr and XPathExpressionPtr names and -> member access are retained for the meantime, so that existing code won't be broken. Existing code using XPathValuePtr will still work, but new stuff should use XPathValue.
prefix:* didn't compile. I had no test for it, and had overlooked it. Now I do, and it isn't
text() test to match CDATA nodes as well as text nodes
XSLT
xsl:apply-imports call
node() matches any node of any type. In an XSLT match pattern, node() matches everything except attributes and the document root node. Fixed.
xsl:for-each, xsl:if, and xsl:choose
std::stable_sort instead of std::sort. When xsl:sort specifies a numerical sort, but you've got some string data in there, we need to maintain the relative positions of that string data. This is the first time I can recall actually using std::stable_sort. I will mark it down in my big book of programming accomplishments.
xsl:message can contain another xsl:message - now handled properly
xsl:choose has at least one xsl:when
xsl:template mode attribute is not empty
xsl:sort attribute values
xsl:call-template now throws if it can't find a matching template
current() in match patterns
xsl:for-each selects a node-set
xsl:param
xsl:stylesheet now allows top-level elements when they are in a foreign namespace
position(), last() and positional predicates in match patterns
select attribute and text content
xsl:element unprefixed names - when no namespace uri is supplied are in the default namespace
@xmlns|@xmlns:* selects no nodes
std::cerr, not std::cout
Build and installation
std::mbstate_t and/or mbstate_t. Some platforms don't have it (VxWorks, for example)
Other bits and bobs
../
A couple of months ago a release was, I said, impending. And it really was, but then I found a niggly thing I really wanted to fix. And went on holiday. And got really busy at work. And all that other stuff that happens when you're not programming.
There really is a release coming now, because I'm cutting it now. The source bundles are up on Sourceforge now, and tagged in subversion. Release notes should follow later this weekend or early next week. I'll write up the niggly thing too, because it's quite a nice one.
The last release was just over a year ago. That's probably a bit too long.
http://news.bbc.co.uk/1/hi/england/hampshire/7675699.stm [added 17th Oct 2008]