<< December 2007 February 2008 >>

Thursday 31 January, 2008
#[Arabica]

Moments before I'm about to go to bed, I discover Taggle fails (in a coredumpy way) for documents which have a DOCTYPE declaration. Will have a look at a fix in the morning.

And there's the fix in subversion. [added 1st Feb 2008]

#[linkfarm] Pravda.ru: Vegetarianism proves to be perversion of nature - Furthermore, cosmetologists say that a typical vegetarian has dry and fragile hair, dull eyes and unhealthy complexion. They can hardly stand criticism and have a low boiling point. They raise their voice, swing their arms and splutter when arguing. They are weak even in their logic.
Yep, that's me. Weak, even in my logic.

   * Russ L said 'Cosmetologists'?

As in those well versed in the use of cosmetics? [added 31st Jan 2008]

Yes. Astonishingly, it's a real word. [added 31st Jan 2008]

   * allan@allankelly.net said The number of vegetarians in Russia is much lower than in the West. I'm sure if you went back 20, 30 or especially 40 years you would find similar comments in English newspapers and magazines. [added 1st Feb 2008]

It's probably less to do with Russian attitudes to not eating meat than it is with the fact that pravda.ru appears to be at the same journalistic level as the Sunday Sport. [added 1st Feb 2008]


#[Arabica]Taggle: Building the code

If you've grabbed the code from subversion:

svn co svn://jezuk.dnsalias.net/jezuk/arabica/branches/tagsoup-port
you might be wondering how to build it.

For Visual Studio 2005 users, open up the vs8\taggle.sln project and build away. It should just work. If it doesn't, then check the project build notes for information on setting up search paths and things.

For Unixy types, you will need a mighty three steps:

  1. autoreconf - to create the configure script
  2. ./configure - to dig out where the various bits and pieces Arabica needs are, and to create the Makefiles
  3. make - to, erm, make everything

Problems, questions, issues? Get in touch.


#[linkfarm] ALEC (LIFE-SIZE OMNIBUS) - will collect all the stories from THE KING CANUTE CROWD, THREE PIECE SUIT, HOW TO BE AN ARTIST, AFTER THE SNOOTER, as well as the very early out-of-print ALEC stories and tons of bonus material. It will definitely be the definitive ALEC tome.
Coming in 2009. I have most, if not all, of it already but won't be surprised if I wind up with a copy ...

Wednesday 30 January, 2008
#[linkfarm] Swamp Thing: Anatomy Lesson
Big fat PDF of Alan Moore's first issue on the comic that really made his name, wonderfully drawn by Steve 'jolly good indeed' Bissette and John 'frankly terrific' Totleben.

#[Arabica]Taggle: And there it is ...

Taggle, Arabica's port of the TagSoup HTML parser, now builds and runs. It dodges pretty much every encoding issue on the planet, but as a first go it's really quite pleasing. Give it this -

This is <B>bold, <I>bold italic, </b>italic, </i> normal text

and get this

<html>
    <body>This is
        <b>bold,
            <i>bold italic, </i>
        </b>
    <i>italic, </i>
normal text
    </body>
</html>
(Ok, you have to squint a bit at the indenting, but that's a separate issue.)

If you want to have a play, check out the tagsoup-port branch from subversion:

svn co svn://jezuk.dnsalias.net/jezuk/arabica/branches/tagsoup-port

In examples/Taggle, there's a little command line application that reads HTML documents and prints the corrected markup to the console.

I'll merge this back into the trunk in the next few days.

zcorpan [e] said Why not implement an HTML5 parser instead of porting TagSoup? [added 1st Feb 2008]

Time and inclination. Porting TagSoup to C++ took me a few hours. It was fun, and quite an easy win. Having done it, I'm surprised that nobody's done it before.

Writing an HTML5 parser needs rather more time than I have - not only in writing the code, developing the test suite, but then tracking the standard as it emerges. Even if I had the time, I don't actually have the inclination, because it's not something that really interests me enough right now. Sorry :)

[added 2nd Feb 2008]

Tuesday 29 January, 2008
#[linkfarm] Superfriends meets Friends
#[Arabica]Taggle: Bringing HTML Parsing to Arabica

After a rather intense return to work after Christmas, I'm taking a bit of a break from Arabica's XSLT development for something a bit lighter - porting John Cowan's excellent TagSoup package to C++ and Arabica. TagSoup, if you're not familiar with it, is

a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML.

Obviously, if you have a SAX parser you can apply all your standard XML techniques - not only SAX filters, but building a DOM, applying XPaths, or XSLT transformations as well.

Cowan describes what TagSoup does as

TagSoup is designed as a parser, not a whole application; it isn't intended to permanently clean up bad HTML, as HTML Tidy does, only to parse it on the fly. Therefore, it does not convert presentation HTML to CSS or anything similar. It does guarantee well-structured results: tags will wind up properly nested, default attributes will appear appropriately, and so on.

The semantics of TagSoup are as far as practical those of actual HTML browsers. In particular, never, never will it throw any sort of syntax error: the TagSoup motto is "Just Keep On Truckin'". But there's much, much more. For example, if the first tag is LI, it will supply the application with enclosing HTML, BODY, and UL tags. Why UL? Because that's what browsers assume in this situation. For the same reason, overlapping tags are correctly restarted whenever possible: text like:

This is <B>bold, <I>bold italic, </b>italic, </i>normal text

gets correctly rewritten as:

This is <b>bold, <i>bold italic, </i></b><i>italic, </i>normal text.

Looks simple, doesn't it? Well, that's a simple example and it's still a tricky and awkward result in practice. Cowan's patience in pursuing this and what looks like a rather elegant solution is to be applauded. Porting his code to C++ has been pretty quick and painless so far, and I expect the new piece, which I've called Taggle, to be finished pretty soon. Arabica will be stronger for it - thanks John!
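
As a rough illustration of the restart behaviour in the bold/italic example above, here is a toy sketch that re-nests overlapping tags by keeping a stack of open elements. It is emphatically not TagSoup's or Taggle's actual code: the function name renest is made up for this example, it assumes bare <name> and </name> tags with no attributes, and it restarts every element it had to close early, which is a simplification.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Toy sketch only (hypothetical helper, not part of TagSoup or Taggle).
// Re-nests overlapping tags using a stack of currently open elements.
// Assumes bare <name> and </name> tags: no attributes, comments or entities.
std::string renest(const std::string& in)
{
  std::vector<std::string> open; // open elements, outermost first
  std::string out;
  std::string::size_type pos = 0;

  while(pos < in.size())
  {
    if(in[pos] != '<')
    {
      // plain text: copy everything up to the next tag
      std::string::size_type next = in.find('<', pos);
      if(next == std::string::npos)
        next = in.size();
      out.append(in, pos, next - pos);
      pos = next;
      continue;
    }

    const std::string::size_type end = in.find('>', pos);
    if(end == std::string::npos)
      break; // unterminated tag: this toy simply drops it

    const bool closing = (pos + 1 < end) && (in[pos + 1] == '/');
    const std::string::size_type start = pos + (closing ? 2 : 1);
    std::string name = in.substr(start, end - start);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    pos = end + 1;

    if(!closing)
    {
      out += "<" + name + ">";
      open.push_back(name);
      continue;
    }

    if(std::find(open.begin(), open.end(), name) == open.end())
      continue; // stray close tag with no matching open: ignore it

    // close anything opened inside the element, remembering it for restart
    std::vector<std::string> restart;
    while(open.back() != name)
    {
      out += "</" + open.back() + ">";
      restart.push_back(open.back());
      open.pop_back();
    }
    assert(!open.empty());
    out += "</" + name + ">";
    open.pop_back();

    // reopen the interrupted elements, outermost first
    for(std::vector<std::string>::reverse_iterator r = restart.rbegin(); r != restart.rend(); ++r)
    {
      out += "<" + *r + ">";
      open.push_back(*r);
    }
  }

  // close whatever is still open, innermost first
  for(; !open.empty(); open.pop_back())
    out += "</" + open.back() + ">";
  return out;
}
```

Fed the overlapping bold/italic input from Cowan's description, this produces the same nesting he shows. A real parser has a great deal more to worry about, which is rather the point of porting his code instead of writing this sort of thing from scratch.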


Monday 28 January, 2008
#[linkfarm] Twilight Heroes - Justice Served Nightly
Interesting and clever web playable game. [Play This Thing review]

#

In the unlikely event you sent me email between about 17:00 on Friday and 9ish this morning, I won't have received it and you may want to send it again.


Friday 25 January, 2008
#[linkfarm] Windows Services Can Install Themselves - Never use the InstallUtil.exe utility that ships with the .NET SDK again.
#Unread Stacks

In the basket under the bed, I've got 18 unread books, and there are others lurking on the shelves downstairs. On my desk I have 6 unread technical books. The comics to read pile is about 12 inches and 22 separate volumes deep.

How did I get here?

Ken [e] [w] said More to the point: how do you get away?

You may recall me mentioning I've got a stack of around ten Patrick O'Brian 'Master and Commander' novels. Now, while historical adventure-soaps may be my bag, I think I'm either going to mothball these books out of sight for my retirement, or just bung 'em in the charity shop.

There are too many other good things to read, perhaps a touch of ruthlessness is in order? [added 25th Jan 2008]

Russ L said I currently have 18 unread too, aside from the two I'm reading at the moment.

I'm glad I'm not as badly off as I might be with this.

On a completely unrelated note, this: http://russl.wordpress.com/2008/01/25/splenetic-memetics-birmingham-blog-tig-tag-tick [added 25th Jan 2008]

I get out of it by reading more of them I suppose. The curious thing is the backlog seems to be growing (or at least not getting any smaller) despite putting myself under a book-buying embargo some time last summer. That's for a definition of books which doesn't include work books or big comics, natch. That said, I violated the embargo just last Wednesday, buying a copy of Cyclecraft. I will, however, finish it this evening, and I've read nine other books this month already. Why isn't the stack diminishing? [added 25th Jan 2008]
smellygit said I wish they did a Walkingcraft, I'd hand it out to the fools who wander aimlessly across the road whilst I head towards them ringing my bell and wearing a fluorescent coat... [added 28th Jan 2008]
Ken [e] [w] said Haha smellygit! Like the hairy-arsed builder who stepped out in front of me this morning, and by default also stepped out in front of the Ford Focus giving me a wide berth (but at a pace that was bewilderingly slow). He received a warning (and justified) toot from the driver, a clamorous bell-ring from me, but it was the loud 'stupid nonce' I shouted that finally got his attention.

Oh, that and being clipped by the mirror on the Focus. Twot. [added 28th Jan 2008]

My favourites are the people who do stop at the kerb, look you right in the eye, and then walk out anyway. Where are their brains? [added 29th Jan 2008]

#[linkfarm] New Bond film title is confirmed - The next James Bond film is to be called Quantum of Solace, producers have confirmed.
If Licence Revoked was changed to Licence To Kill because not enough people understood what revoked meant, how the hell is Quantum of Solace supposed to go over with the general public?

   * Ken [e] [w] said Aeon Flux anyone?

License Revoked was a good title. Quantum of Solace just isn't (not for a Bond movie), regardless of the reader's vocab. Perhaps they'll change it to Tea and Sympathy ;) [added 25th Jan 2008]


Thursday 24 January, 2008
#

For my Fantasy GDFAF I rated the chances of seeing Ministry as nil, because they didn't tour any more. But I was wrong. Trip to Wolverhampton, anyone?


#[linkfarm] Cycle Stands In Birmingham
With the exception of New Street Station, can't recall using any of them.

   * smellygit said I remember using the ones by the Aston Students guild, many, many years ago [added 24th Jan 2008]

   * Ken [e] [w] said So have you Jez, the Aston Guild ones I mean - oh, and some just up from the Academy. Remember?

[added 24th Jan 2008]

Blimey, I have, yes. I have a mental image of a cycle stand as one of those stupid things that you're supposed to stick your front wheel into. Because, you know, wheels look so much better when they've got a nice bend in them. [added 24th Jan 2008]


#[linkfarm] Once In A Lifetime - Man, that was weird
Kermit the Frog sings. Talking Heads video from 1980 which I was equally fascinated and repelled by. Here they are again in an extract from Stop Making Sense.

   * anonymous said Your login link on the leave a comment page is broken :( [added 25th Jan 2008]

Fixed now. [added 25th Jan 2008]


# I've had a bit of a poke and a prod around in the JezUK webservery-underpinnings. If you see anything awry (assuming, of course, you can read this), it'd be super if you could let me know.
Wednesday 23 January, 2008
#[linkfarm] Man saved from croc shot in error - Crocodile attacks and shootings are rare in Australia. To suffer both at once is - to say the least - unfortunate.
Sunday 20 January, 2008
#[linkfarm] Changing Software Development: Learning to Become Agile - "Changing Software Development" explains why software development is an exercise in change management and organizational intelligence. An underlying belief is that change is learning and learning creates knowledge. By blending the theory of knowledge management, developers and managers will gain the tools to enhance learning and change to accommodate new innovative approaches such as agile and lean computing. "Changing Software Development" is peppered with practical advice and case studies to explain how and why knowledge, learning and change are important in the development process. Today, managers are pre-occupied with knowledge management, organization learning and change management; while software developers are often ignorant of the bigger issues embedded in their work. This innovative book bridges this divide by linking the software world of technology and processes to the business world of knowledge, learning and change.
My chum Allan's book is in the shops. I haven't seen a copy yet myself, but he's a smart guy and I have no doubt that his book says sensible things.

Saturday 19 January, 2008
#

Updating my low-grade train spotter list, I discovered the 390-009 Virgin Queen that I collected in October has been renamed Treaty of Union. Realised immediately I would have to ride the train again.


#[linkfarm] Ask Golden Age Wonder Woman
Thursday 17 January, 2008
#[linkfarm] Stickleback
More images and reference from D'Israeli's current 2000AD series. Look at the large versions of the panels, then start petitioning Tharg to start printing on A3.

   * Ken [e] [w] said I've just read the first round of Stickleback (thanks!) and was mightily impressed. Mr D keeps pushing the technique envelope, and the only criticism I have is that the rendering was a little too dense, and 'noisy' - a criticism that would be answered in the fairytale A3 2000ad universe ;)

BTW I have a feeling that most thrills so far (progs from Jan to April) have been exceptionally strong - a feeling I didn't get at all with 2006's output. I pray it continues... [added 17th Jan 2008]

I think it does. Upgrade your thrill buffers! [added 17th Jan 2008]


Wednesday 16 January, 2008
#[linkfarm] Campfire At Will - hack the urban context
Not immediately obvious where it's going, but I chuckled high-concept chuckles by the end. Lovely.

#[linkfarm] WHEELED SHIELD (Sep, 1956)
I am Iron Man!

#[linkfarm] Clash of the compacts: Eee vs Air - There's no doubt, on the basis of the specs, the Air has the Eee licked on performance, usability, wireless capability and screen space. However, the Eee takes the lead on expandability - the range and type of connectors it offers, basically - and sheer portability.
And, by a metric significant distance, price! 200 quid vs 1200?

#[linkfarm] RFC678 - Standard File Formats - The logical page is the area that can contain text, the height of this area is measured in lines and the width is measured in characters. A typical logical page is 60 lines high and 72 characters wide.
Of historical interest, but amusing.

   * Ken [e] [w] said Twee in the same way, I guess, that PDF files will be viewed 30 years from now - with their limited scope for embedding / linking 'rich' media.

What then? A virtual 'mind' format? Close your eyes and *feel* what the author felt? Whoa there, I've gone too far. [added 16th Jan 2008]

I found it amusing because some of what it contains - tabs being 8 characters wide, 72 characters per line - are pretty deeply embedded in programming culture, often provoking a degree of discussion. Google tabs vs spaces for an example. [added 16th Jan 2008]


Saturday 12 January, 2008
#[linkfarm] Bunwell's Village Store
My childhood house has a website.

Friday 11 January, 2008
#

SMTP error from remote mail server after MAIL FROM:jez@jezuk.co.uk SIZE=5446:
host mx1.uk.tiscali.com [212.74.100.147]: 550 mail not accepted from blacklisted IP address [195.188.213.7]

Oh dear. 195.188.213.7 isn't me. I wonder who it belongs to?

jez@riven ~
$ nslookup 195.188.213.7
Server: ns1-gat.blueyonder.net
Address: 62.31.144.39:53

Name: smtp-out4.blueyonder.co.uk
Address: 195.188.213.7

Hmmm.

Spamhaus takes a rather hardline view to blacklisting, and why not? However, it does seem a little silly for one of the major UK ISPs to be dropping emails sent from one of the other major UK ISPs.

Dave [e] said Hi. The ISP belongs to me - and anyone else that seems to want to hijack it or use it. Don't know how or why, but now I'm blacklisted! What's that all about? I can't even write to my parents any more and they are in their 80's. I retire in 5 months, don't do spam, don't visit sites that are improper and don't really understand what is happening. If you can help to get me unblocked, I would appreciate it.

I have spent a lot of time searching online for my ISP address and this is the first time I have seen a negative thing about it.

The latest check on the below link says I'm no risk, I know that I am no risk so what the heck is happening?

http://www.trustedsource.org/query/195.188.213.7

[added 27th Jun 2008]

ISP means internet service provider, and in this case it's Blueyonder (now Virgin Media). The address 195.188.213.7 isn't assigned to you, Dave, but to one of Blueyonder's outbound email relays. When you send your email, it disappears up the wire, bounces around inside Blueyonder for a bit, before emerging for its onward journey through one of the outbound mail relays (there are several).

The reason Tiscali are kicking back your mail is because some other Blueyonder customer (in fact, probably many of them) is sending spam (almost certainly unknowingly). One of the spam monitoring services has noticed and, as a result, blacklisted Blueyonder. Tiscali (because they are stupid) appear to be blindly following the blacklist, and are kicking back your email.

If you wait a while (hour, couple of hours, day), somebody at Blueyonder will have spoken to the people who run the blacklist (again), and got themselves unlisted. Hopefully they'll also talk to Tiscali (again), but I wouldn't bet on it not happening again.

[added 27th Jun 2008]
Dave [e] said Many thanks for that.

Tiscali now appear to have sorted them out, now it's another one rejecting our emails.

I don't send spam but seem to get everyone else's, even though I paid £90 for Norton 360 to stop it all.

I think that everyone who sells viagra or thinks that I want a loan has my email address!

The message we get is:

SMTP error from remote mail server after initial connection:

host manchester.sin1.netline.net.uk [213.40.66.235]:

550-Host 195.188.213.7 is listed at www.uceprotect.net as a spam source.

I even get messages returned when I reply to my friends.

Who knows, I'll be rejecting myself soon! (or is that Injecting myself)? [added 5th Jul 2008]


#accu2008 Conference

The schedule for accu2008 has been published and registration is now open. As ever, it's a pretty fantastic line-up including process-giant Tom Gilb, Erlang inventor Joe Armstrong, and Haskell big-brain Simon Peyton Jones.

If you have even the remotest care about your software development, you should consider going. It's not just C++ or Java or what some supplier thinks you should be interested in this week, it's a solid, deep, programme on software development, process, project management, and so on. Ask your boss if he'll pay. He really should. If you book before the end of January it's £450 for ACCU members, or £550 for non-members. (Hint: It's ok to join then book.) For a four day conference, that's an utter bargain. (Compare with, say, DevWeek which charges over £1000 for three days and has a timetable dominated by Microsoft staff talking about as-yet-unreleased Microsoft tools.)

But you would say that, Jez, you're the ACCU Chair ...

Well, yes, I am the Chair. But I joined in running ACCU because things like the conference were so good; I don't say they're good because I'm Chair.


Wednesday 09 January, 2008
#[linkfarm] The world's first trainspotter - As the National Railway Museum in York opens a new £4m visitor centre, it claims a 14-year-old in 1825 as the world's first trainspotter.
Tuesday 08 January, 2008
#[linkfarm] Arabica - a C++ library for handling XML - Jez, the library's creator, presents a set of tools for working with the XML format from C++. While working on Arabica the author adopted two overriding priorities: correctness and ease of use. Arabica is written in C++, and as a result it is available on all popular platforms.
#[linkfarm] 5 dangerous things you should let your kids do
Monday 07 January, 2008
#[linkfarm] Clarkson stung after bank prank - TV presenter Jeremy Clarkson has lost money after publishing his bank details in his newspaper column.
Ha, bloody, ha

   * smellygit said I thought Clarkson would be the sort of person you'd admire? [added 7th Jan 2008]

What on Earth gave you that idea? [added 7th Jan 2008]


Sunday 06 January, 2008
#[linkfarm] What Would Richard Feynman Do?
Friday 04 January, 2008
#[linkfarm] Not Exactly Shakespeare
How to write an article while avoiding elementary mistakes.

#[linkfarm] Evan Dorkin's Art for Sale
Milk! Cheese! Booze up and Riot!

#[linkfarm] Hoka-Hoka Happy New Year - 2008 is the year of the Beanworld!
Yay!

#[linkfarm] Kristin Hersh at Yep Roc Records
With a full stream of her latest album.

   * Ewan said Sounds great, I haven't always kept up with her stuff but she sounds on top form.

I can't imagine you haven't been to http://www.throwingmusic.com/freemusic/ - interesting that she was using the "tip jar" model quite some time before those Oxford upstarts... [added 5th Jan 2008]


#[linkfarm] Torque - This song was written under water.
New Kristin Hersh song. Beautiful.

   * Rol [e] [w] said And free too! [added 4th Jan 2008]

#[linkfarm] Eee PC Tips - A crash course in Linux
£186 ex-VAT! Mine (perhaps I should say ours) is due in a month.

#[linkfarm] Who's archiving IT's history? - The relaunch of the IT History Society (formerly the Charles Babbage Foundation) exposes a problem which the Web has brought to journalism and historians - stuff is not being preserved. There are people, however, who are trying to build proper records of the past.
#[linkfarm] Batter Blaster - With its unique pressurized, patent pending process, Batter Blaster makes organic light and fluffy pancakes and light and crisp waffles in minutes!
Gah! FFS! American pancakes only take minutes anyway. Seconds even. This is wild, rank, crazy stupidity.

#[linkfarm] This is what half price is
For Boots customers who struggle with basic maths.

Thursday 03 January, 2008
#C++: copy_while and copy_until

template<typename InIter, typename OutIter, typename Pred>
OutIter copy_while(InIter first, InIter last, OutIter dest, Pred pred)
{
  for(; first != last && pred(*first); ++first, ++dest)
    *dest = *first;
  return dest;
} // copy_while

template<typename InIter, typename OutIter, typename Pred>
OutIter copy_until(InIter first, InIter last, OutIter dest, Pred pred)
{
  return copy_while(first, last, dest, std::not1(pred));
} // copy_until

For a warm smug feeling, briefly outline possible problems with the above code.

allan@allankelly.net said Interesting, on bloglines I don't see any of the a-z characters, all I get is:

< , , > ( , , , ) { etc.

So I see quite a few problems... [added 4th Jan 2008]

I don't think it liked my markup. I've changed it, but it hasn't noticed yet. Let's see what happens :) [added 4th Jan 2008]

The slightly tedious answer, by the way, is that the use of std::not1 in copy_until requires pred be an adaptable predicate (by, for example, having Pred derive from std::unary_function). This means passing a simple free function as your predicate, which people expect to be able to do, won't compile unless the caller wraps it first (with std::ptr_fun, say). I guess you could dink around creating traits classes and partial specialisations and so on, but it's easier just to rewrite as:

template<typename InIter, typename OutIter, typename Pred>
OutIter copy_until(InIter first, InIter last, OutIter dest, Pred pred)
{
  for(; first != last && !pred(*first); ++first, ++dest)
    *dest = *first;
  return dest;
} // copy_until
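
To see the difference in use, here is a small sketch that hands a plain free function to the loop-based copy_until. The helper names is_space and first_word are invented for this example, and the template is repeated only so the snippet stands alone.

```cpp
#include <cctype>
#include <iterator>
#include <string>

// The loop-based version from above, repeated so the snippet is self-contained.
template<typename InIter, typename OutIter, typename Pred>
OutIter copy_until(InIter first, InIter last, OutIter dest, Pred pred)
{
  for(; first != last && !pred(*first); ++first, ++dest)
    *dest = *first;
  return dest;
} // copy_until

// A plain free function: it does not derive from std::unary_function,
// so it is not an adaptable predicate. (Hypothetical example helper.)
bool is_space(char c)
{
  return std::isspace(static_cast<unsigned char>(c)) != 0;
} // is_space

// Hypothetical example helper: copies characters up to the first whitespace.
std::string first_word(const std::string& s)
{
  std::string word;
  copy_until(s.begin(), s.end(), std::back_inserter(word), is_space);
  return word;
} // first_word
```

The std::not1 version would reject is_space unless the caller wrapped it in std::ptr_fun first; the explicit loop takes it as-is, which is what most callers would expect.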

And no, Allan, this isn't an argument to program in some other language.

[added 4th Jan 2008]

Tuesday 01 January, 2008
#Happy New Year

Back at my desk for the first time in ages. The attic's going to be replastered in the next few days. This is both good, because it should mean I won't finish the day covered in a fine layer of dust, and bad, because I had to move everything into the bedroom next door. The JezUK computing infrastructure isn't especially large these days - couple of boxes, couple of flatscreens, wifi-router-hub, and a linkstation. The total distance moved was about 10 feet. Somehow I managed to kill my keyboard, and for several long panicky minutes it looked like my main box wasn't going to start. First off it fired up fine, but when I realised the keyboard was dead and I plugged in another, it abruptly powered down. And didn't come on. And didn't come on. And didn't come on. And didn't come on. And didn't come on. And didn't come on. And didn't come on. Suddenly, as bitter tears of impotent rage and panic sprang to my eyes, it powered up. Hopefully it won't be quite so horrible when I get to move back.

Ken [e] [w] said My boot drive failed just before Christmas, fortunately my ongoing work and stuff is on another drive (and that's backed up onto yet another drive, and a laptop). After a frantic morning of frankly desperate efforts I recalled I had a backup of this drive also, somewhere. It was about 4 months old, so only shy of a few updates. Whew. Note to self: back this one up more often!

What happened as I powered up the dead drive (to see if I could reformat and eke out some more use)? It fired up with no problems, and has run well since. WTF?

Computers. You can keep 'em.

BTW Happy New Year! [added 2nd Jan 2008]

I have nearly all my work sitting in version control, so I wasn't especially worried on that score. It was more the time - fixing the machine, rebuilding the install and all that stuff - that I was worrying about. I have a job, a deadline, and not a lot of wiggle. Let's not dwell on the might have beens :)

[added 2nd Jan 2008]
