| JezUK Ltd - The Coffee Grounds - February 2008 |
| << January 2008 | March 2008 >> |
... is plinky-plonking what might be a tune outside. It's not a good looking van and I can't say I blame the punters for not flocking to it. The time, which I'm recording for posterity, is 15:52.
Previous years:
Cavalo nero is beautiful dark loose cabbage with a lovely slightly sweet flavour. It's very easy to grow, and despite what the link above says, ours overwintered and we're cropping it now. Because it is that bit sweet, it works rather well against the sourness of lemons.
To make tea for two, lightish lunch for four, run to your pantry and gather
Pop on your pinny, and get cracking!
serve with boiled rice. [added 13th Oct 2009]
Let me tell you it is not at all cheap eating! At our local country club, the terrine, parfait et foie d’oi frits sur leur jus de costs a fortune.
Prof Scrub
http://www.profscrub.com [added 12th Feb 2008]
Back in December I shipped myself over to Cambridge to do a technical presentation on the wonders of iteration. It went pretty well although I still think I'm not quite getting my real point over, possibly because I haven't fully worked out what my real point is. I know it's in there, I need to surface it better.
The code I present is in Java and Python, primarily for my convenience. I'd written the original talk using Java, and the examples I'd lifted were in Python. Further, most people can read Java and Python even if they don't use either of them, and you can generally fit a reasonably complete example on a single slide without an excess of punctation or other distractions. Consequently they're good from a pedagogic standpoint. The audience that evening was largely composed of guys who worked in C++, and I asserted throughout that the ideas I was presenting could be implemented in C++ without any particular difficulty. I was picked up on that by someone (who's name I unfortunately didn't catch) at the end, who argued that the typing would get in the way. Specifically, he thought that a C++ version of this Java iterator would be awkward.
public class FilterIterator implements Iterator
{
public FilterIterator(Iterator iterator, Predicate predicate)
{
iter_ = iterator;
pred_ = predicate;
findNext();
}
public boolean hasNext()
{
return next_ != null;
}
public Object next()
{
Object current = next_;
findNext();
return current;
}
public void findNext()
{
next_ = null;
while(iter_.hasNext() && next_ == null)
{
Object candidate = iter_.next();
if(pred_.test(candidate))
next_ = candidate;
}
}
...
}
Writing new iterators in Java is very easy, because all you have to do it fulfil the java.util.Iterator interface. With that done, you can throw your new iterator around and it'll drop right in thanks to that common interface. Iterators in C++ can be more awkward, it's true. There is no common base class, and iterators are typically tied to the type of whatever they iterate over. A std::vector<string>'s iterators are of type std::vector<string>::iterator, a std::deque<string>'s iterators are std::deque<string>::iterator, and the two are not interchangable, at runtime at least, despite both having essentially identical characteristics. Generally this isn't a problem but for the situations I describe in the presentation, using iterators to abstract data sources or to build a view of something, it appears to present something of an obstacle.
I continued to assert that the obstacle was, in fact, illusory, and that a little bit of scaffolding would take you a long way. He disagreed. There was a bit of back and forth around the room (including a suggestion that this was the Sapir-Whorf hypothesis in action), but I didn't win him over and short of starting to write code on the whiteboard it didn't look like I was going to. So we all went to the pub.
In the following few days I did write the code, and I think I would have brought him round in the end.
template<typename iterator_type, typename predicate_type>
class filter_iterator
{
public:
typedef typename iterator_type::value_type value_type;
filter_iterator(const iterator_type& begin, const iterator_type& end, const predicate_type& pred):
current_(begin), end_(end), pred_(pred)
{
while((current_ != end_) && (!pred_(*current_)))
++current_;
} // filter_iterator
value_type& operator*() const { return *current_; }
value_type& operator->() const { return *current_; }
filter_iterator& operator++() { advance(); return *this; }
bool operator==(const filter_iterator& rhs) const { return current_ == rhs.current_; }
bool operator!=(const filter_iterator& rhs) const { return !(operator==(rhs)); }
private:
void advance()
{
do
{
++current_;
}
while((current_ != end_) && (!pred_(*current_)));
} // advance
iterator_type current_;
iterator_type end_;
predicate_type pred_;
};
There are some obvious differences from the Java version, primarily due to differences in language idiom. Java iterators know the range they traverse, while C++ uses a pair of iterators to denote a range.
This iterator does indeed filter, and its use is straightforward :
class Even
{
public:
bool operator()(int& i) { return i%2 == 0; }
};
...
std::vector<int> vec;
... populate the vector ...
filter_iterator<std::vector<int>::iterator, Even> fb(vec.begin(), vec.end(), Even());
filter_iterator<std::vector<int>::iterator, Even> fe(vec.end(), vec.end(), Even());
for( ; fb != fe; ++fb)
{
std::cout << *fb << std::endl;
... do something else with *fb ...
} // for ...
Job done, right?
Well, no. It works, but the usage is cumbersome at best. The type of the instantiated filter_iterator is related not only both to the type of iterator it wraps, and also the type of the predicate. Not only is filter_iterator<std::vector<int>::iterator, Even> not substitutable for a std::vector<int>::iterator, it's not even a substitute for filter_iterator<std::vector<int>::iterator, Odd>. It's not exactly adding flexibility, is it?.
But what to do?
Huge chunks of the C++ Standard Library are built around generic programming techniques, most notably the containers, iterators, and algorithms of the STL. The approaches I describe in the talk are, quite plainly, object-oriented techniques. The proliferation of types in the code above arise from the tension between the generic and object-oriented programming.
Type erasure gives us a way to relieve that tension. In our filter_iterator, we don't actually care about the actual type of iterator being wrapped. It isn't exposed through the public interface, and so our client code doesn't care about it either. In the public interface, we only care that it returns an int (or whatever) when dereferenced. Internally, the implementation only requires that wrapped iterator can be advanced by calling operator++ and compared for equality. If only there was a common base class along the lines of
class iterator_base
{
virtual int& get(); // aka operator*()
virtual void advance(); // aka operator++()
virtual bool equal(const iterator_base& rhs); // aka operator==()
}
If you squint a bit, looks almost like the java.util.Iterator, doesn't it? Perhaps Josh Bloch was on to something after all. Anyway ... We can effectively introduce such a base class through the use of an adaptor or wrapper. If we using a template class with a non-template base class, we create a type specific wrapper to swaddle around the iterator, which we can then manipulate through the base class. Because the base class isn't parameterised, it doesn't expose anything about the wrapped iterator. By the classical computer science technique of an extra layer of indirection, we've slipped in a common base class down the side of our existing iterators.
template<typename value_type>
class iterator_holder
{
public:
template<typename iterator_type>
iterator_holder(const iterator_type& iter) :
iter_(new holder<iterator_type>(iter)) { }
~iterator_holder() { delete iter_; }
...
value_type& get() const { return iter_->get(); }
void advance() { return iter_->advance(); }
private:
class holder_base
{
public:
virtual ~holder_base() { }
virtual holder_base* clone() const = 0;
bool compare(holder_base* rhs)
{
return (type() == rhs->type()) && (up_compare(rhs));
} // compare
virtual const std::type_info& type() const = 0;
virtual void advance() = 0;
virtual value_type& get() const = 0;
protected:
virtual bool up_compare(holder_base* rhs) const = 0;
}; // class holder_base
template<typename iterator_type>
class holder : public holder_base
{
public:
holder(const iterator_type& iter) : iter_(iter) { }
virtual holder_base* clone() const { return new holder(iter_); }
virtual const std::type_info& type() const
{
return typeid(iter_);
} // type
virtual void advance() { ++iter_; }
virtual value_type& get() const { return *iter_; }
protected:
virtual bool up_compare(holder_base* rhs) const
{
holder* r = dynamic_cast<holder*>(rhs);
return iter_ == r->iter_;
} // up_compare
private:
iterator_type iter_;
}; // class holder
holder_base* iter_;
}; // class iterator_holder
That's enough code, and I won't dwell on the details. The main thing to note that outmost class iterator_holder is parameterised on the value type, on what the held iterator points to, rather than the type of the iterator itself. The holder and holder_base are the template class with non-template base described above.
We can rewrite filter_iterator to use iterator_holder, and our example becomes
filter_iterator<int, Even> fb(vec.begin(), vec.end(), Even());
filter_iterator<int, Even> fe(vec.end(), vec.end(), Even());
for( ; fb != fe; ++fb)
{
std::cout << *fb << std::endl;
... do something else with *fb ...
} // for ...
But look at this. We can also write this
std::deque<int> deq;
...
filter_iterator<int, Even> fb(deq.begin(), deq.end(), Even());
filter_iterator<int, Even> fe(deq.end(), deq.end(), Even());
for( ; fb != fe; ++fb)
{
std::cout << *fb << std::endl;
... do something else with *fb ...
} // for ...
Exchanging the vector for a deque doesn't require any other change. That's rather a useful result.
We can perform a similar exercise with the predicate type. So long as it has some function that takes a value_type and returns bool we really don't care about the precise wheres and why fors. Writing a similar predicate_holder, the filter_iterator reduces to
template<typename value_type>
class filter_iterator
{
public:
template<typename iterator_type, typename predicate_type>
filter_iterator(const iterator_type& begin,
const iterator_type& end,
const predicate_type& pred):
current_(begin), end_(end), pred_(pred)
{
while((current_ != end_) && (!pred_.test(current_.get())))
current_.advance();
} // filter_iterator
filter_iterator(const filter_iterator& rhs) :
current_(rhs.current_), end_(rhs.end_), pred_(rhs.pred_) { }
filter_iterator& operator=(const filter_iterator& rhs)
{
current_ = rhs.current_;
end_ = rhs.end_;
pred_ = rhs.pred_;
return *this;
} // operator=
bool operator==(const filter_iterator& rhs) const { return current_ == rhs.current_; }
bool operator!=(const filter_iterator& rhs) const { return !(operator==(rhs)); }
value_type& operator*() const { return current_.get(); }
value_type& operator->() const { return current_.get(); }
filter_iterator& operator++() { advance(); return *this; }
filter_iterator operator++(int)
{
filter_iterator c(*this);
advance();
return c;
} // operator++
private:
void advance()
{
do
{
current_.advance();
}
while((current_ != end_) && (!pred_.test(current_.get())));
} // advance
iterator_holder<value_type> current_;
iterator_holder<value_type> end_;
predicate_holder<value_type> pred_;
}; // class filter_iterator
It doesn't look hugely different from the first version, but the type signature is much more straightforward. Not only is it much easier to work with, it's significantly more flexible.
filter_iterator<int> fb(vec.begin(), vec.end(), Even());
filter_iterator<int> fe(vec.end(), vec.end(), Even());
for( ; fb != fe; ++fb)
{
... do something with *fb ...
} // for ...
// and now the odds
fb = filter_iterator<int>(vec.begin(), vec.end(), Odd());
fe = filter_iterator<int>(vec.end(), vec.end(), Odd());
for( ; fb != fe; ++fb)
{
... do something with *fb ...
} // for ...
Victory is mine, I gloated to myself before realising I'd rather exceeded my brief. The iterator_holder class, far from being an implementation detail for the filter_iterator, is a useful piece of kit in its own right, providing a runtime polymorphic type-safe wrapper for arbitrary iterators. It's the thing I should have been aiming for in the first place. Everything else - filtering iterators, transformers, or whatever - could be built on and with it. Ah well - maybe next time.
External Polymorphism - An Object Structural Pattern for Transparently Extending C++ Concrete Data Types. What we've just done has a name.
Googling around I found A Fistful Of Idioms - Giving STL Iterators a Base Class, an article from 2000 in which my chum Steve Love develops an any_iterator wrapper class. I must have read it at the time, but can't explicitly recall doing so. Towards the end of last year, Thomas Becker published On the Tension Between Object-Oriented and Generic Programming in C++ and What Type Erasure Can Do About It in which he developed any_iterator, a type-safe, heterogeneous C++ iterator. The Adobe Source Libraries also have an any_iterator. All of those are rather better developed that what we stumbled over writing filter_iterator. Perhaps I should pre-emptively google next time?
Boost.Iterator includes a filter_iterator. It also includes several other useful iterator adaptors, including a transform_iterator and a zip_iterator. Matthew Wilson describes a bidirectional filter_iterator in an extract from his Extended STL book. Here's another. All are essentially equivalent to the naive version presented above, if rather more complete and, in the Boost case, rather more formally specified.
On a slightly different track, Mr Edd has an opaque_iterator, designed to reduce compile-time dependencies. I do like the look of that - it's jolly clever.
Kevlin Henney's article in the August 2000 C++ Report, Valued Conversion, covers the same techniques describing a class which can hold any value. That code grew up to become Boost.Any.
The Taggle parser in subversion is now parameterised on string_type and string_adaptor, in exactly the same way as the usual Arabica XMLReader class. The two are now equivalent, which means that all the SAX filters, the DOM builder, XPath, and so on can be applied to Taggle.
So we go
Up up - do the jump
Move around and clap your hands together
Down down - turn around
Having fun is what it's all about

Hal-baby and I took in a show at the Alex on Sunday afternoon. He loved it. I loved it. It was top fun!
(Does it feature the same 'North Atlantic' cast?)
((Er, Jez - is that a graphic I see before me?)) [added 5th Feb 2008]
Gah. I've been putting this off, and off, and off, but I suppose I'd best get on and get it out the way, then I can stop fretting, RussL can stop wondering if I've ignored him, and we can all resume our lives with renewed vigour and pep.
So it goes like this, so read that now if you haven't already. M'chum RussL has tagged me, and I wasn't safe, or exes, or whatever today's young people call it these days. Here we go:
God, this is difficult. I'd attempt to make up an initially plausable lie that spiralled into madness, but chum Tom does that kind of thing much better than me. And, indeed, already has. I could go for that old chestnut about Birmingham having more miles of canal than Venice, but then Birmingham is very, very large and Venice is very tiny in comparison. It'd fit comfortably inside the Middleway with plenty of room left over. Actually, it's less of a difference than you think - Birmingham has 58km of canal, while Venice has 42km. Given a population about around a million against Venice's 62,000, that actually means that Birmigham is vastly under-canaled. It has a trifling 5.8cm of canal per head of population against Venice's magisterial 67¾cm. For Birmingham to match that we'd have to build another 620km of canal. I'm guessing now, but that would probably mean digging out all the main roads into the city and flooding them. And that would be silly, wouldn't it?.
As if this wasn't hard enough already. Back in 2001, I got a phone call from a woman at Thames television wondering if I would go on some late night entertainment (i.e. an early let's-laugh-at-the-internet-weirdos) programme. She was quite disappointed when I explained that a) I didn't live in London and b) all that stuff about playing Top Trumps with the cards prostitutes leave in phone boxes wasn't meant to be taken seriously.
That was easier.
And will they still love me in morning? Ok, I'm choosing The Baron even though I suspect he's officially in Smethwick. Nevermind, I haven't seen him for a while and hopefully he'll have forgotton about this by the next time I do. I'm also choosing international man of musicalness Dubber. I don't believe we've met, but I realised recently that I know his wife. No, not know his wife. Know his wife. Sort yourself out. Really.
Can I go now, sir?
As an aside, and observant readers will probably have realised, I really don't like the word blog, but I can't explain why. I'm also uncomfortable with Brum and rarely use it, because I'm not from round these parts and it doesn't seem right for me to.
| << January 2008 | March 2008 >> |