You are looking at archived content from my "Bookworm" blog, an experiment that ran from
2014-2016. Not all content may work. For current posts, see here.
Posts with tag UK
Hansard
Dec 14 2015
A first pass at understanding the potential of the Hansard corpus through a Bookworm browser.
I’ve split the native XML into individual speeches, using its built-in speaker tag to mark where each one begins.
A “speech” can be very short; on average, each one in the Hansard corpus is 225 words.
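To make the splitting concrete, here is a minimal sketch of the idea, not the code actually used to build this browser. The element names (`<debate>`, `<speaker>`, `<p>`), the sample text, and the helper functions `split_speeches` and `mean_words` are all assumptions for illustration; the real Hansard XML schema may well differ. Each speaker tag starts a new speech, and the average length is the mean word count across speeches.

```python
import xml.etree.ElementTree as ET

# Toy input. The element names here are hypothetical, not the real Hansard schema.
SAMPLE_XML = """
<debate>
  <speaker>Mr. Smith</speaker>
  <p>I beg to move that the Bill be now read a second time.</p>
  <p>It is a short Bill.</p>
  <speaker>Mrs. Jones</speaker>
  <p>I oppose the motion.</p>
</debate>
"""


def split_speeches(xml_text):
    """Start a new speech at every <speaker> tag and collect the text that follows."""
    root = ET.fromstring(xml_text)
    speeches = []
    current = None
    for elem in root:
        if elem.tag == "speaker":
            current = {"speaker": (elem.text or "").strip(), "parts": []}
            speeches.append(current)
        elif current is not None:
            # Anything after a speaker tag belongs to that speaker's speech.
            current["parts"].append("".join(elem.itertext()).strip())
    return [
        {"speaker": s["speaker"], "text": " ".join(s["parts"])}
        for s in speeches
    ]


def mean_words(speeches):
    """Average number of whitespace-separated words per speech."""
    if not speeches:
        return 0.0
    return sum(len(s["text"].split()) for s in speeches) / len(speeches)


speeches = split_speeches(SAMPLE_XML)
for s in speeches:
    print(s["speaker"], len(s["text"].split()))
print(mean_words(speeches))
```

Counting words by splitting on whitespace is the simplest possible tokenization; a different tokenizer would shift the average somewhat.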