Posts

The Bookworm-Mallet extension

I promised Matt Jockers I’d put together a slightly longer explanation of the weird constraints I’ve imposed on myself for topic models …

Building outlines and slides from Markdown lectures with Pandoc

Just a quick follow-up to my post from last month on using Markdown for writing lectures. The GitHub repository for implementing this …

More thoughts on topic models and tokens

I’ve been thinking a little more about how to work with the topic modeling extension I recently built for Bookworm. (I’m curious if any …

Building topic models into Bookworm searches

I’ve been seeing how deeply we could integrate topic models into the underlying Bookworm architecture a bit lately. My own chief …

Searching for structures in the Simpsons and everywhere else.

This is a post about several different things, but maybe it’s got something for everyone. It starts with 1) some thoughts on why we …

Markdown, Historical Writing, and Killer Apps

Like many technically inclined historians (for instance, Caleb McDaniel, Jason Heppler, and Lincoln Mullen) I find that I’ve …

The Simpsons Bookworm

I thought it would be worth documenting the difficulty (or lack thereof) in building a Bookworm on a small corpus: I’ve been reading too …

Finding the best ordering for states

Here’s a very technical, but kind of fun, problem: what’s the optimal order for a list of geographical elements, like the states of the …

Bleg 1: String Distance

String distance measurements are useful for cleaning up the sort of messy data that comes from multiple sources. There are a bunch of string …