It’s been a few months since the folks at Google Labs unveiled their fancy n-grams toy. It’s fun to play with, but, as I’m sure all my hermeneutically suspicious readers know, there are plenty of objections to taking the findings seriously. The team of non-digital-humanist scientists behind it has since published an FAQ. Because the topic’s been handled much more ably by others, I won’t go through the list of problems here. However, I do think that the tool could be useful for my own research. In a previous post, I described my efforts to get a sense of when “Celestial Empire” became associated with China, and when it stopped. And now, I can give you a sexy graph:

As I predicted, there’s nothing in the eighteenth century, and it dwindles in the twentieth, with peaks around the Opium Wars and the Boxer Rebellion. A better use of n-grams, though, is to make comparisons. Here’s “Celestial Empire,” grouped with “Middle Kingdom” and “Chinaman”:

I think that despite the “noise” in the data, this is a fairly effective demonstration that, as a name for China, “Celestial Empire” was more popular than “Middle Kingdom” in the nineteenth century, and vice versa in the twentieth. Why did Victorians like the “Celestial Empire”? I’m hoping to answer that in my dissertation. What the above search suggests to me, though, is that there’s some historical shift going on around 1850, when all of a sudden “Chinaman” becomes far more popular than “Celestial Empire,” even though the two had tracked each other so closely before that.

Given all the shortcomings of the Google Books data and metadata, though, I’m really curious to see how these searches would look in a corpus with more reliable metadata: namely, the ProQuest and Gale databases of British periodicals.
