I chose Great Expectations, and I had absolutely no idea what this novel was about. I only knew that Charles Dickens wrote it, and that there is (possibly) a character named Pip. Having read A Tale of Two Cities and being familiar with Oliver Twist and A Christmas Carol, I predicted that this novel deals with class struggles in England or offers some sort of social critique. The title suggests that a character or group has high hopes for something. Maybe the poor or working class strive to move up in society?

Through Wordle, I found that (apart from stop words) “Mr” and “Joe” appear most often in the text. There are lots of character names in the word cloud, but some of the names are quite odd and don’t offer any details as to who these people are and what they do in the novel. Some don’t even offer any hints as to the gender of the character. Most of the verbs that appear are quite simple: look, saw, thought, come, go, etc. The common nouns are pretty basic as well: face, eyes, house, room, hands, etc. These words don’t offer much help in terms of determining the plot or themes of the text, although they could suggest that the novel is realistic, revolving around everyday events. This would certainly fit with what I know about Dickens’ work.
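As a rough illustration of what a word cloud is counting, here is a minimal Python sketch. It is not Wordle's actual code: the stop-word list is a tiny made-up stand-in, the sample sentence is invented rather than quoted from the novel, and `top_words` is a hypothetical helper name.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "i", "my",
              "was", "that", "it", "he", "his", "me", "at", "on"}

def top_words(text, n=5):
    """Return the n most frequent words that are not stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample text, not a quotation from Great Expectations.
sample = "Joe looked at me, and I looked at Joe. Mr. Pumblechook looked at the house."
print(top_words(sample, 2))  # [('looked', 3), ('joe', 2)]
```

Note that lowercasing everything and dropping punctuation is exactly what makes names like “Joe” and titles like “Mr” show up as bare words with no context attached.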

I also created word clouds for the first and final chapters to see how they compare. In the first chapter, “man” is the largest word, followed by “young.” Many of the words are related to family and death: mother, father, boy, dead, tombstone, church. “Pip” also appears in this word cloud, but I don’t know if he’s the “father,” the “boy,” or someone else. Confining my textual analysis to one chapter offers more insight into the plot of the novel. I would assume that this chapter recounts the life and death of someone in a family. Similarly, the final chapter features an array of different words: old, Estella, Biddy, little, place. There are also words like remembrance, evening, parting, years, and husband. These words suggest that this chapter reminisces about certain characters and events. They also suggest that a chunk of time has passed by the end of the novel.

Using Voyant, I found that “Mr” and “Joe” are the first uncommon words to appear in the “words in the entire corpus” list, and they’re pretty far down. “I” is the third most common word, and from this I would assume that the narrative is in first person. Initially I thought maybe Joe is the narrator, but I doubt a first-person narrator would mention his or her own name more frequently than any other character’s. I’m curious what Joe’s relation to the narrator is. I also found it interesting that “he” appears over 1,000 more times than “she,” suggesting male characters dominate the novel. After applying the preset list of stop words to the word cloud, “said” appeared as the most common word, followed by “Joe” and “Mr.” The novel must contain lots of dialogue. Looking at the heat map, “said” appears at a high frequency throughout.

The heat map for “Joe” was interesting, as the name appears a lot in the first third of the book, but not as much in the final two-thirds, with the exception of the final bit of the book. This makes me wonder if Joe leaves or dies partway through the novel, only to be brought up in conversation at the end. I also looked at “Mr” in context to see if it’s applied to one character in particular or to several. It appears in context most often with three characters: Pumblechook, Wopsle, and Hubble. “Miss” is also a popular word, but it has two uses: as a title for a woman and as a verb. However, looking at the words in context, it’s used very little as a verb. Most of the time, it’s used in reference to Havisham, Estella, Pocket, or another woman. In looking at “Mr” and “Miss” in context, I’m able to determine the gender of many of the characters appearing in the word cloud.
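The two Voyant views used here, a word's distribution across the book and its words in context, can be approximated with a few lines of Python. This is only a sketch under my own assumptions, not how Voyant is implemented: `segment_counts` and `kwic` are hypothetical helper names, and the sample text is invented.

```python
import re

def tokenize(text):
    """Split text into alphabetic tokens, keeping the original case."""
    return re.findall(r"[A-Za-z]+", text)

def segment_counts(tokens, word, segments=10):
    """Count `word` in consecutive slices of the text (at most `segments` slices)."""
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    return [tokens[i:i + size].count(word) for i in range(0, len(tokens), size)]

def kwic(tokens, word, window=2):
    """Keyword in context: each hit with `window` tokens on either side."""
    return [" ".join(tokens[max(0, i - window): i + window + 1])
            for i, t in enumerate(tokens) if t == word]

# Invented sample text, not a quotation from the novel.
tokens = tokenize("Miss Havisham sent for me. I shall miss Joe. Miss Estella laughed.")
print(segment_counts(tokens, "Joe", segments=3))  # [0, 0, 1]
print(kwic(tokens, "Miss", window=1))  # ['Miss Havisham', 'Joe Miss Estella']
```

Because the match is case-sensitive, the lowercase verb “miss” is skipped here, though a verb at the start of a sentence would still be counted as the title, so reading the context lines remains necessary.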

Textexture reveals very dense, complex networks for “Mr” and “Joe”—so complex that I found it hard to decipher clear relationships. This map made me think that the novel revolves around Joe. Could he be the protagonist, even if Voyant suggests that he doesn’t appear much in the middle of the novel?

One of the most influential keywords was sister, a word that I didn’t notice through any of the other sites. There’s a relationship between Joe, sister, afraid, cry, terror, and home. This suggests there’s some kind of horrific event revolving around the sister. So who is “sister”?

Textexture highlights the “most influential contexts” as “Joe,” “Mr,” “sister,” and “made.” Initially I thought the sister is Joe’s, but “Mr” threw me off. Could it be the narrator’s sister, who has some relationship to a Mr. Joe? Do “Mr” and “Joe” even refer to the same person? On the side of the site, there are highlighted passages regarding the selected set of words. In looking at these, I noticed that the program is actually picking “Mr” out of “Mrs.” The narrator’s sister is Mrs. Joe Gargery. I don’t know why “Mr” appears in the map over “Mrs,” but it sure is deceiving!
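One possible explanation, and this is purely a guess since I don't know how Textexture processes text, is that the tool normalizes words by stripping plural endings before building the network. The Python sketch below uses a deliberately crude, hypothetical rule (`naive_stem`) to show how that kind of step would turn “Mrs” into “Mr.”

```python
import re

def naive_stem(word):
    """Hypothetical plural-stripping rule: lowercase the word and drop one
    trailing 's' (but not 'ss') from words longer than two letters."""
    word = word.lower()
    if len(word) > 2 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

# Invented sample sentence built around the name from the novel.
sentence = "Mrs. Joe Gargery was my sister"
stems = [naive_stem(w) for w in re.findall(r"[A-Za-z]+", sentence)]
print(stems)  # ['mr', 'joe', 'gargery', 'wa', 'my', 'sister']
```

The same rule leaves “Miss” alone but mangles “was,” which shows how blunt this sort of normalization can be.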

I used the Ngram viewer to look at “Joe,” “Miss,” and “Mr.” “Joe” is on a somewhat slow but steady incline from 1800 to the present. The name seems very common nowadays, but it wasn’t so prevalent in literature in 1860. The graph for “Mr” is quite jagged, peaking in 1812 and again in 1840. In 1860, “Mr” peaked again, but not as high as in 1812 and 1840. I wonder if the sharp decrease in appearance in the decade preceding Great Expectations indicates something about class structure, and if that factors into the novel at all. The use of “Miss” was most frequent in 1910, and was on a steady incline in 1860.

For the heck of it, I also looked at trends for gender pronouns. There’s a huge gap between the frequencies of male and female pronouns, but that gap gets smaller towards 2000. The use of “he” and “she” in this novel fits the trend in 1860. I also looked at some of Dickens’ uncommon names to see whether they were common before this novel. “Wemmick,” “Havisham,” and “Jaggers” do not appear in the corpus until 1860, so they must be Dickens’ creations.

So to recap: The novel is narrated in first person. The narrator (who I assume is not Joe) has a sister married to Joe Gargery, and something bad may happen to her. The novel contains lots of dialogue and lots of characters, some with unique surnames, and there seem to be more male than female characters.

After reading Chapter 1, I confirmed that Pip is the narrator of the novel. Pip’s sister is Mrs. Joe Gargery! His parents are dead & buried in the churchyard—he never knew them. Chapter 1 involves Pip’s threatening encounter with a man in the churchyard. The man tells Pip to get him a file and wittles, or else. The man also terrifies Pip by talking about a sinister young man, and the chapter ends with Pip running home. Clearly, my predictions based on the Wordle for Chapter 1 were a little off.

I found distant reading interesting and thought-provoking, but not effective. It leaves a lot of blanks for the reader to fill in. I was able to discern basic things about the novel through textual analysis, such as its first-person narration. However, I felt that these methods could be more deceiving than enlightening. For example, “Joe” was a word that popped up as prominent in the word clouds and Textexture. Logically, I assumed Joe is a key character in the novel. Yet from reading the first chapter, I realized that “Joe” is used for two characters: Joe Gargery AND his wife, who is referred to as Mrs. Joe Gargery. So I have to wonder, if the textual analysis were able to distinguish between these two Joes, would either of these characters prove more prominent in the story than any of the other characters that appeared in the word clouds? Textexture was also confusing in regard to Joe, as it showed a strong relationship between “Mr,” “Joe,” and “sister” when the real relationship was between “Mrs,” “Joe,” and “sister.” After reading the first chapter, this relationship is crystal clear. But if I had relied solely on the Textexture map, I would have put these words together as Mr. Joe’s sister.
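For what it's worth, separating the two Joes is something a simple pattern match could attempt. The Python sketch below is only an illustration under my own assumptions: `split_joe_counts` is a hypothetical helper, the sample text is invented, and the approach misses every place a character is referred to by a pronoun or as “my sister.”

```python
import re

def split_joe_counts(text):
    """Count 'Joe' separately depending on whether 'Mrs.' comes right before it."""
    mrs_joe = len(re.findall(r"\bMrs\.?\s+Joe\b", text))
    all_joe = len(re.findall(r"\bJoe\b", text))
    return {"Mrs. Joe": mrs_joe, "Joe": all_joe - mrs_joe}

# Invented sample text, not a quotation from the novel.
sample = "Mrs. Joe was out. Joe and I sat by the fire. Joe said Mrs. Joe was angry."
print(split_joe_counts(sample))  # {'Mrs. Joe': 2, 'Joe': 2}
```

Run over the full text, a split like this would at least show how much of the big “Joe” in the word cloud belongs to each character.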

I found that when segmenting the novel by chapter—creating a Wordle just for Chapter 1, for example—textual analysis could be more (but not entirely) effective. From the word cloud, I was able to get a general sense of what Chapter 1 involved: something to do with family. But when I attempted to string words together, my predictions were off. I never would have predicted that “young man” would be some ominous figure in Chapter 1, or that the narrator never really knew his parents. One word can make a big difference, even if that word only appears once in a passage. Frequency does not equal importance.

Another limit I found to distant reading is that words can have different definitions and connotations, and a reader really needs to have some context to interpret them. The meanings I got through distant reading were not the meanings I got through normal reading. I do think textual analysis is interesting because it does grant the distant reader creative license. You have to piece together a puzzle by building upon patterns you find within the novel and in its historical context. It is kind of fun. Ultimately though, I wasn’t able to get a sense of the plot or themes of the novel. The theme of a novel can be very subtle, so I don’t think it can be highlighted through word clouds or text networks. I was able to identify some characters, but after the whole “Joe” thing, my assumptions could be completely off. This may defeat the purpose, but I think textual analysis would be most useful after reading the book (or at least a summary of it) to pick up on patterns relating to history or the author’s other works.

One thought on "Lab #8: Great Expectations"

  • Helen Rogers

    This is a fascinating exercise and I’m impressed by your dogged commitment to exploring the potential of distant reading as a tool and your astute assessment of its limitations as a method and a means to interpretation. Looking forward to seeing what my students @DigiVics make of distant reading having read Great Expectations in serial form.

    Helen Rogers, Liverpool John Moores University