Lab Report #4: XML/TEI Encoding

I was incredibly excited about this lab in particular, because encoding is something I was entirely unfamiliar with. In fact, until I got to college, I hadn’t even been exposed to it. Fortunately, having a lot of computer science major friends has remedied that. That being said, I had no previous knowledge of what coding entailed, and was mostly working off the vaguest idea of what HTML looked like. From what I understand, the difference between the two is that HTML formats and displays data, whereas XML describes and carries it.
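
To make that difference concrete, here is a small sketch of the same closing line marked up both ways. The name and wording are made up for illustration and are not from our letter; the second version uses the TEI elements for letter closings as I understand them.

```xml
<!-- HTML-style markup: says how the words should look on screen -->
<p>Yours truly, <i>Jane Doe</i></p>

<!-- TEI-style markup: says what the words are (a closing, a signature, a person's name) -->
<closer>
  <salute>Yours truly,</salute>
  <signed><persName>Jane Doe</persName></signed>
</closer>
```

Nothing in the second version says anything about italics or fonts; how it is displayed would be decided later by a separate stylesheet.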

We, as a group of four, were set to work on transcribing letter 4 (which, in my opinion, was the hardest one to try to decipher!).

The immediate problem was getting started, because none of the members of my group were particularly well versed in Oxygen, or in TEI/XML for that matter. I would say that we spent about half the class period getting familiar with the markup language itself, though even that was not nearly enough time to get acquainted. It is therefore no surprise that our transcription was not nearly as detailed as I would have liked. The first roadblock (that is, the first one that wasn’t the legibility of the handwriting) was formatting. In the salutation of the letter, there was a crossed-out word that was still legible. This posed a number of questions: was it still a relevant piece of the data? If so, how would we add it to the document while still preserving the essence of it? Each question prompted its own discussion within the group, and I could see that what seemed to be a relatively simple process was much more complicated than I had anticipated. Perhaps it was a mixture of not being familiar with the language and struggling to determine how exactly to represent the data, but we only managed to get so far with the encoding. For one thing, we did not get as far as tagging references, but reading blog posts from other groups opened my eyes to what some would consider relevant enough to tag.
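
For the crossed-out word in the salutation, TEI does offer a way to keep the word in the transcription while recording that the writer struck it out. What follows is only a minimal sketch: the words are placeholders rather than our actual reading of letter 4, and the choice of `rend="strikethrough"` is my assumption about how the deletion would be described.

```xml
<opener>
  <!-- <del> keeps the legible crossed-out word in the data and marks it as deleted -->
  <!-- "Sir" and "Friend" are placeholder words, not the real text of the letter -->
  <salute>My dear <del rend="strikethrough">Sir</del> Friend,</salute>
</opener>
```

Encoded this way, the deleted word stays part of the data, and whether to show it, hide it, or display it struck through becomes a decision for whoever presents the text later.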

Looking purely at the importance of the data, I can immensely appreciate TEI encoding as a form of transcribing text. However, I think it’s much harder to capture the true spirit of the original text. We were working with a mere picture of a letter, which was a shadow of its physical self, and even then some elements (such as the look of the paper, the color of the ink, and the handwriting itself) were lost in our encoding. I feel as though there is only so much that one can capture with this method, but the parts that we can capture make this method revolutionary.

After leaving class, I was entirely smitten with the potential of digital encoding. As if someone had read my mind, many of the resources given to us were from the Women Writers Project, and I spent quite some time exploring their website. I am fascinated by this project and would seriously consider getting involved in it somehow, as well as getting much more comfortable with TEI/XML.



    If you are serious about getting involved in WWP, you should absolutely contact Prof. Julia Flanders. She leads the WWP, as well as several other cool TEI projects hosted at Northeastern. Plus, she’s super nice and fun to work with! Let her know you’d like to get involved and she might have some work (or at least some resources) to get you started.