I asked Elizabeth whether she’d be willing to share her first lab report as an early exemplar for the genre, which I know is unfamiliar. This report does an excellent job of balancing observation, reflection, and analysis, as well as of linking concepts across historical technologies. As we move forward, you will want to incorporate citations from our readings into these as well.
At the beginning of copying the letter, my pencil sailed across the page and my candle illuminated enough of the paper for me to read “Dear Sir” without difficulty. As the time wore on, my candle burned low and my hand grew sore. I only got to three-quarters of the way through the first side of the letter, but I filled a page and a half of the yellow lined paper. As I was writing the last lines, I was wondering whether my pencil was going to last much longer. It was getting dull and I had to adjust my grip to use a side of the pencil which could still write. Poe did not use a pencil for the original letter, but his grip on the ink pen he was using would have mattered to the effectiveness of his writing. For the past few years, I have used fountain pens for most of my writing, including class notes and letter writing. A fountain pen writes much more smoothly than a disposable pen or any sort of pencil, but it requires writing without too heavy a hand and needs to be filled with ink frequently. Writing with the pencil, especially as the lead wore down, was certainly tiring to my hand.
The candle, too, had a physical effect. As it wore down, I had to adjust it and the letter in order to read each word. Some of the unfamiliar words and those along the far edge of the paper required my holding them very close to the flame to decipher the letters. The wax of the candle was dripping down and even blocking the light of the candle.
Producing so much continuous text at one time is rare for me. Much of my computing work requires modifying software, running it, and noting results, including performance scores and completion times. The exception is writing up a paper, which I do not write in such a continuous order and which takes multiple sittings before I have even a section drafted, since I jump between sections frequently as inspiration strikes. This lab report is an exception, as I have written it continuously so far, except for a few thoughts jotted down for the second and third questions. Yet even though I have written all the sentences in this same order, I can easily delete words (“delete” was initially going to be “erase”), as I am writing it using vim. Poe’s letter shows no errors that he scratched out, though I had to use my eraser a few times.
I do not remember the date of the last time I wrote a letter of length on paper. I do know that it was a Saturday morning when it was still warm. I sat on a blanket in the yard and read a letter I received from a friend. I started writing a letter in reply, but it is still sitting on my desk, half completed. Through writing letters in pen, I have found how differently I think when writing with a pen than while typing or speaking. I have told that friend who will eventually receive my letter-in-progress that writing a letter to her feels more like an evening together in her living room than a Zoom call does. Even with the responses slowed down by months, they can feel more authentic than spending time while observed by a camera.
I have often wondered how people could write without the privilege of the backspace key. A reason that I study computer science rather than any sort of laboratory science is the efficiency of control-Z. Despite my love of undoing my mistakes, I wonder if I would have liked to be a scribe or stenographer if I had lived in another era. In elementary school, I very much took to handwriting, and enjoyed copying down text in my spare time to practice the cursive letters. My handwriting has only gotten worse since then, but over the years I have enjoyed thinking about the process of writing.
When I think about what it would be like to do those roles, I am mostly imagining the tools I actually use every day rather than historically accurate lighting and pens. I had never before tried writing by candlelight. If I can’t use sunlight, I am quick to turn on the overhead light. If the writing had to happen after dark, I imagine that to be the most challenging aspect, though the physical requirements for the hand and back would not have been comfortable either.
I find it very impressive that people produced works of very high quality when it was so difficult to produce text at all. Of course, survival is selective: we have preserved the books that are of high quality, while many texts of lower quality were never replicated or preserved, and this encourages nostalgia for old books. Even so, the physical books and texts of the past are often much more beautiful than modern books. Today, the challenge of producing a book is a matter of creating content worthy of filling a book. If I wanted to create a “book shaped object” today, I could easily find enough text that I had written and saved on my laptop to fill the pages, print them out, and pay a few dollars to have them bound. Even if someone had content in the past, it would have meant tedious hours by candlelight or waiting for daylight in order to copy it into a book, let alone the production of the paper.
FASTA file formats and ASCII character set
The FASTA file type is used for storing genetic sequences. Each sequence in the file will have a title preceded by “>” and then the line with the data, often the sequence of nucleotides, but sometimes a sequence of amino acids. If it is showing the sequence of nucleotides for DNA, it will be made of A, C, T, G, and “-”. The dash is used for alignment purposes. Because of errors in reading from the DNA, the sequences can contain additions, deletions, or substitutions. After aligning the sequences to account for these errors, the dash stands in for the missing character. It is all written with ASCII characters. This means that it is readable for a human in terms of understanding which character lines up to which nucleotide.
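As a minimal sketch of the layout just described, the following Python reads FASTA-style text into (title, sequence) pairs. The function name `parse_fasta` and the sample records `seq1` and `seq2` are made up for illustration, and the sketch assumes that any data lines following a title belong to that title (so a sequence wrapped over several lines is joined back together).

```python
def parse_fasta(text):
    """Return a list of (title, sequence) pairs from FASTA-style text."""
    records = []
    title = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new title line closes the previous record, if there was one.
            if title is not None:
                records.append((title, "".join(chunks)))
            title = line[1:].strip()
            chunks = []
        elif title is not None:
            # Data line: part of the current sequence.
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


# Hypothetical two-sequence alignment; the dash marks a gap.
example = """>seq1
ACGT-ACG
>seq2
ACGTTACG
"""

for name, seq in parse_fasta(example):
    print(name, seq)
```

Because the file is plain ASCII, nothing more than string handling is needed here, which is the convenience the format trades storage space for.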
This is a file type that I frequently work with. Its features are a combination of design elements in order to make it readable for humans and readable for computers. The character set makes it readable for people, but the mere fact that the sequences are typically hundreds or thousands of nucleotides each and the file may contain thousands of sequences means that it is not going to actually be “read” by a person, but instead glanced at to ensure that it is a typical FASTA file, if anything.
Much work around computational biology and similar projects is focused on compressing the data. This is an area where there are incredible amounts of data, and so efficiency in the data storage is critical. Using the ASCII character set means that the FASTA files are easy to manipulate by typical text editing methods, whether that is programming in Python or modifying right in a text editor. It also means that if a person _does_ want to look at it, they can very quickly see what they are looking at and write a program to analyze it in the same way that they would if they could do it by hand. However, the other side of this is wasted memory. An ASCII character is stored in 8 bits. If we only wanted to represent ACTG, that could be done with two bits. Since we need the gap represented as well, it adds a third bit. This still means that we are only using five of the possible eight values that can be represented with 3 bits, but it is possible that more could be useful for marking the start and end of sequences. With the 8 bits that are used, there are a possible 256 values. This means that a total of 251 values are never used, and 5 bits are wasted per character. The storage requirement for these sequences could be less than half of what it is (3 bits per character instead of 8) without the human readability aspect, or the format could use its own reader to be human readable anyway.
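To make the arithmetic concrete, here is a small sketch of storing each symbol in 3 bits instead of a full byte. The particular code assignment (A=0, C=1, G=2, T=3, gap=4) and the function names `pack` and `unpack` are assumptions chosen for illustration, not part of FASTA or of any existing tool.

```python
# Hypothetical 3-bit codes for the five symbols; values 5-7 stay unused.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3, "-": 4}
SYMBOLS = {value: key for key, value in CODES.items()}


def pack(seq):
    """Pack a sequence into bytes at 3 bits per symbol.

    Returns (length, data); the length is kept so that padding bits
    are not mistaken for symbols when unpacking.
    """
    bits = 0
    for ch in seq:
        bits = (bits << 3) | CODES[ch]
    n_bytes = (3 * len(seq) + 7) // 8  # round up to whole bytes
    return len(seq), bits.to_bytes(n_bytes, "big")


def unpack(length, data):
    """Recover the original sequence from pack()'s output."""
    bits = int.from_bytes(data, "big")
    out = []
    for i in range(length):
        shift = 3 * (length - 1 - i)
        out.append(SYMBOLS[(bits >> shift) & 0b111])
    return "".join(out)


seq = "ACGT-ACG"
length, data = pack(seq)
print(len(seq), "bytes as ASCII,", len(data), "bytes packed")
print(unpack(length, data) == seq)
```

The packed bytes are no longer something a person can glance at in a text editor, which is the readability cost described above.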
Much of my work focuses on how the computers can or cannot read the data well. It is more about computer efficiency than human efficiency. The structure of FASTA files has little cultural significance, and using the ASCII character set seems like an obvious choice, like looking at things with a lightbulb, yet it does have an impact on the life of the FASTA file format.