Lab #10 Prompt

Today we will be thinking about computational classification using the files in this repository, especially the two Google Colab notebooks. As we did last week, we’ll be interacting with code, but you don’t need to understand every function to explore meaningfully. We will focus on understanding the transformations the data undergoes during a classification process.
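To make "transformations of data" concrete, here is a minimal sketch of one way a text can move from raw words to a predicted label. It is not the code from the notebooks: the toy sentences, the genre labels, and every function name below are invented for illustration, and the nearest-centroid comparison is only one simple stand-in for the classifiers you will see there.

```python
from collections import Counter
import math
import re

# Made-up training texts with invented genre labels (not taken from the notebooks).
TRAIN = [
    ("the detective examined the body and the clue", "mystery"),
    ("a clue led the detective to the murder weapon", "mystery"),
    ("the ship sailed across the stormy sea", "adventure"),
    ("the captain steered the ship toward the island", "adventure"),
]

def tokenize(text):
    """Step 1: raw text -> list of lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_vocabulary(texts):
    """Step 2: collect every word seen in training into one sorted list."""
    return sorted({word for text in texts for word in tokenize(text)})

def vectorize(text, vocabulary):
    """Step 3: text -> list of counts, one per vocabulary word (word order is lost)."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

def train_centroids(examples, vocabulary):
    """Step 4: average the count vectors of each label into one 'typical' vector."""
    grouped = {}
    for text, label in examples:
        grouped.setdefault(label, []).append(vectorize(text, vocabulary))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

def cosine(a, b):
    """Similarity between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify(text, centroids, vocabulary):
    """Step 5: pick the label whose typical vector is most similar to the text."""
    vector = vectorize(text, vocabulary)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))

VOCABULARY = build_vocabulary(text for text, _ in TRAIN)
CENTROIDS = train_centroids(TRAIN, VOCABULARY)

print(classify("the detective found a clue", CENTROIDS, VOCABULARY))
print(classify("the ship reached the island", CENTROIDS, VOCABULARY))
```

Notice how many human decisions are built into even this small sketch: which texts count as training examples, who assigned their labels, what counts as a word, and whether a very common word like "the" should be allowed to influence the result.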

For your lab reports, please work through at least the first notebook, on classifying using words, and consider the second notebook, on classifying using topic models. When you open each notebook, save a copy in Google Drive so you can make changes and experiment.

From here, the report should continue the line of thinking we started last week, considering what happens when the book becomes one record in a digitized corpus, and when categories like genre become something we can model and analyze computationally. How do human judgement and algorithmic processes intersect in a process like genre classification? What benefits can you see such processes offering to familiar cultural heritage tasks, such as cataloging, preservation, and access? What limitations or dangers do you see in such processes?