Tutorials on Computational Tools and Skills
Non-credit tutorials on various topics are offered each year as a supplement to the core courses and elective courses in Digital Studies (DIGS). They are intended for students in the M.A. program in Digital Studies of Language, Culture, and History. However, students in other programs may be allowed to attend them if space is available.
Spring Tutorials
The following series of tutorials will be offered in the Spring Quarter during the week or weeks specified. Each tutorial will entail one or more in-person sessions as well as an assignment to be completed outside of class.
The Spring tutorials are optional. However, students doing a two-year research-intensive M.A. in Digital Studies are expected to take all the tutorials listed below in order to learn important skills they will need for their thesis projects and future careers.
For more information on these tutorials, including their meeting schedules and locations, please contact the Associate Director of Curriculum and Instruction of the Forum for Digital Culture.
Using GitHub (Week 2)
GitHub is a widely used online platform for the collaborative creation, maintenance, and dissemination of software. In this tutorial, students will learn how to (1) establish a GitHub repository; (2) start and manage permissions on a new branch of a repository; (3) make changes to the code in a branch; (4) record changes as “commits” and “push” them to GitHub; and (5) handle a “pull request” to merge a set of changes from one branch to another. This tutorial entails one 90-minute session in Week 2 of the Spring Quarter.
Web Scraping with Beautiful Soup (Weeks 3 and 4)
In this tutorial, students will learn how to write Python scripts to pull data out of HTML and XML files using the Beautiful Soup library. This is called “web scraping” because it is a way of extracting information from websites. Students will learn how to create and document a workflow for scraping data from websites to create a data frame (a two-dimensional table of data) that contains the extracted information in a form suitable for automated analysis. This tutorial entails two 90-minute sessions, one in Week 3 and one in Week 4 of the Spring Quarter. Prerequisites: DIGS 30001, “Introduction to Computer Programming Using Python” (or equivalent), and DIGS 30002, “Introduction to Statistics Using Python” (or equivalent).
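As a rough illustration of the kind of workflow described above, the following minimal sketch parses a small HTML table with Beautiful Soup and turns it into a list of row records. The HTML snippet, the table id "books", and the function name scrape_table are hypothetical and chosen only for this example; they are not taken from the tutorial materials. The sketch assumes the beautifulsoup4 package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page that would normally be downloaded.
HTML = """
<table id="books">
  <tr><th>Title</th><th>Year</th></tr>
  <tr><td>Middlemarch</td><td>1871</td></tr>
  <tr><td>Beloved</td><td>1987</td></tr>
</table>
"""


def scrape_table(html):
    """Return the rows of the (hypothetical) 'books' table as a list of dicts."""
    # Parse the markup with Python's built-in HTML parser.
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="books")
    rows = table.find_all("tr")

    # The first row holds the column headers.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]

    # Each remaining row becomes one record keyed by column header.
    records = []
    for row in rows[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        records.append(dict(zip(headers, cells)))
    return records


records = scrape_table(HTML)
for record in records:
    print(record)
```

A list of records like this can be passed to pandas.DataFrame(records) to obtain a data frame. Note that every scraped value is a string at this point, so numeric columns such as the year would need to be converted before analysis.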
Unix/Linux and High-Performance Computing on Midway (Week 6)
In this tutorial, students will learn how to use a command-line interface (CLI) to communicate with a computer’s operating system (OS) to manage files and execute code (“run jobs”). Unix is a widely used family of multitasking, multi-user operating systems. Linux is an open-source, Unix-like operating system. Students will learn basic Unix commands and will learn how to run jobs on the University of Chicago’s high-performance computing cluster, which is known as Midway. Tasks that involve computations on large amounts of data (“big data”) often require high-performance computing (HPC) infrastructure such as Midway, which consists of thousands of processors running in parallel. HPC is often needed for natural language processing, computer vision, speech recognition, and other computational tasks that use machine learning methods. This tutorial entails one 90-minute session in Week 6 of the Spring Quarter. Prerequisites: DIGS 30001, “Introduction to Computer Programming Using Python” (or equivalent), and DIGS 30002, “Introduction to Statistics Using Python” (or equivalent).
Basics of R and RStudio (Weeks 7 and 8)
R is a programming language for statistical computing and data visualization. RStudio is an integrated development environment (IDE) for R that allows easy interaction with data sets using R commands. R is less code-intensive than Python and is widely used in industry. Equivalent statistical and visualization tools in the Python programming language (which are taught in the Digital Studies core courses) are more widely used in academia. In this tutorial, students will learn how to install R and RStudio, import a data set, and use basic R commands to (1) produce descriptive statistics that summarize the data; (2) perform a regression analysis; and (3) generate graphical plots of the data. This tutorial entails two 90-minute sessions, one in Week 7 and one in Week 8 of the Spring Quarter. Prerequisites: DIGS 30001, “Introduction to Computer Programming Using Python” (or equivalent), and DIGS 30002, “Introduction to Statistics Using Python” (or equivalent).