Exploring the history of literature with a data science lens

Throughout his academic career, Raf engaged both the right and left sides of his brain before ultimately settling into the humanities for both his undergraduate and master’s degrees. Unknown to him at the time, data science would later allow him to blend his interests in a new way.

Raf Alvarado is a program director and general faculty member with the School of Data Science.

After receiving his undergraduate degree, Raf went on to study cultural anthropology at the University of Virginia, earning his PhD in January 2000.

“The best way to characterize my career is that I am a digital humanist,” Raf said. “I got interested in this field because I could clearly see it was a place where I could apply my interests in human culture but also my skill set and quantitative methods that I had developed as an undergraduate – I had spent two years in the Engineering School before switching out and studying philosophy.”

Following his education, Raf moved to New Jersey to work at Princeton University as a Coordinator of Humanities and Social Sciences Computing, combining his humanities background with his interest in computing. During his eight years at Princeton, Raf worked on several projects, including the Charrette Project, which digitized an Old French manuscript of Chrétien de Troyes’ Knight of the Cart, an Arthurian legend that tells the story of Lancelot. He also worked on digitizing the Persian epic the Shahnameh.

“One of the things I did was I tried to see if there was a correlation between meaning and form,” Raf said. He used computational tools to look for a connection between meaning (the names and definitions in the texts) and form (the poetic structure of the work).

Raf moved back to Charlottesville in 2009. A couple of years later, UVA became interested in “big data” and what to do with it. At first, Raf was asked to join a committee planning the future of “big data” at UVA. From there, he got involved with the School of Data Science’s capstone project program, a unique part of the students’ experience in the MSDS course of study.

Raf now considers his work to be data science plus digital humanities, and he sees the field’s potential.

“Data science is really important because…it’s almost a truism now, a commonplace to say that we live in a world of data, but we do.”

Data surrounds so many parts of our world today, and it will only play a bigger role going forward, including in academics. “To participate in this new way of knowing, you need to know data science,” Raf said.

The incorporation of ethical practices into the data science curriculum at UVA is crucial, and Raf reinforces this notion: “We [the School of Data Science] would like to see that our impact on the world is that we have influenced how people do data science and thinking ethically, in terms of its social impact from the get-go.”

The importance of ethics in data science can’t be overstated.

As for entering the field of data science, Raf offered some advice for students looking to further their studies. “Get involved with an area that has data that you love to work with, and play with data. Find data sets available on the web in areas that you’re interested in like something about music reviews or politics, voting behavior, and just start working with data, get involved with understanding data at a very almost material level.”

Raf emphasized that anyone can start now, wherever they are: “Just set up the toolset on your computer, and play.”

Profile by Connor Masterson