Data Science Master’s Students Tackle Diverse, Real-World Challenges in Capstone Projects

The spring 2024 residential data science master's class.
Residential data science master's students pose for a group photograph ahead of their capstone project presentations. (Photo by Alyssa Brown)

Human trafficking, illegal fishing, small business contracts: at first glance, these issues seem to have little in common.

Yet, for students in the University of Virginia’s data science residential master’s program, these disparate subjects have one critical common denominator: they could all be better understood through the application of data science methods.  

Fifteen groups of graduate students at the School of Data Science recently presented the findings of their capstone projects before a sizable crowd of faculty, staff, and fellow students at the Graduate hotel in Charlottesville.  

Capstone projects have long been a cornerstone of the data science master’s program. In them, students work in groups with a faculty mentor as well as an outside sponsor to tackle a real-world problem.  

It’s an opportunity for students to collaborate with their classmates and apply what they are learning from faculty toward addressing a real issue that their sponsor is dealing with — experiences that often leave lasting impressions.

“When we talk to alumni years later, they all say that the two things that they remember most about the program are the capstone experience and also a cohort experience,” said Phil Bourne, founding dean of the School of Data Science, in opening remarks.

The projects offered students the chance to learn about new subjects and methods as well as develop skills that could prove vital as they begin their careers.

“While you’ve done many projects, this is the first time that it was a very large project that you had to break it up and project manage it,” Adam Tashman, an associate professor of data science and capstone program director, told the students.

He added that perhaps the most important lesson students will take from the experience is “how to deliver for a customer.”

For more than four hours, with a break for lunch, groups of students laid out their findings in brief presentations followed by questions from the audience.  

In one presentation, which focused on predicting the federal agencies that small businesses could match with to secure contracts, David Diaz, whose father owns a small business in California, described the personal nature of the group’s work.

“I’ve seen this firsthand; I still see it with my dad now. There are a lot of long hours. It isn’t really a 9 to 5 job — it’s a 24/7 job,” Diaz said, describing the continual process that small business owners face in gathering information to try to secure bids while ensuring business operations run efficiently.  

“So, the goal of this project is to hopefully reduce that research time in reaching the federal contracting market and, hopefully, allow businesses to have a finer scope into what they’re looking for,” he added.

David Diaz addresses the audience during his group's capstone project presentation. (Photo by Alyssa Brown)

Another group laid out their work on illegal fishing, discussing the complexity of this global issue and the vast problems it creates. They also highlighted a key point that any data scientist must confront: how to classify the data they had.

“Kind of like an indie album, our data set is unlabeled,” joked Samuel Brown, who explained how he and his group created labels to differentiate between legal and illegal fishing.

The group explained that their project demonstrated how machine learning could help predict illegal fishing, information that could potentially reduce its prevalence.

Like any complex challenge, completing a capstone project can be stressful. And in those difficult times, sources of wisdom are sometimes found in unexpected places.  

Sunidhi Goyal, who works as a tennis instructor for UVA Recreation, recounted how one of her six-year-old students asked her one day what was bothering her.  

Goyal, part of a group that worked with LMI on empowering their enterprise architecture team, said she wasn’t sure how she could explain the complexities of their project to a young child.  

But she tried, describing, in simple terms, how she and her collaborators needed to find a way to allow enterprise architects to sort through a large number of documents and keep just the relevant ones — the “good” documents, she called them.

“She was like, ‘Oh, what if the good could be a magnet, and you could keep it together and let go of the bad,’” Goyal said the student responded.

“This is exactly what we ended up doing,” Goyal said, explaining how her team used a method called principal component analysis to retain only the most relevant documents, an approach that helped lead them to their solution.

Adam Tashman, an associate professor of data science, addresses students at the capstone presentations. (Photo by Alyssa Brown)

As the day wound down, audience members voted on awards, and Tashman praised the students for the effort, passion, and purpose they exhibited in taking on the challenges presented by their sponsors.  

“These are all important things that our sponsors need help with, and you all took ownership of that. I think you took it to heart and really put your heart and soul into it,” he said.  

And while the completion of their capstone projects signaled an end to their time as master’s students at the School of Data Science, it also marked the beginning of a much longer journey.

“Let this be the first of many real-world problems that you face and that you tackle,” Tashman said.  

Awards, as voted on by the audience

Most Innovative Analytical Solution: “Optimizing the ALMA Research Proposal Process with Machine Learning”

Group members: Brendan Puglisi, Arnav Boppudi, Kaleigh O’Hara, Noah McIntire, Ryan Lipps

Most Compelling Data Visualization: “Detecting Illegal Fishing with Automatic Identification Systems and Machine Learning”

Group members: Samuel Brown, Danielle Katz, Dana Korotovskikh, Stephen Kullman

Most Engaging Data Story: “Predicting Winter California Precipitation with Convolutional Neural Networks”

Group members: Anthony Chiado, Kristian Olsson, Luke Rohlwing, Michael Vaden

Most Impactful Ethical Engagement: “Detecting Human Trafficking”

Group members: Jacqui Unciano, Grace Zhang, Tatev Gomtsyan, Serene Lu