MSDS Online Capstones Utilize Data Science for Good
As the University of Virginia School of Data Science celebrates the Class of 2025 of its online M.S. in Data Science (MSDS) program, the virtual capstone presentations offered a compelling look at the practical, impactful, and often visionary work students are producing from around the globe. Capstone projects are the culmination of the MSDS program, including its online version, a part-time, 20-month synchronous format tailored for working professionals. Student teams collaborate with industry sponsors, government agencies, and academic researchers to solve real-world problems using the full range of data science tools and techniques mastered throughout the program.
Jonathan Kropko, Quantitative Foundation Associate Professor of Data Science and M.S. in Data Science Online program director, began the meeting by applauding the students for their commitment and sacrifice in obtaining their degree. Kropko asked the audience to recall their favorite memory over the past week. "Maybe it was time with family or friends, maybe you saw a superhero movie, went to a concert, or out to dinner," he said. "I bet you that that moment probably took place during an evening or a weekend. And guess what our online MSDS students are giving up to get this degree? Their evenings and their weekends."
Kropko noted what the capstone presentations meant for the students presenting. "Let's be clear, these presentations are not just the culmination of a semester-long grind to tackle a complex problem, though that in itself is no small achievement. They're also the culmination of years of sacrifice for many of these students, in this, their fifth semester of enrollment," he said.
The presentations demonstrated the versatility and reach of data science, with nine student teams tackling projects across public health, humanitarian action, environmental science, education, and biomedical science. These efforts reflected not only technical sophistication, but also a deep commitment to ethical responsibility and social impact.
Four projects earned top honors in categories recognizing excellence in analytical innovation, visual communication, storytelling, and ethics in data science:
Most Innovative Analytical Solution: Racial Equity Analysis of Homeless Services
Winner of “Most Innovative Analytical Solution,” the project “Racial Equity Analysis of Homeless Services System” took on the challenge of assessing whether housing programs in the region equitably serve the most vulnerable. Partnering with Pathways MISI and faculty mentor Abbas Kazemipour, the team analyzed extensive client data from Permanent Supportive Housing (PSH) and Rapid Re-Housing (RRH) programs, developing more than 100 engineered features to capture demographic, housing, and vulnerability factors.
Using a mix of random forest algorithms, gradient boosting models, clustering, and survival analysis, they identified a high-risk group comprising nearly one in five clients that was almost four times more likely to require PSH but faced wait times 45 days longer than lower-risk groups. The team's models reached 93% accuracy in predicting housing needs, and they packaged results into a privacy-preserving, local large language model interface for frontline staff.
The team recommended faster intake, regular fairness audits, and bias-aware resource allocation, changes that could enable roughly 150 more successful placements per year while cutting assessment time by 40%. The team's conclusions offer an innovative analytical solution that will help Pathways MISI meet more of the need for housing.
Team: Michael Amadi, Darreion Bailey, Venkatramanan Viswanathan
Sponsor: Pathways MISI
Faculty Mentor: Abbas Kazemipour
Most Compelling Data Visualizations: Heritage Tree Detection in Central Virginia
In a blend of artificial intelligence innovation and environmental stewardship, the “Heritage Tree Detection in Central Virginia” project earned the “Most Compelling Data Visualizations” award. Sponsored by the Charlottesville Area Tree Stewards and mentored by Philip Waggoner, the team set out to identify and map heritage trees — mature specimens with ecological and historical significance — across the local region.
Starting with aerial imagery and geographic information system (GIS) data, the team trained a convolutional neural network to distinguish heritage trees from surrounding canopy. They combined satellite-based vegetation indices with object-based image segmentation, ultimately producing striking, high-resolution maps that revealed not just tree locations but species clusters, canopy coverage, and preservation priorities.
Their dashboard visualizations allowed stakeholders to zoom from a regional overview down to individual properties, making the data both accessible and actionable. The result is a decision-support tool that can guide conservation efforts, prioritize field surveys, and help protect an irreplaceable part of Central Virginia’s natural heritage.
Team: Devin Bridges, Alex DeLuca, Shaival Mandavgade, Victor Ontiveros
Sponsor: Charlottesville Area Tree Stewards
Faculty Mentor: Philip Waggoner
Most Engaging Data Story: Classification and Retrieval for Renewable Energy and Wildlife
The award for “Most Engaging Data Story” went to the team working on “Classification and Retrieval Augmented Generation for Renewable Energy and Wildlife Literature and Reports.” Sponsored by the Renewable Energy Wildlife Institute and mentored by Abbas Kazemipour, the project aimed to streamline scientific literature review by automating the tagging of technical documents.
Faced with a small, highly imbalanced dataset spanning more than 200 possible tags, the team tested support vector machines, fine-tuned transformer models, and semantic embedding approaches. The embedding-based model emerged as the clear winner, delivering a micro-F1 score near 0.90 for species-type tags and generating dozens of high-confidence label suggestions per document. Their interactive dashboard visualized tagging confidence, allowed human review, and supported multiple export formats, making the process faster and more consistent. The final presentation combined technical rigor with intuitive visuals, illustrating how automation and human expertise can work hand in hand in environmental research.
Team: Eric Arnold, Sareena Miley, Christian Ollen, Timothy Rodriguez
Sponsor: Renewable Energy Wildlife Institute
Faculty Mentor: Abbas Kazemipour
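For readers unfamiliar with the micro-F1 metric cited above, the sketch below shows how it is typically computed for multi-label tagging: true positives, false positives, and false negatives are pooled across all documents and tags before the score is calculated. This is an illustration only, not the team's code; the function name and the example tags are hypothetical.

```python
def micro_f1(true_tags, predicted_tags):
    """Micro-averaged F1 for multi-label tagging.

    Each argument is a list of tag sets, one set per document.
    Counts are pooled over every document and tag before scoring.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_tags, predicted_tags):
        tp += len(truth & pred)   # tags correctly suggested
        fp += len(pred - truth)   # tags suggested but not correct
        fn += len(truth - pred)   # correct tags that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical example: three documents with made-up tags.
true_tags = [{"bats", "wind"}, {"birds"}, {"birds", "solar"}]
predicted_tags = [{"bats", "wind"}, {"birds", "bats"}, {"birds"}]
score = micro_f1(true_tags, predicted_tags)
print(f"micro-F1 = {score:.2f}")
```

Because the counts are pooled, frequently used tags weigh more heavily in the score than rare ones, which is worth keeping in mind when a tag set is as imbalanced as the one described here.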
Most Impactful Ethical Engagement: Informing Humanitarian Action
The “Most Impactful Ethical Engagement” award went to “Predict Displacement to Inform Humanitarian Action,” a project addressing the urgent need to anticipate and respond to internal displacement caused by conflict in Northern Nigeria. Partnering with the Internal Displacement Monitoring Centre and faculty mentor Adam Tashman, the team merged conflict event data from the Armed Conflict Location & Event Data (ACLED) project with displacement tracking from the International Organization for Migration, creating lagged and sentiment-based features to capture both the scale and context of violence.
By testing multiple machine-learning models on local-government-area-level data, they developed a framework capable of producing monthly risk estimates with strong predictive accuracy. The approach balances technical performance with ethical considerations, ensuring predictions are interpretable and usable by humanitarian organizations in real-time crisis response. This methodology can be adapted to other regions, offering a scalable tool for forecasting human movement and guiding both emergency relief and long-term resilience planning.
Team: Isha Anand, Lionel Medal
Sponsor: Save the Children
Faculty Mentor: Adam Tashman
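The lagged features mentioned above can be pictured with a short sketch: for each area and month, the values from earlier months are attached as extra columns so a model can see recent history. This is a minimal illustration under stated assumptions, not the team's pipeline. The field names (`area`, `month`, `conflict_events`) and the data are hypothetical, and the code assumes each area has one row per consecutive month with no gaps.

```python
def add_lagged_features(rows, value_key, lags=(1, 2)):
    """Attach lagged copies of value_key to each row, per area.

    rows: list of dicts with 'area', 'month' (sortable, e.g. 'YYYY-MM'),
    and value_key. Assumes consecutive monthly rows within each area.
    """
    by_area = {}
    for row in sorted(rows, key=lambda r: (r["area"], r["month"])):
        by_area.setdefault(row["area"], []).append(row)

    out = []
    for area_rows in by_area.values():
        for i, row in enumerate(area_rows):
            new_row = dict(row)
            for lag in lags:
                # None marks months with no earlier observation available.
                prior = area_rows[i - lag][value_key] if i >= lag else None
                new_row[f"{value_key}_lag{lag}"] = prior
            out.append(new_row)
    return out


# Hypothetical monthly counts for two made-up areas.
rows = [
    {"area": "A", "month": "2024-01", "conflict_events": 3},
    {"area": "A", "month": "2024-02", "conflict_events": 5},
    {"area": "A", "month": "2024-03", "conflict_events": 2},
    {"area": "B", "month": "2024-01", "conflict_events": 0},
    {"area": "B", "month": "2024-02", "conflict_events": 1},
]
features = add_lagged_features(rows, "conflict_events")
for f in features:
    print(f)
```

Grouping by area before lagging keeps one area's history from leaking into another's, which matters when estimates are produced at the local-government-area level.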
Multidisciplinary data insights
Other projects from the MSDS Online Class of 2025 capstone cohort further demonstrated the program's interdisciplinary approach and emphasis on collaboration and data science for good. These other projects included:
- "Virtual Screening for Small Molecules Targeting Disease-Causing Proteins," which created an automated pipeline for screening molecules and an interactive web application for researchers.
- "Analysis of Wikipedia Writing Assignments," which worked with Wiki Education to identify trends in their surveys and improve outcomes.
- "Wildfire Prediction Using Machine Learning Application," which worked to predict wildfire duration using data from the USDA Fire Lab and weather records.
- "Scoping a Global Database on Human Thriving," which laid the groundwork for a database on human thriving by evaluating a demographic and socio-economic dataset from Paraguay.
- "Analysis of Log Data of Users of a Health and Longevity Software Product," which worked with MyYouthSpan to model user adherence to their health and wellness platform.
Together, these projects reflect the School of Data Science's goal of utilizing data science for good. Student teams gained insights on improving quality of life, increasing safety, and maintaining health by applying what they learned in the program, from data cleaning and model building to Bayesian hierarchical frameworks and principal component analysis.
Moving forward
The completion of their capstone projects marks the end of the MSDS program for these online students, who earned their degrees over the course of 20 months while continuing their professional work. The projects serve not only as technical portfolios but also as evidence of the real-world change that data science can bring about. These projects are more than academic exercises: they are blueprints for a better, data-informed future.




