SDS Works Across Grounds to Advance Neuroscientific Computing
In June 2019, President Ryan stated that, among its goals for the next decade, the University of Virginia should seek to achieve excellence in both the brain and data sciences.
Specifically, the University will “focus on a discrete set of pressing challenges and opportunities that require collaboration across disciplines and schools and where UVA can be an international leader in important fields of research (Democracy, Environmental Resilience and Sustainability, Precision Medicine, the Brain and Neuroscience, and Digital Technology and Society).”
To achieve this goal, the School of Data Science is teaming up with the Brain Institute, Research Computing, and the Department of Psychology for a new, exciting project.
These groups are working together to create the NEO System, a high-performance computing system that will support neuroscientific research and computation across Grounds. The system will keep vast amounts of data in one central place, allowing researchers to collaborate by comparing and analyzing that data.
“We call this the ‘NEO’ system, which fills a unique place in the UVA research computing infrastructure, providing secure large-scale computing focused on the multiscale nature of neuroscience data in health and disease, in human and non-human species,” said Jack Van Horn, who has a joint appointment in the School of Data Science and Department of Psychology.
“This is a major investment to support neuroimaging, electrophysiology, microscopy, genomics, etc., involving the study of the brain and neural systems.”
Van Horn explained that the name “NEO” comes from the movie The Matrix, whose protagonist is named Neo. The name has a second meaning as well: it is short for “neocortex,” the set of layers of the cerebral cortex involved in higher-order functions, including language, sensory perception, spatial reasoning, and cognition.
The NEO system will be able to process hundreds of neuroimaging datasets in the time it would normally take to process one.
“This will greatly accelerate and amplify our ability to process data writ large,” Van Horn stated. “We'll be able to develop new algorithms, and we'll be able to apply them across different diseases, whether it's autism, epilepsy, other clinical brain disorders in humans or in models across animal species like the laboratory mouse. We will be able to do those things, and will be able to very quickly in less than a decade, achieve President Ryan's vision for research excellence in those areas.”
Dr. Kevin Pelphrey, Professor of Neurology, a leading researcher at the UVA Brain Institute, and founding director of the Autism and Neurodevelopmental Disorders Institute, further explained the impact the NEO system could have, specifically on autism research.
“With these sorts of complex disorders, we need data from different levels of the organism, specifically brain function development over time, genetics, and behavioral data,” Pelphrey said. “That's three levels at work and we need to put those levels together to understand how the interaction of brain function and gene function relate and how behavior shapes genes and then in turn how that shapes brain development, and then looking at the bi-directional interactions.”
Pelphrey noted that the brain itself is a big data science puzzle, with so many levels and layers to understand.
“The only way to look at all these interactions effectively is to have all the data in a computational space where we can begin to piece together the data from multiple levels of one individual, and then multiple individuals and how they vary,” Pelphrey said. “We'll look at all this data over time.”
Pelphrey added that this NEO system is only possible due to the immense advances in technology and data science over the past several years.
“This [NEO System] demands very specific computational resources,” Pelphrey noted. “And so we need all of that data together and, inherently, the complexity of development demands that we take that kind of approach, and so something we really couldn't do, you know, even 10 years ago.”
Dr. Jaideep Kapur is a neurologist at UVA Hospital who leads the UVA Brain Institute with Dr. Pelphrey. He is also part of the team developing the NEO system. He discussed the pressing need for secure, large-scale systems to store data about the brain, since that data involves people’s personal health information.
“The brain is the most complex organ known to mankind,” Kapur said. “As we study the brain, we collect immense amounts of data, whether it is imaging or genomics, or understanding the progression of disease. Increasingly, we recognize that data science is at the core of making sure that we can make sense of this data.”
Kapur added that this project is bringing together professors, institutions, researchers, and students from across Grounds and disciplines. He noted that this is one of the core principles behind the UVA Brain Institute.
Collaboration and interdisciplinary work are also cornerstones of the School of Data Science, making the NEO System an ideal project for it to help develop.
“We believe that this is such a complex process that no single person can solve them (brain disorders) and no single discipline can solve them. But we do believe that people working together can make a big difference,” Kapur said. “Data Science is a big instrument and helps people work together.”
Van Horn works specifically with neuroimaging data. He explained that in his research, he and his team keep collecting more and more datasets.
“These datasets get larger and larger,” he stated. “In order to deal with this data - that's really born digitally - we need high performance computational capabilities.”
He noted that UVA currently uses Rivanna, a high-performance computing system geared more toward engineering and the physical sciences. Because brain research involves personal health information and large volumes of data from many different areas across Grounds, Van Horn said, this has posed a challenge.
“Our brain and data science community is spread across Grounds,” Van Horn said. “It covers multiple schools, and it covers multiple different areas of expertise. There's psychology, neurology in the School of Medicine, multiple MRI centers around the grounds, many people working at the cellular and molecular level all the way up to full systems level, and they all have computational needs.”
When Van Horn arrived at UVA in 2019, heard President Ryan’s challenge for brain research, and saw the computational needs for neurodata across Grounds, he was excited to pursue this niche.
“We have taken this challenge to heart and have moved quickly with intentions to grow this system further as one of the largest neuroscience focused compute clusters in the country,” Van Horn said. “It will be a major resource for our UVA brain sciences community and will be a major tool for the development of new data science methods for neuroscience data analytics (e.g. network theory, time series, image processing, neurogenetics, GWAS, machine/deep learning, spectral analysis, data visualization, etc).”
Van Horn then approached Ron Hutchins, Vice President of IT at UVA, who runs Research Computing, and also leads several teaching and learning technologies committees.
Hutchins broke down how the research process worked before, and how he is changing that now.
“I think about how technology fits into our strategy here at UVA. In the past at UVA, a researcher was basically on his or her own to build up the infrastructure, to do their research, and it's a lot harder when you have multiple different researchers, or you have a lot of different datasets that are around,” Hutchins explained.
Given how Hutchins thinks about technology strategy, the NEO System project fits well with his goals for Research Computing as a whole at UVA.
“Our push at Research Computing at UVA is to assemble the equipment, and the people and the software, and all of the ancillary pieces in order to build a system that can support a lot of different researchers, that makes it effective from a standpoint of having a scale of the equipment.”
Hutchins was excited to hear about Van Horn’s vision for the NEO system and the team it was bringing together across Grounds. Hutchins emphasized the importance of scale in this new computing system.
He first broke down all the sources where the data can originate.
“So maybe we're taking brain scans, or we're taking, maybe even something like weather data - maybe weather has an influence on people's health in some ways, or taking their environment or their genotype or something like that, and putting all of this data together in a way that the computers can find the data, pull it in, and do research on it according to the science the researchers are trying to do.”
In order to pull all of this data together, Hutchins explained that scale is key.
“You have to have the scale of the computing to make that work, and the scale of the storage. You have to manage the data appropriately and connect them up together and protect the data.”
With a team of people from the School of Data Science, Information Technology, the Brain Institute, and the Medical School, the NEO System will soon become a reality at UVA.
Both Van Horn and Pelphrey envision their students being a major part of research using the NEO system.
“This would be a very important resource for training our next generation of computational scientists who are dealing with neuroimaging data,” Van Horn said.
The NEO System is in its final stages of deployment and will be ready for use by the UVA brain research community in Fall 2021.