Data science is the study of data and the methods used to learn from data. Data scientists explore:
- how data is interpreted as scientific evidence
- how prediction models and forecasting algorithms rely on data
- how different data structures capture and encode information
- how knowledge extracted from data impacts science, society and policy, and
- how ethical standards govern the collection and use of data.
Data scientists develop novel mathematical and computational frameworks to extract knowledge from data and generalize findings. While data science is often envisioned as the application of statistical and computational tools to real-world problems, the discipline itself is much broader. Data scientists:
- contribute to the mathematical theory of statistics and computer science
- develop and study the behavior of computational algorithms
- examine how the collection of data impacts its utility and validity
- examine how data should be stored, shared and communicated
- investigate how machine learning works, and
- research the ethical dilemmas for societies, companies and governments that arise from modern computational advances.
At the School of Data Science, we loosely group these activities into four domains (analytics, systems, value and design), all of which are applied in a fifth domain called practice. Our white paper details the motivation and need for the Domains of Data Science model and traces its origins, which date back decades.
The domains are intended to broadly encompass areas of focus in data science that are related in fundamental ways, as the table below illustrates. Faculty research often spans more than one domain and is interdisciplinary in nature. Much of our research uses statistical, computational and philosophical principles to enhance ongoing collaborations in biomedical science, engineering, education, business, imaging and other fields.
Domain | Focus Areas | Examples |
Analytics | Prediction Modeling and Machine Learning | statistical methods, algorithm development, imaging, mathematical modeling |
 | Data Engineering | data pipelines, machine learning operations, data life cycle |
Systems | Data Structures | data architecture, database theory |
 | Data Systems | high performance computing, distributed systems, cloud architectures, security |
Design | Data Coalescence | communication, visualization, human-computer interaction, computer vision |
Value | Data Policy & Ethics | privacy, ethical algorithmic construction and deployment, representativeness |
 | Social Impact | justice and influence pertaining to the use of data and analytics |
Practice | Utilization | practical applications of data science techniques at scale |
We also use the Domains of Data Science model to guide our pedagogy. Students in our programs take several courses in each of the domains to meet degree requirements. This provides a holistic educational experience that lays a broad foundation in data science methods and prepares our graduates to be critical thinkers.
Exploration. Scholarship. Impact. @ UVA Data Science