Dean’s Blog: Data Availability – A Call to Action

Phil Bourne, in a suit and striped tie, stands confidently with arms crossed in a column-lined hallway, conveying authority and professionalism.

Recent developments in the U.S. have called into question the availability and sustainability of public data, which are the lifeblood of data science. Public data have been taken offline, ostensibly for reasons described in Executive Orders, such as gender identity, wasteful diversity, equity, and inclusion programs, and putting America first in environmental agreements, among others. As a result, we begin to lose the ability to teach and undertake research so vital to the future of our society.

Concerned stakeholders have taken matters into their own hands. The Data Rescue Project and the End of Term Archive are two examples. Here at the University of Virginia (UVA), the Library, in collaboration with the Center for Open Science, has played an active role in data preservation.

Even where data are still available, the sustainability and growth of those data come into question given a reduced workforce. In my sphere of research, PubMed going dark, the first time I recall this happening since its founding in 1996, raises concern about future accessibility. PubMed currently indexes more than 37 million articles and is maintained by the U.S. National Center for Biotechnology Information (NCBI) at the U.S. National Institutes of Health (NIH). In short, PubMed is vital for biomedical research, and the fact that it would go dark for an extended period has sent a worrying signal. What can we do?

As stakeholders in the scholarly ecosystem, we each have a role to play. Here are a few ways:

  • As deans and university leaders, we need to make clear to governments that to be a public university means public accessibility to all the scholarship we produce, including the data from which that scholarship is derived. In the case of my own university, the University of Virginia, this is particularly poignant, as its founder, Thomas Jefferson, one of this nation’s founding fathers, said, “The most important bill in our whole code is the diffusion of knowledge among the people.”
  • As senior scientists who influence scientific circles through journal editorship, society membership, and service as reviewers for funding agencies, we need to keep impressing upon the organizations to which we give our time and energy the importance of the availability and sustainability of public data.
  • As mentors, we should be teaching our students that the availability of the data upon which they are basing their careers is no longer guaranteed.
  • As students, we should realize the frailty of the system, the gift that quality data bring, and that we have a role to play in their collection, curation, and preservation so that those who follow can benefit.

The sustainability of the data from which we learn and make new discoveries has always been an issue, but never more so than today. I encourage you to get involved.

Author

Stephenson Dean