The Data Science Institute at the University of Virginia (UVA) is developing Scholia, a tool based on the Wikipedia platform that aims to make academic research more accessible to scientists, organizations, and the general public.

Scholia uses data visualizations to display research profiles. For example, if a user searches Scholia for a researcher, then Scholia presents a list of their publications, along with charts showcasing the topics on which they publish, a network graph of their collaborators, a timeline of their institutional affiliations, and other information based on the metadata of their publications and research media assets.

"Our hope for Scholia is that it makes discovering published research as routine as checking Wikipedia," says Lane Rasberry, Wikimedian-in-Residence at the UVA Data Science Institute, who is coordinating community outreach and project documentation.

In similar ways, Scholia also profiles institutions, topics, events, research projects, clinical trials, and academic journals or individual articles, among others.

The project is an equal partnership among three institutions. Finn Årup Nielsen, at the Technical University of Denmark, founded the project in 2016 as a general-interest experimental service. Egon Willighagen, at Maastricht University, brought specialized support to the project for research in chemistry. Daniel Mietchen, at the University of Virginia Data Science Institute, is integrating the profiles with collaborative workflows for curating the underlying data and is coordinating this phase of project administration.

Scholia takes advantage of developments in the Wikipedia network: innovation in Wikipedia itself and the maturity of Wikidata. Within Wikidata, the most developed subcommunity is WikiCite, an effort to collect "source metadata" as a free and open dataset that gives Wikipedia and all scholarly disciplines more control in analyzing research literature. Scholia is the user-friendly interface that converts the WikiCite data collection within Wikidata into visualizations for Wikipedia readers and anyone else who wants to examine research media.
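To make the idea of turning Wikidata records into a profile more concrete, here is a minimal sketch in Python. It is not Scholia's actual code: the function names are hypothetical, and the query is only one plausible way to ask the public Wikidata Query Service for works by an author. It assumes the Wikidata properties P50 (author) and P577 (publication date), and it uses Q42 (Douglas Adams) purely as a well-known example item.

```python
from collections import Counter

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_publications_query(author_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query listing works whose author (P50) is the given item."""
    # Accept only identifiers of the form Q<digits> so nothing else reaches the query.
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item identifier: {author_qid!r}")
    return f"""
SELECT ?work ?workLabel ?date WHERE {{
  ?work wdt:P50 wd:{author_qid} .
  OPTIONAL {{ ?work wdt:P577 ?date . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


def publications_per_year(bindings):
    """Count works per publication year from SPARQL JSON result bindings.

    Rows without a publication date are skipped.
    """
    years = Counter()
    for row in bindings:
        date = row.get("date", {}).get("value")
        if date:
            years[int(date[:4])] += 1  # dates look like "2016-03-01T00:00:00Z"
    return dict(sorted(years.items()))


if __name__ == "__main__":
    # Q42 is used only as an illustration; any author item would do.
    print(build_publications_query("Q42", limit=5))

    # Hand-written sample shaped like the service's JSON results (not real data).
    sample = [
        {"date": {"value": "2016-03-01T00:00:00Z"}},
        {"date": {"value": "2016-07-01T00:00:00Z"}},
        {"date": {"value": "2019-01-01T00:00:00Z"}},
        {},  # a work with no recorded publication date
    ]
    print(publications_per_year(sample))  # -> {2016: 2, 2019: 1}
```

The sketch keeps the network call out on purpose: sending the query string to the endpoint and feeding the returned rows into the per-year count is the kind of step that would sit behind a publications timeline chart.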

For 2019-20, the development theme for the project is "Robustifying Scholia." With a grant from the Alfred P. Sloan Foundation, the team seeks to stabilize Scholia's infrastructure and advance it from a beta version to a stable 1.0 release. The intended user base for the tool is as diverse as Wikipedia's editor and reader community: multilingual, global, in every academic field, and of every culture. The current version of the tool anticipates common queries from the current generation of data enthusiasts; the next version must present accessible visualizations to people worldwide who are newly interested in answering their questions with machine-generated reports.

In comparison to existing commercial platforms, "Scholia is functionally similar in many ways but differs in that it democratizes access to the underlying data and software," says Mietchen. "Because Scholia uses a public domain data set, it's free to use, re-use and remix."

Daniel Mietchen
Senior Researcher
School of Data Science
Lane Rasberry
School of Data Science