Democratizing Data Science

“Data is about being counted,” Renée Cummings told a live audience by Zoom last month. “Data is about every voice being heard.” 

Cummings, who serves as data activist in residence at the School of Data Science, was speaking at an event organized in honor of Martin Luther King Day, titled “Why We Can’t Wait: Justice, Data and Activism.”

In a conversation moderated by Siri Russell — the School of Data Science’s associate dean for diversity, equity and inclusion — Cummings and former Charlottesville mayor Nikuyah Walker spoke and took questions about everything from data literacy to the systemic racism that persists in algorithms being used today.

“Data is really about who we include and who we exclude,” said Cummings, a criminologist and AI ethicist who plans to launch a police accountability tool in March. The tool will measure the algorithmic force of AI-infused policing, strengthening oversight of technology deployed by law enforcement. 

“We want to ensure that we are including and counting everyone,” Cummings continued, “because it is that kind of accuracy that will give us the best policies, from criminal justice to healthcare to education to finance.” 

In other words, “we’ve got to democratize data,” she argued. 

Renée Cummings serves as Data Activist in Residence at the UVA School of Data Science.

An important step in that direction is improving data literacy for citizens. “We've got to let communities understand how data becomes information, which is turned into intelligence, which is turned into policy,” she said.

Especially when it comes to research on hot-button issues like surveillance and policing, many nonprofits have attempted to foster conversations about data collection and analysis with the people they serve. But at the moment, there is no “standard operating procedure” for doing so. “That does not exist,” Cummings said. “What we are seeing is that we need to get more creative.”

Walker, who worked with a range of nonprofits in Charlottesville before her term as mayor, agreed — and argued that citizens need to actively shape the way data is being collected in their communities.

She shared a memory of visiting Harlem Children’s Zone, a nonprofit dedicated to breaking the cycle of intergenerational poverty in New York. Staff there had spoken with her about the restrictions they placed on researchers from nearby Columbia University. It was a move intended to end exploitation by making sure Harlem residents could control how their data was being gathered and used.

“Harlem put its foot down. They said, ‘Absolutely not. You will not tell the story of our children, of our families, without us,’” Walker said. “Because most of the time that story was being told in a negative light. And once you portray people and characterize them only through that lens, what does that do to their spirit?”

In Charlottesville, a group called Residents for Respectful Research has taken a similar stance. The new research review board ensures that public housing residents can understand, influence and benefit from studies performed in their community by University of Virginia faculty and students.

These kinds of shifts in power structure occur when citizens are recognized as experts on what’s happening in their lives and neighborhoods — and when nonprofits and researchers are willing to acknowledge the legacy of a painful history, Walker said.

Nikuyah Walker was the first Black woman to serve as mayor of Charlottesville, Virginia, holding the office from 2018 to 2021.

“[Data] has been used in the past to tell a story, to abuse, to manipulate,” she explained.
The question data scientists and other researchers need to be asking, then, is “how do you build that trust in communities, especially in communities that you say you want to serve?”

Data science is a relatively new field, but much of the data it draws on already has systemic racism “baked in,” Cummings noted. “What we don't want is for new technology to replicate old biases and old stereotypes,” she said. “And that's what we're seeing with emerging technologies such as AI.”

Compounding the problem is “a monoculture of designers” that lacks diversity and can therefore easily introduce further biases into those technologies, she added.

Many algorithms also remain opaque to everyday people, especially in fields like criminal justice and policing. Moving forward, Cummings argued, achieving greater transparency will be key, as will regaining a level of public control.

“The collection of public data is in crisis,” she said. Fewer people are filling out surveys, census data often isn’t used until several years after it’s been collected and a handful of private corporations own the data that guides public policies.

All of this just further emphasizes the need for citizens to take an active role in how our data is being used.

“We need to all be activists when we think about privacy and security, accountability and transparency, and the need to create technology that is inclusive and trustworthy,” Cummings said. “It is so important that we understand the power our data holds.”