Using Data Science to Uncover Native Kinship

April 21, 2021

Presidential fellows Margot Bjoring and Christopher Whitehead are using data science and history to uncover native kinship of indigenous peoples from the Northeast 400 years ago. 

Bjoring and Whitehead’s project is one of six projects funded by the Presidential Fellows in Data Science Program through the School of Data Science and the Office of Graduate and Postdoctoral Affairs. The fellowship provides training and funding for graduate students across a variety of fields of study to collaborate on projects that address real-world problems using traditional research methods alongside cutting-edge data science tools and techniques.

From Songbirds to Data Science

Bjoring and Whitehead are Ph.D. students in psychology and history, respectively, and their path to data science and to working together on this project began in an unexpected place: songbirds.

Bjoring, a sixth-year Ph.D. candidate in psychology, works in the lab of Daniel Meliza, a psychology professor and researcher at UVA. Meliza’s lab focuses on auditory learning and recognition in songbirds.

With an undergraduate major in cognitive linguistics, Bjoring was intrigued by studying this niche area of auditory and vocalization systems. 

“It seemed like a really interesting connection with linguistics to look at the vocalizations that songbirds make because they actually have to learn to sing from their parents,” Bjoring explained. “They're not just born knowing this stuff. They're actually tutored in learning their song, so it's an acquired vocalization. It's really complex, and it's a really fun project to study.” 

Through her work in this lab, Bjoring started to analyze data about songbird vocalizations and learned to code in Python and R.  

“I started to do a lot of computational work both in terms of modeling of behavioral responses and analyzing neural data. So that's kind of how I got into data science,” Bjoring said. “It’s very satisfying to me when your program runs and it doesn't give you any errors. It's just amazing.”

She soon realized she enjoyed the data science side of the project more than the experimental work. 

This is where Whitehead and Bjoring’s work began to overlap. 

Parish Registers Reveal Deep History of the Northeast

Whitehead is a Ph.D. candidate in the history department. He is currently working on his dissertation on the history of native peoples in the Northeast during the early colonial era. Specifically, Whitehead studies the area that is now northern New England and southern Quebec.

“My project takes a look at a growing field within the history of Native peoples that investigates native kinship,” Whitehead said.

Whitehead noted that what he can study depends largely on what European colonists happened to write down 400 years ago.

“We happen to have a really rich case of documents, because of the Catholic Church, where some of the early colonial institutions wrote down everything in terms of parish registers,” Whitehead said. “So, we have things like who was baptized, who is married, who died, who were the parents that were present at each of those things.”

Whitehead explained that when he started this project, he found that he could not go through parish records in full, as they have thousands of entries.

Through Presidential Fellows and the UVA Library Data Services Group, Whitehead was connected to Bjoring. 

“When Chris emailed me about this project, I was like ‘wow that sounds like a really fascinating project and a type of data I haven’t had the chance to work with before,’” Bjoring said. 

Records and Data Science Bring Together a Fuller Picture of the Native Population

Until the two were connected, Whitehead had been studying the parish records by hand, which was taking an extremely long time. 

“It's hard to make sense of the records without the help of someone like Margot, who can help to go through these records in a systematic way,” Whitehead said. “She can suggest who is probably the same person showing up at these parishes multiple times rather than once, and then mapping the kinship connections between people identified within those records. This partnership with Margot really opens up the possibilities of who we can study and what we can learn about native peoples during the earliest years of colonization.”
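The idea of suggesting which entries are probably the same person can be illustrated with a small sketch. This is not the fellows' actual method, which the article does not describe. It assumes a simple approach: normalize spellings, score name similarity with Python's standard `difflib`, and flag pairs above a threshold for a historian to review. The records, field names, threshold, and year gap are all invented for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical register entries; names, years, and fields are invented.
records = [
    {"id": 1, "name": "Marie Anne Example", "year": 1690},
    {"id": 2, "name": "Marie-Anne Exemple", "year": 1692},
    {"id": 3, "name": "Pierre Sample", "year": 1691},
]

def normalize(name):
    """Lowercase a name and treat hyphens as spaces so variants compare fairly."""
    return " ".join(name.lower().replace("-", " ").split())

def similarity(a, b):
    """Return a 0-to-1 similarity score between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def candidate_matches(records, threshold=0.85, max_year_gap=40):
    """List pairs of entries that may refer to the same person.

    The threshold and year gap are assumed values, not taken from the project.
    """
    pairs = []
    for r1, r2 in combinations(records, 2):
        score = similarity(r1["name"], r2["name"])
        if score >= threshold and abs(r1["year"] - r2["year"]) <= max_year_gap:
            pairs.append((r1["id"], r2["id"], round(score, 2)))
    return pairs

matches = candidate_matches(records)
```

In a sketch like this, the output is only a list of suggestions. Deciding whether two entries really describe one person would still depend on historical judgment about names, places, and family context.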

Through Whitehead’s historical understanding of the area and time period and Bjoring’s data science skills, they are rebuilding a population that could have been forgotten in history. 

“Data science allows us to do what historians dream of doing, which is identifying and rebuilding, as much as we can through these records,” Whitehead said. 

He went on to explain how unique it is to have access to these parish records, which provide much more information than many other areas that were colonized. 

“It really removes us from this limitation of imperial or colonial officials recording just a handful of native names,” Whitehead explained. “Through traditional documentary sources, we get disproportionately a view of native peoples being young and male and warlike because that's who colonial officials were mostly interested in, but right now we have all these names of native women, native children, elders, people that otherwise would have escaped the notice of colonial officials.”

For Bjoring, the experience of working in historical documents has been new and exciting. 

“It's been really fun because I get to learn about this whole area. I'm definitely not a historian, but it's fascinating to see the kind of work they do and also to get to work with some really different data. Most of my data is very numerical, so to work with more text-based data, relationships, that sort of thing has been really interesting.”

Asking More Questions

Bjoring explained that up until now, she has been cleaning the parish record data. The two Presidential Fellows have been working to identify names that appear multiple times and piece together connections.

The next step, however, is asking more questions.

“Where we can actually go with this data is learning something about these communities and how they adapted to the colonization of their communities,” Bjoring said. 

Whitehead went on to explain more information he hopes to gather through this data. 

This information could give Whitehead and Bjoring an understanding of how power operated within these communities, which would otherwise be known only through European observers and likely scarce records. Whitehead noted, “Because we have records from these places for multiple generations, we can examine some key metrics, like who were the people that had the most number of connections, did subcommunities emerge within populations, and if so, who were the people linking together these communities?”
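The metrics Whitehead lists map onto standard network questions. The following sketch is an assumption about how they could be computed, not a description of the project's analysis. It treats people as nodes and recorded ties as edges, counts each person's connections, and finds "connectors" whose removal would split the network into more groups. The ties and the letter labels are hypothetical, and splitting by connected groups is only a crude stand-in for real subcommunity detection.

```python
from collections import defaultdict

# Hypothetical ties (for example marriage or godparenthood) between
# anonymized people; none of this is data from the parish registers.
ties = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
        ("D", "E"), ("D", "F"), ("E", "F")]

def build_graph(ties):
    """Build an undirected graph as a mapping from person to neighbors."""
    graph = defaultdict(set)
    for a, b in ties:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def components(graph, removed=None):
    """Return the connected groups of people, ignoring anyone in `removed`."""
    removed = removed or set()
    seen, groups = set(), []
    for start in graph:
        if start in seen or start in removed:
            continue
        group, stack = set(), [start]
        while stack:
            person = stack.pop()
            if person in group or person in removed:
                continue
            group.add(person)
            stack.extend(graph[person] - group - removed)
        seen |= group
        groups.append(group)
    return groups

def connectors(graph):
    """People whose removal splits the network into more groups."""
    baseline = len(components(graph))
    return sorted(p for p in graph if len(components(graph, {p})) > baseline)

graph = build_graph(ties)
degree = {person: len(neighbors) for person, neighbors in graph.items()}
top = max(degree.values())
most_connected = sorted(p for p in degree if degree[p] == top)
linking_people = connectors(graph)
```

In this toy network, the people with the most ties are also the ones holding two clusters together, which is the kind of pattern the quoted questions are probing.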

The two Presidential Fellows are looking forward to uncovering more information and seeing further results from their work.

“It'll be interesting to trace communities through four or more generations,” Whitehead said. “For instance, in some of the communities we’ve begun examining, the daughters of chiefs were the most socially connected people, suggesting they wielded considerable influence within their village. This preliminary finding leads to the question: ‘Were chiefs influential because they had daughters who were able to connect subgroups within the community through marriage or through godparenthood?’ So all this information allows us to really start imagining these communities in new ways, and seeing the importance of people who colonizers never really bothered to understand or document in other records.”