Campaign speeches provide significant insight into how candidates communicate their message and highlight their priorities to various audiences.
This study explores the campaign speeches of Donald Trump, Joseph Biden, Michael Pence, and Kamala Harris during the 2020 US presidential election using Natural Language Processing (NLP) techniques and a novel data pipeline built on unstructured, automatically generated video captions. The intent is to evaluate the style of the candidates' speeches through features such as formality, repetitiveness, topic variance, sentiment, and vocabulary choice and range, in order to establish how the candidates differ in their approaches and what resonates effectively with voters. The NLP methods used include unsupervised similarity and clustering algorithms as well as a structured sentiment analysis. The results reveal large stylistic differences among the candidates overall; more notably, they indicate stark differences between the top and bottom of the Republican ticket when compared with the Democratic ticket. The findings support the idea that the candidate pairs were selected strategically to cover the largest possible bloc of voters as part of the election process.