
Ph.D. in Data Science Dissertation Defense: Jiahao Tian

January 22, 2024, In-person
2:00 PM – 4:30 PM

Ridley Hall, Room 206

Committee:
Jeffrey Blume, Committee Chair (SDS)
Michael Porter, Advisor (SDS/SEAS)
Jon Kropko (SDS)
Bill Basener (SDS)
Negin Alemazkoor (SEAS)

Title:
Handling ambiguity of interval censored events

Abstract:  
Censored data are data in which the exact timing of the events of interest is unknown; we observe only that each event occurred within a specified interval. Such data arise in fields as varied as medicine, finance, criminology, and politics, and drawing inferences from them is challenging because of the added uncertainty. Our work focuses on three central aspects of modeling interval-censored data: change point detection, intensity estimation, and event forecasting. The objective is to address the uncertainty inherent in these data using statistical learning and deep learning techniques, providing practical insights and answers to applied questions. The paragraphs below give a concise overview of each topic.
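
As a toy illustration (my own, not taken from the dissertation), the Python sketch below shows what interval censoring discards: the analyst sees only how many events fell in each reporting window, not when within the window they occurred.

```python
import numpy as np

# Hypothetical exact event times over a 12-week horizon (in days).
rng = np.random.default_rng(0)
event_times = np.sort(rng.uniform(0, 84, size=40))

# Reporting intervals: weekly windows of 7 days.
bin_edges = np.arange(0, 85, 7)
interval_counts, _ = np.histogram(event_times, bins=bin_edges)

# After censoring, only (interval, count) pairs are observed;
# the exact timing within each week is lost.
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], interval_counts):
    print(f"days [{left:2d}, {right:2d}): {count} events")
```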

Change point detection identifies the time points at which the underlying model shifts. Interval censoring, however, renders traditional detection methods unsuitable. We introduce a novel approach that combines the joinpoint model with Bayesian Model Averaging (BMA) to detect a sequence of significant changes, and we apply it to 2020 presidential approval ratings to identify the events that shifted public sentiment.
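
To make the general idea concrete, here is a minimal sketch of joinpoint fitting combined with model averaging; the piecewise-linear fits, BIC-based model weights, and candidate grid are my own illustrative assumptions, not the defended method:

```python
import numpy as np
from itertools import combinations

def design(t, changepoints):
    """Piecewise-linear (joinpoint) design matrix with a hinge term per change point."""
    cols = [np.ones_like(t), t] + [np.maximum(t - cp, 0.0) for cp in changepoints]
    return np.column_stack(cols)

def log_evidence(t, y, changepoints):
    """Approximate log marginal likelihood of one joinpoint model (-BIC / 2)."""
    X = design(t, changepoints)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = len(y), X.shape[1]
    sigma2 = max(resid @ resid / n, 1e-12)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return loglik - 0.5 * k * np.log(n)

def bma_changepoint_probs(t, y, candidates, max_joins=2):
    """Posterior probability that each candidate location is a change point."""
    models, scores = [], []
    for m in range(max_joins + 1):
        for cps in combinations(candidates, m):
            models.append(cps)
            scores.append(log_evidence(t, y, cps))
    w = np.exp(np.array(scores) - max(scores))
    w /= w.sum()  # BMA model weights
    return {cp: sum(wi for wi, cps in zip(w, models) if cp in cps) for cp in candidates}

# Toy series whose trend bends at t = 30.
rng = np.random.default_rng(1)
t = np.arange(60, dtype=float)
y = 0.5 * t + np.where(t > 30, -0.8 * (t - 30), 0.0) + rng.normal(0, 1, 60)
print(bma_changepoint_probs(t, y, candidates=range(10, 55, 5)))
```

Averaging over candidate segmentations, rather than committing to a single best one, is what lets the analysis express uncertainty about where, and how many, change points occurred.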

Intensity estimation is a widely explored research area, but interval censoring introduces a new challenge. Most existing work concentrates on time-to-event analysis within survival analysis, and those approaches are difficult to extend to settings where the notion of a time-to-event does not apply. We focus on penalized temporal intensity estimation, a framework that accounts for the characteristics of interval-censored data and yields accurate, realistic intensity estimates that are useful in practice.
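
The rough sketch below conveys the flavor of penalized intensity estimation under interval censoring; the piecewise-constant daily grid, Poisson interval likelihood, second-difference roughness penalty, and all numbers are illustrative assumptions of mine, not the dissertation's formulation:

```python
import numpy as np
from scipy.optimize import minimize

grid = np.arange(0, 84)                               # daily grid over 12 weeks
intervals = [(i, i + 7) for i in range(0, 84, 7)]     # weekly reporting windows
counts = np.array([3, 5, 4, 8, 12, 11, 9, 6, 5, 7, 10, 13])  # toy observed counts
lam = 5.0                                             # smoothness weight (tuning parameter)

# Aggregation matrix: A[j, k] = 1 if grid cell k lies inside interval j.
A = np.array([[left <= k < right for k in grid] for left, right in intervals], float)

def objective(log_rate):
    rate = np.exp(log_rate)                           # daily intensity (positive by construction)
    mu = A @ rate                                     # expected count per observed interval
    nll = np.sum(mu - counts * np.log(mu))            # Poisson negative log-likelihood (up to a constant)
    rough = np.sum(np.diff(log_rate, n=2) ** 2)       # second-difference roughness penalty
    return nll + lam * rough

res = minimize(objective, x0=np.zeros(len(grid)), method="L-BFGS-B")
daily_intensity = np.exp(res.x)
print(daily_intensity.round(2))
```

The penalty weight trades off fidelity to the observed interval counts against smoothness of the recovered daily intensity.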

The framework introduced in the previous project cannot capture abrupt changes in the underlying process when they occur. To address this limitation, we propose a deep learning-based framework that forecasts events from historical data, making it possible to detect sudden changes and emerging trends. This is particularly valuable in fields such as criminology, where accurate crime intensity forecasts can help respond to rising crime rates and support preventive measures.
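
As a generic illustration of this kind of forecaster (the architecture, loss, and hyperparameters are my own toy choices, not the dissertation's model), a small recurrent network can read a window of past interval counts and predict the event rate in the next interval:

```python
import torch
import torch.nn as nn

class CountForecaster(nn.Module):
    """LSTM that maps a window of past counts to the next interval's event rate."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                      # x: (batch, window, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :]).exp()  # positive predicted rate

# Toy weekly counts with a rising trend.
torch.manual_seed(0)
series = torch.poisson(torch.linspace(4.0, 15.0, 120))
window = 8
X = torch.stack([series[i:i + window] for i in range(len(series) - window)]).unsqueeze(-1)
y = series[window:].unsqueeze(-1)

model = CountForecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.PoissonNLLLoss(log_input=False)    # count-appropriate loss

for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

print("next-interval forecast:", model(series[-window:].view(1, window, 1)).item())
```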

Together, these projects showcase practical applications of data science and its impact on everyday life. The research advances our understanding of how to model interval-censored data and lays the groundwork for improved methodologies and insights across a range of domains.