Utilizing Statistical Models and Machine Learning Techniques to Determine Crash Related Factors and Predicting High-Risk Segments for Roads with Limited Data Availability
This doctoral research focuses on four tasks: determining the most significant crash factors affecting vulnerable road users through statistical analyses and machine learning algorithms, identifying high-risk routes, analyzing how crash factors related to human behavior differ by the gender of the person involved, and proposing a model to determine high-risk segments for roads without crash data. Pedestrians and bicyclists are less protected than vehicle occupants in traffic crashes and are prone to more severe injuries. Several environmental, temporal, human, and road-related factors affect the frequency and severity of pedestrian and bicycle crashes. Understanding how these factors affect crash frequency and severity, determining which of them are most significant, and identifying high-risk zones and routes through spatial analysis should therefore help policymakers adopt effective countermeasures and allocate limited resources efficiently. Understanding how behavior-related crash factors vary with the gender of the persons involved might also help in designing countermeasures for targeted audiences. For cities or states where crash data are unavailable or scarce, identifying the most significant factors and the high-risk routes or zones can prove challenging. This study proposes a model, based on data available from an online application, to determine high-risk road segments in areas without crash data.
Chapter 1, the introduction, provides an overview of the context of the study and describes the study area and current conditions. Chapter 2 provides a detailed literature review on how several factors affect pedestrian and bicycle crash frequency and severity, how gender affects behavior-related factors, and how statistical analysis can be used in crash data analyses. Chapter 3 explains the methodology of the studies and introduces the mathematical background of the statistical models. Chapter 4 presents the results of the analyses and describes the most significant factors and the high-risk routes. Chapter 5 applies several less commonly used statistical models to assess their efficiency in crash analysis and in identifying the most significant factors, and it proposes a model to determine the crash severity risk of road segments without crash data. Chapter 6 summarizes the key findings of this research and presents recommendations.