By Sachin Kumar and Durga Toshniwal
Journal of Big Data 7 May 2016
Road and traffic accidents are a major concern around the world. Road accidents not only affect public health through injuries of varying severity but also result in property damage. Data analysis can identify the factors behind road accidents, such as traffic, weather, and road characteristics. A variety of research on road accident data analysis has already proved its importance. Some studies focused on identifying factors associated with accident severity, while others focused on the factors behind accident occurrence. These studies used traditional statistical methods as well as data mining methods. Data mining is a frequently used method for analyzing road accident data in current research. Trend analysis is another important research area in the road accident domain; it can assist in identifying increasing or decreasing accident rates in different regions. In this study, we propose a method to analyze hourly road accident data from Gujarat state in India using the cophenetic correlation coefficient. The aim of this study is to provide an efficient way to choose the most suitable distance metric for clustering series of count data, so as to obtain a better clustering result. The results show that the proposed method can efficiently group districts with similar road accident patterns into a single cluster, which can then be used for trend analysis or similar tasks.
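The idea of choosing a distance metric by its cophenetic correlation coefficient can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rank_metrics`, the candidate metrics, the average-linkage setting, and the synthetic hourly counts for six hypothetical districts are all assumptions made for illustration. For each candidate metric, the sketch builds a hierarchical clustering with SciPy and measures how well the resulting dendrogram preserves the original pairwise distances; a coefficient closer to 1 indicates a more faithful clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist


def rank_metrics(series, metrics=("euclidean", "cityblock", "correlation"),
                 method="average"):
    """Return {metric: cophenetic correlation} for clustering the rows of `series`.

    The metric list and the linkage method are illustrative assumptions.
    """
    scores = {}
    for metric in metrics:
        dists = pdist(series, metric=metric)   # condensed pairwise distances
        tree = linkage(dists, method=method)   # agglomerative clustering
        coeff, _ = cophenet(tree, dists)       # dendrogram vs. original distances
        scores[metric] = float(coeff)
    return scores


# Synthetic data (not from the study): 6 hypothetical districts x 24 hourly counts.
rng = np.random.default_rng(0)
hours = np.arange(24)
evening_peak = 20 * np.exp(-((hours - 18) ** 2) / 18.0) + 2
night_peak = 20 * np.exp(-((hours - 2) ** 2) / 18.0) + 2
counts = np.vstack(
    [rng.poisson(evening_peak) for _ in range(3)]
    + [rng.poisson(night_peak) for _ in range(3)]
)

scores = rank_metrics(counts)
best = max(scores, key=scores.get)  # metric with the highest coefficient
print(scores, best)
```

In this sketch the metric with the highest coefficient would be kept for the final clustering of districts; with real data, the set of candidate metrics and the linkage method would need to be chosen to suit the count series at hand.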