
Data Driven Business Solutions

We specialise in Data Science and AI solutions for complex engineering and business problems.

  • Varun Rao

Towards Zero - fact or fiction?

At Deep Blue AI, we are a group of old friends who met while four-wheel driving and bonded over a common love of all things motoring. Given that four-wheel driving is an inherently dangerous hobby, safety is never far from our minds. Over our many casual conversations we have often discussed vehicle accidents, both on and off road. While data for the latter is mostly anecdotal and sparse, there is a wealth of public domain data related to on-road accidents.

Curiosity aside, there is a very human element to road safety. At the time of writing, there were fewer than 100 deaths due to COVID-19 in Australia, compared to approximately 200 deaths per year due to traffic accidents in Victoria alone. Granted, that's not a fair comparison - deaths due to COVID-19 have been greatly minimised thanks to a massive international cooperative effort resulting in punitive economic costs, while road safety campaigns generally operate in business-as-usual scenarios. Nevertheless, a straightforward comparison of road accident statistics would be a useful way to gauge the efficacy of road safety programmes.

Road safety is a key government responsibility and is therefore the focus of significant attention and funding, such as the 'Towards Zero' initiative launched by Victorian government agencies in 2016 [1]. Given the complex nature of road accidents, detailed data analysis is required to ensure that resources are allocated appropriately and desired outcomes are achieved.

This article uses open-domain data sources to analyse road accidents in the period July 2013 to March 2019 in the state of Victoria. Road accident data is sliced by various features of importance, such as geographical location, time, severity and speed zones. Open-domain weather observations were also analysed to investigate the effect of rainfall on accident likelihood. Results are presented as visualisations using Tableau.

Data pre-processing

VicRoads Accident Statistics: Vehicular accident statistics for roads in Victoria are compiled by the state government agency VicRoads. This paper uses data on 74,400 accidents and 1,413 fatalities over the period July 2013 to March 2019 [2]. Metadata is available at [3].

Bureau of Meteorology Weather Observations: Historical rainfall and temperature observations were obtained from the Bureau of Meteorology [4] for the period Q3 2015 to Q2 2016; metadata [5] and technical references [6] are also available. The data consists of one-minute values that have been aggregated into hourly values for the 518 Automatic Weather Stations (AWS) shown in Figure 1. Of these, the 73 highlighted stations in Victoria were considered for this study.

Figure 1: All weather stations from the dataset. Victorian stations are highlighted red.

Data wrangling

Timestamps for accident records were rounded to the nearest hour for matching with the hourly weather observations. Each accident record was then matched to a weather station within a radius of 10 km, as shown in Figure 2; records without a matching weather station were discarded. Although not an exact match, each accident record is therefore paired with the weather conditions observed within 30 minutes and 10 km of the accident.

Figure 2: Matching of accidents to weather stations
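
The matching procedure described above can be sketched in Python as follows. This is a minimal illustration and not the code used for the article: the function names, the layout of the station dictionary and the choice of the nearest station inside the radius are assumptions. Only the nearest-hour rounding and the 10 km radius come from the text.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
MAX_DISTANCE_KM = 10.0     # matching radius stated in the text


def round_to_nearest_hour(ts: datetime) -> datetime:
    """Round a timestamp to the nearest whole hour (half hours round up)."""
    floored = ts.replace(minute=0, second=0, microsecond=0)
    if ts - floored >= timedelta(minutes=30):
        return floored + timedelta(hours=1)
    return floored


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(lat: float, lon: float, stations: dict):
    """Return (station_id, distance_km) for the closest station within
    MAX_DISTANCE_KM, or None when no station is close enough.

    `stations` maps a hypothetical station id to a (lat, lon) tuple.
    Picking the *nearest* station is an assumption; the article only says
    the station lies inside the 10 km radius.
    """
    best = None
    for station_id, (s_lat, s_lon) in stations.items():
        distance = haversine_km(lat, lon, s_lat, s_lon)
        if distance <= MAX_DISTANCE_KM and (best is None or distance < best[1]):
            best = (station_id, distance)
    return best
```

An accident record would be kept only when nearest_station returns a match, and its rounded timestamp would then be used to look up the hourly observation for that station.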

Results and Discussion

Geographical locations

Figure 3 shows the distribution of accidents in the area surrounding Melbourne. The size of individual points represents the number of accidents and the colours represent speed limits. The dense pockets of low-speed zones in Melbourne city and its inner suburbs are clearly visible, as is Geelong to the south-west. Accidents in high-speed areas (>= 80 km/hr) appear to be distributed reasonably uniformly along major arterial roads and highways, although a few clusters indicate accident hotspots. The bulk of the data consists of low-speed accidents in relatively densely populated areas.

Western suburbs tend to be under-represented in the accident data, although normalisation of this data by traffic density would be necessary to place this in context.

Figure 3: Geographical location of accidents with size indicating number of accidents and colours representing speed zones

Temporal distribution of accidents

Figure 4 shows the distribution of accidents per year and per region; years 2013 and 2019 have been excluded as data was not collected for the entire year in either case. Unsurprisingly, the two metropolitan regions account for the bulk of accidents.

There is clearly a significant reduction (26%) in accidents over the four years considered. It is also apparent that the reductions occurred roughly uniformly over all the regions considered.

Figure 4: Accidents per year by region

Figure 5 shows the variation in the light condition reported for accidents from 2014-2018. The most obvious feature of this visualisation is the disproportionate reduction in accidents at Dusk/Dawn, which reduced by a factor of nearly 5 over this period. Most accidents take place in the day, or with street lights on.

Figure 5: Accidents per year by light condition

Figure 6 shows a different temporal slice through the data. The graph shows accidents per month with the size of each stack representing the number of accidents and the colour representing the speed zone; data has been aggregated over years 2014-2018. 60 km/hr speed zones are clearly the dominant contributor across all the months. Particular months of interest are September and March as the best and worst months respectively.

Figure 6: Accidents by month by speed limit

Speed zones

Speed limits are an important contributor to road safety initiatives; it is estimated that excessive or inappropriate speed has been involved in 29% of Victorian accidents since 2008 [7]. However, determination of the actual vehicle speed for a large range of accidents was not feasible, so the accidents dataset only provided the speed zone corresponding to each accident rather than the actual vehicle speed. Data from mobile speed cameras suggests that only a small minority (1.5-3%) of drivers were assessed at driving 10 km/hr or more over the speed limit in the period 2001-2011 [8]. In view of these results, the following section assumes that drivers generally obey the speed limit, and that the speed limit is a useful surrogate for the actual speed, although this will not be strictly true for a small number of cases.

Figure 7 shows the number of accidents by speed limit and also road geometry. 60 km/hr zones are by far the greatest contributor to accidents, and most of these accidents occur at some sort of intersection.

Figure 7: Accidents by speed limit and road geometry

The data in the previous figure can be normalised to allow easy comparison of the relative importance of road geometry to the number of accidents in the various speed zones, as in Figure 8. The proportion of accidents taking place at intersections of some kind increases with speed limit up to 60 km/hr, after which it decreases until intersections are a marginal contributor at 100 km/hr and above; this is unsurprising because 100 km/hr roads are generally highways or freeways and use entry/exit ramps rather than intersections. The 60 km/hr zone is of particular interest, as it has by far the greatest number of accidents (Figure 7). In this zone, accidents at Cross intersections and T intersections dominate the distribution, accounting for nearly 60% of the total number of accidents. 34% of funding for the Safe Roads part of the Towards Zero campaign has been earmarked for improvement of intersections.

Figure 8: Normalised accidents by speed limit and road geometry
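
The normalisation behind Figure 8 amounts to dividing each count by the total for its speed zone. A minimal Python sketch follows, assuming counts keyed by (speed zone, road geometry); the function name is hypothetical and the example counts are invented for illustration, not taken from the VicRoads data.

```python
from collections import defaultdict


def normalise_within_group(counts: dict) -> dict:
    """Convert counts keyed by (group, category) into shares of each group's total.

    Here the group is the speed zone and the category is the road geometry,
    so the shares for every speed zone sum to 1.
    """
    totals = defaultdict(int)
    for (group, _category), n in counts.items():
        totals[group] += n
    return {
        (group, category): n / totals[group]
        for (group, category), n in counts.items()
        if totals[group] > 0
    }


# Invented counts, for illustration only (NOT from the accident dataset).
example_counts = {
    (60, "Cross intersection"): 30,
    (60, "T intersection"): 30,
    (60, "Not at intersection"): 40,
    (100, "T intersection"): 1,
    (100, "Not at intersection"): 9,
}
shares = normalise_within_group(example_counts)
```

The same helper could be applied to any of the normalised charts in this article, for example by grouping on day of the week instead of speed zone.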

The severity of an accident is another important metric to consider. Intuitively, lower speed zones should result in less severe accidents. Figure 9 shows the distribution of accident severity across speed zones with expected results. At speeds below 60 km/hr, over 70% of accidents resulted in only minor injuries. Above 60 km/hr, this number drops steadily - 67% at 80 km/hr and 56% at 100 km/hr. Of most concern is the steadily increasing proportion of fatalities with increasing speed zone, rising to 5% at 100 km/hr and above. Ongoing initiatives such as installation of safety barriers and rumble strips on high speed regional roads will likely improve this statistic in the coming years.

Figure 9: Accidents by speed limit and severity


Figure 10 shows the number of fatalities by year (left) as well as the total number of accidents (right); 2013 and 2019 have been excluded as data was not recorded for the entire year. 2016 was clearly a poor year by this metric, although it is likely that the absolute number of accidents would need to be normalised by the total number of registered vehicles or total mileage driven to account for these extraneous factors. The trend from 2016 (the year that the Towards Zero campaign started) is clearly improving. Previous data for this statistic from the TAC website is also provided in Figure 11 to provide historical context. In the years prior to 2009, there were approximately 315-340 fatalities per year. In view of these historical figures, the decrease to approximately 200 per year is impressive, particularly given Victoria's population increase over these years [9].

Figure 10: Fatalities (left) and total number of accidents (right) by year

Figure 11: Fatalities per 12 months from 2006 to 2011 from TAC [8]


Alcohol-related traffic accidents were estimated to cost Australian society approximately $3.6bn in 2013 [10], with the majority of these related to human costs rather than property damage. In 2016 alone, 34 fatalities involved drivers and riders with illegal blood alcohol concentrations [11]. Figure 12 shows the distribution of alcohol related accidents by day of the week from the dataset. Given that alcohol consumption is frequently associated with leisure, the dramatic increase in the number of records over Friday, Saturday and Sunday is well explained. Most worryingly, serious injuries and fatalities are fairly common outcomes for this type of accident.

Figure 12: Daily distribution of alcohol related accidents

Figure 13 shows defined Alcohol Times, which are denoted by the TAC as periods when casualty rates are at least ten times more likely to involve alcohol than at other times [8].

Figure 13: TAC-defined alcohol times shown in dark grey [8]

Figure 14 shows the distribution of records that occurred during the defined Alcohol Time by day of the week. Each column has been normalised by the total number of accidents for that day to allow for easy visual comparison. The prevalence of alcohol-related accidents during the defined Alcohol Time is clearly highest on Friday (> 80%) and the weekend (>= 90%), whereas for the rest of the week it is significantly lower. When read in combination with the large number of alcohol-related accidents from Friday to Sunday (Figure 12), the data suggests that the definition of Alcohol Time is appropriate.

Figure 14: Distribution of accidents that occurred during the defined alcohol time.

Influence of rain

Rainfall is anecdotally associated with a greater chance of accidents. Weather observation data for the period May 2015-April 2016 was obtained from Bureau of Meteorology observations and merged with accident data from this period only; details of the data wrangling methodology are given in an earlier section. Further details of the spread of weather data available are given in Appendix A. Figure 15 shows the locations of the Victorian weather stations used for this investigation. A threshold of 0.2 mm/hr was used to indicate the presence of rainfall.

Figure 15: Locations of the Victorian weather stations

The left image in Figure 16 shows all accidents that could be matched to a weather observation (using the method described earlier) in the period Q3 2015 to Q2 2016. The colour indicates the presence or absence of rain. 518 of the 8,848 accidents occurred in the presence of rain, corresponding to approximately 6% of the total observations.

The impact of rainfall on the likelihood of an accident occurring can be measured by comparison to a control set, such as the right image shown in Figure 16. This image shows the distribution of all weather observation records with colour again indicating the presence or absence of rain. 6,285 of the 96,003 observations indicated rainfall, corresponding to about 6%.

If rain was a significant contributor to increased accident likelihood, one would expect a higher percentage of rain events in the left image (accident observations only) than in the right (all observations). However, given that 6% of accidents occur while it rains, and rain is also indicated in 6% of weather observations, the data shows that rain does not appear to be a significant factor in increasing the likelihood of accidents. This is a surprising result given the anecdotal evidence to the contrary, but further investigation of this is beyond the scope of this article.

Figure 16: Distribution of rain vs non-rain observations for accident data only (left) and for all weather observations (right)
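
The comparison above is simple arithmetic and can be reproduced with a few lines of Python. The sketch below uses only the counts and the 0.2 mm/hr threshold quoted in the text. The function names are hypothetical, and treating the threshold as inclusive (>=) is an assumption, since the article does not say which way the boundary falls.

```python
RAIN_THRESHOLD_MM_PER_HR = 0.2  # threshold stated in the text


def is_raining(rainfall_mm_per_hr: float) -> bool:
    """Flag an hourly observation as rain.

    Assumption: the threshold is inclusive; the article does not specify.
    """
    return rainfall_mm_per_hr >= RAIN_THRESHOLD_MM_PER_HR


def rain_share(rain_count: int, total_count: int) -> float:
    """Fraction of records flagged as rain."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return rain_count / total_count


# Counts quoted in the article.
accident_share = rain_share(518, 8_848)      # accidents matched to weather
baseline_share = rain_share(6_285, 96_003)   # all weather observations

# Ratio of the two shares: values near 1 mean rain hours are no more
# common among accidents than among all observations.
ratio = accident_share / baseline_share

print(f"rain share among accidents:    {accident_share:.2%}")
print(f"rain share among observations: {baseline_share:.2%}")
print(f"ratio:                         {ratio:.2f}")
```

With the article's counts the ratio comes out a little below 1, which is in line with the conclusion that rain does not appear to increase the likelihood of accidents. This is a descriptive comparison only and not a formal statistical test.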


This article presents a detailed analysis of road accidents in Victoria using open-domain data sources from government agencies. As expected, the metropolitan areas accounted for the bulk of accidents. Accidents declined substantially over the period considered, with the largest contribution to the drop coming from accidents at dusk or dawn. Accidents at some sort of intersection in 60 km/hr zones were by far the most common type of record.

Speed zones above 60 km/hr showed increasing incidence of serious injuries as well as fatalities, with 5% of accidents in 100 km/hr zones proving fatal. The number of fatalities has however declined along with the overall number of accidents, particularly with respect to historical records beyond the scope of the current dataset.

Analysis of alcohol-related accidents reveals that these are most likely to occur from Friday to Sunday, and over 80% of these occur during the TAC-defined high-risk Alcohol Time, suggesting that this metric is well chosen. Surprisingly, the presence of rain was not found to be a significant contributor to the likelihood of accidents.


[1] Transport Accident Commission, VicRoads, Victoria Police, the Department of Justice and Regulation and the Department of Health and Human Services, Towards Zero, 2016. Accessed: 2020-03-09.

[2] VicRoads, Crashes last five years, au/dataset/crashes-last-five-years1, 2019. Accessed: 2020-03-02.

[3] VicRoads, Crashes last five years - open data, 2019. Accessed: 2020-03-02.

[4] Australian Bureau of Meteorology, Historical rainfall and temperature forecast and observations hourly data - weather forecasting verification data (2015-05 to 2016-04). Raw data.

[5] Australian Bureau of Meteorology, Metadata for historical rainfall and temperature forecast and observations hourly data - weather forecasting verification data (2015-05 to 2016-04). Raw data, 2016. Accessed: 2020-03-06.

[6] Australian Bureau of Meteorology, Metadata for historical rainfall and temperature forecast and observations hourly data - weather forecasting verification data (2015-05 to 2016-04). Raw data, 2016. Accessed: 2020-03-06.

[7] TAC, Transport Accident Commission, 2020. Accessed: 2020-03-09.

[8] TAC, Road Safety Statistical Summary, Technical Report, Transport Accident Commission, 60 Brougham Street, Geelong Vic 3220, 2011 [link].

[9] Victoria State Government, Land Use and Population Research. Accessed: 2020-04-19.

[10] M. Manning, C. Smith, P. Mazerolle, The societal costs of alcohol misuse in Australia [online], Trends and Issues in Crime and Criminal Justice (2013) 1-6 [link].

[11] TAC, Transport Accident Commission, 2020. Accessed: 2020-03-09.

Appendix A

Weather data comes from [4]. It was important to ensure that the observations covered the range of conditions likely to be experienced in a typical year. Temperature is also a useful surrogate for this purpose as it is more familiar to the average person than hourly precipitation. Figure A.18 shows the distribution of maximum hourly temperatures in Victoria by quarter.

Higher temperatures in Q1 and Q4 indicating summer are clearly represented in the raw data, while winter temperatures are most prominently displayed in Q3.
