Data Story

Link to GitHub Repository: thomasnilsson/02806/tree/master/PROJECT

Link to Explainer Page: Explainer page

Are most of the collisions in New York City caused by tired drivers coming back from work?

More than 8 million people live in New York City, which consists of five boroughs: Brooklyn, Queens, Manhattan, the Bronx, and Staten Island. In movies, NYC is always portrayed as a city full of busy people who are all in a hurry, with crowded streets, heavy traffic, and aggressive taxi drivers. Every year, thousands of accidents occur due to a multitude of contributing factors. Are drunk drivers the most common cause? Where do most accidents happen? At what time of day, and why? Do certain areas get more dangerous at certain times of the year? Let’s find answers to these questions using the visualisations presented in this article.

We used the NYPD's Motor Vehicle Collisions dataset, which is provided by the NYC Open Data repository. This dataset was created for the Vision Zero initiative, a multi-national road traffic safety project that aims to reduce the number of fatalities and serious injuries in road traffic. The dataset has about 1.24 million rows with 29 columns each, covering July 2012 to April 2018.

Firstly, let’s have a look at how collisions have been distributed over the area of NYC during the last six years. Where are the most dangerous spots? The choropleth below presents the distribution of accidents over zip codes and allows us to look at a specific period of time. At first glance, we can see that the main hotspots are East New York (Brooklyn), Dumbo (Brooklyn), Long Island City (Queens), and Midtown (Manhattan). What can be the cause? These are either areas with a very large population or very poor neighbourhoods with poor infrastructure [1].
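For readers who want to reproduce the numbers behind such a choropleth, the aggregation can be sketched as follows. This is a minimal sketch, not the code used for the visualisation: the column names "DATE" and "ZIP CODE", the MM/DD/YYYY date format, and the sample rows are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date, datetime

# Hypothetical sample rows. The column names "DATE" and "ZIP CODE" and the
# MM/DD/YYYY date format are assumptions about the CSV export.
SAMPLE_ROWS = [
    {"DATE": "07/01/2012", "ZIP CODE": "11207"},
    {"DATE": "03/15/2015", "ZIP CODE": "11207"},
    {"DATE": "03/16/2015", "ZIP CODE": "11201"},
    {"DATE": "04/01/2018", "ZIP CODE": ""},  # zip code missing
]


def collisions_per_zip(rows, start, end):
    """Count collisions per zip code for dates in the closed range [start, end]."""
    counts = Counter()
    for row in rows:
        zip_code = row["ZIP CODE"].strip()
        if not zip_code:
            continue  # records without a zip code cannot be placed on the map
        day = datetime.strptime(row["DATE"], "%m/%d/%Y").date()
        if start <= day <= end:
            counts[zip_code] += 1
    return counts


# Select the year 2015, as a brushed timeline period might.
zip_counts = collisions_per_zip(SAMPLE_ROWS, date(2015, 1, 1), date(2015, 12, 31))
print(zip_counts.most_common())
```

Records without a zip code are skipped here, which is one possible choice; it means the map can undercount collisions whose location was not recorded.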

You may be wondering when traffic accidents are more likely to happen: at night, when the lighting may be poor and more drunk drivers are on the roads? Or in the morning, when people are commuting to work, sleepy and likely stressed about arriving on time?

The histograms show the distribution of collisions, as well as of fatalities (grouped by victim type: car drivers, cyclists, pedestrians), over the course of a day. Clearly, the number of accidents starts to climb as the city wakes up around 7 AM. However, the distribution does not peak in the morning but in the late afternoon, around 4-5 PM. Why is this so? A typical employee works from 9 to 5, meaning that a disproportionate number of people commute back from work around 5 PM, probably very tired and thus vulnerable to distractions. Are distractions indeed one of the most important causes of collisions?
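The hourly histogram boils down to counting collisions per hour of the day. The sketch below illustrates the idea on a handful of made-up times; the "H:MM" 24-hour format of the time values is an assumption, and the sample is not taken from the dataset.

```python
from collections import Counter

# Hypothetical collision times. The "H:MM" 24-hour format is an assumption
# about how the dataset stores the time of each collision.
SAMPLE_TIMES = ["7:05", "16:30", "16:45", "17:10", "0:15", "16:00"]


def collisions_per_hour(times):
    """Return a 24-element list with the number of collisions in each hour."""
    counts = Counter(int(t.split(":")[0]) for t in times)
    return [counts.get(hour, 0) for hour in range(24)]


hourly = collisions_per_hour(SAMPLE_TIMES)
# Hour of the day with the most collisions (the earliest hour wins a tie).
peak_hour = max(range(24), key=lambda hour: hourly[hour])
print(peak_hour, hourly[peak_hour])
```

Returning a fixed list of 24 bins, rather than only the hours that occur, keeps empty hours visible as zero-height bars.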

The bar chart named "Frequently Reported Causes for Motor Vehicle Incidents (2012-2018)" describes the distribution of accidents by contributing factor. It turns out that most reported causes involve not drunk drivers but tired, overworked people getting back from work. They get distracted very easily, which increases their risk of being involved in a collision.
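A ranking of contributing factors like the one in the bar chart can be sketched as follows. The factor labels and the choice to ignore empty and "Unspecified" entries are assumptions for illustration, and the sample values are invented rather than drawn from the dataset.

```python
from collections import Counter

# Hypothetical contributing-factor values; the labels are assumptions about
# how the dataset names its categories.
SAMPLE_FACTORS = [
    "Driver Inattention/Distraction",
    "Unspecified",
    "Driver Inattention/Distraction",
    "Fatigued/Drowsy",
    "Alcohol Involvement",
    "",
    "Driver Inattention/Distraction",
    "Fatigued/Drowsy",
]


def rank_causes(factors, ignore=("", "Unspecified")):
    """Return (cause, count, share) tuples, most frequent cause first.

    The share is relative to the specified causes only, so entries listed in
    `ignore` do not dilute the percentages.
    """
    counts = Counter(f for f in factors if f not in ignore)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(cause, n, n / total) for cause, n in counts.most_common()]


ranked = rank_causes(SAMPLE_FACTORS)
for cause, n, share in ranked:
    print(f"{cause}: {n} ({share:.0%})")
```

Whether unspecified entries are excluded changes the shares noticeably, so any percentage quoted from such a ranking should state which convention was used.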

Finally, we found that, in general, the larger the population of a borough, the more accidents occur there. Although traffic in NYC grows slightly each year, the number of accidents over the past few years shows no clear upward or downward trend, which can be a good sign. Nevertheless, it seems that tired people getting back from work are the main reason for many accidents. Should we suggest that Vision Zero focus on those people and find ways to help them?


[1] Forbes, Jun 5, 2014

Interactive Visualisations

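[Timeline and choropleth of collisions by zip code]
Tip: Brush the timeline to select a period.
Tip: Hover over a zip code to view stats.
Download Timeline Dataset
Download GeoJSON Dataset

[Histogram of people injured/killed by time of day]
Tip: Hover over a bar to view stats.
Download Injured/Killed Dataset

[Histogram of incidents by time of day]
Tip: Hover over a bar to view stats.
Download Incident Dataset

[Bar chart: Frequently Reported Causes for Motor Vehicle Incidents (2012-2018)]
Download Causes Dataset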