Intro to Data Visualization – Project 1: 311 Data

311 Data: Sanitation and Rodents

Is there a relationship between trash complaints and rodent sightings in NYC? And if so, has that relationship changed alongside the general upheaval caused by COVID-19?

New York’s Strongest, the workers of the New York City Department of Sanitation (DSNY), fight a never-ending battle against the rising tide of refuse that threatens to blanket the streets of our fair metropolis. Meanwhile, we share our city with an army of vermin living off our waste, multiplying their numbers. Rodents cause structural harm to our city’s infrastructure and are known as vectors of disease. And there is general agreement among the human population that the rodent population must be reduced. 

Anecdotally, it would seem that when trash is out of control, we should expect to see a spike in the rodent population. But is this the case? Once Sanitation is called in, can we expect a dip in rodent sightings soon after? And what effect, if any, did the changes spurred by COVID-19 (geographic changes for the City’s workforce, restaurant closures, the introduction of outdoor dining, etc.) have on trash and rodent complaints? What can changes in the location and volume of these complaints tell us about the effects of COVID-19 that we might have otherwise missed?

To explore and possibly answer some of these questions, I turned to the NYC Open Data Portal to access 311 Service Request data. I focused on the date range of Feb 1, 2018 – March 1, 2022, giving me roughly two years of data before COVID and two years since. I investigated two separate categories of “Complaint Type.” The first category covers complaints fielded by DSNY that pertain specifically to the identification or removal of trash, waste, and other hazardous materials (e.g. “Dirty Conditions,” “Missed Collection,” “Residential Complaint,” “Illegal Dumping,” etc.). Isolating these took extensive filtering of the many different 311 complaint types that DSNY fields. The second category is “Rodent,” i.e. rat and mouse sightings (I did some filtering here as well to ensure we were only dealing with actual sightings).
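A minimal Python sketch of this kind of filtering might look like the following. It is an illustration under stated assumptions, not the exact process behind this project: the field names `created_date` and `complaint_type`, the ISO date format, and the sample records are all hypothetical, and the set of sanitation complaint types is only the handful named above rather than the full filtered list.

```python
from datetime import date, datetime

# Complaint types named in this write-up; the real filtered list was longer.
SANITATION_TYPES = {
    "Dirty Conditions",
    "Missed Collection",
    "Residential Complaint",
    "Illegal Dumping",
}
RODENT_TYPES = {"Rodent"}

# Date range studied in the project.
START = date(2018, 2, 1)
END = date(2022, 3, 1)


def classify(record):
    """Return 'sanitation', 'rodent', or None for a single 311 record.

    Assumes hypothetical keys 'created_date' (ISO 8601 string) and
    'complaint_type'; a real export may name and format these differently.
    """
    created = datetime.fromisoformat(record["created_date"]).date()
    if not (START <= created <= END):
        return None
    complaint = record["complaint_type"]
    if complaint in SANITATION_TYPES:
        return "sanitation"
    if complaint in RODENT_TYPES:
        return "rodent"
    return None


# Made-up sample records, for illustration only.
sample = [
    {"created_date": "2019-06-03T09:15:00", "complaint_type": "Rodent"},
    {"created_date": "2021-11-20T18:40:00", "complaint_type": "Missed Collection"},
    {"created_date": "2017-12-31T23:59:00", "complaint_type": "Rodent"},
    {"created_date": "2020-04-01T08:00:00", "complaint_type": "Noise - Residential"},
]
labels = [classify(r) for r in sample]
```

Records outside the date range, or with a complaint type in neither group, come back as `None` and can simply be dropped before any counting or mapping.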

I visualized this data three ways: 
First, I created a line graph of both complaint groups over time. The goal here was to see if there was a relationship between unsanitary conditions and rodent sightings across the city, and to see how the number of complaints in each group changed over time. What we can see is that overall there are many more trash complaints, but there is also a possible relationship between the two groups, as the peaks and valleys of the two lines are generally aligned. We can also see that after an immediate dip, there has been a general upward trend in both types of request since the beginning of COVID.
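One way to put a number on how closely the peaks and valleys line up is to count requests per month for each group and compute a correlation coefficient between the two monthly series. The sketch below is a hedged illustration of that idea using made-up dates; the function names are hypothetical, and a high coefficient would still not rule out a shared trend in overall 311 volume.

```python
from collections import Counter
from math import sqrt


def monthly_counts(iso_dates):
    """Count requests per calendar month, keyed by 'YYYY-MM'."""
    return Counter(d[:7] for d in iso_dates)


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up request dates, for illustration only.
sanitation_dates = [
    "2020-01-05", "2020-01-20", "2020-01-28",
    "2020-02-11",
    "2020-03-02", "2020-03-15",
]
rodent_dates = ["2020-01-07", "2020-01-09", "2020-02-14", "2020-03-03"]

sanitation = monthly_counts(sanitation_dates)
rodent = monthly_counts(rodent_dates)

# Align the two series on the same sorted list of months;
# a Counter returns 0 for months with no requests.
months = sorted(set(sanitation) | set(rodent))
r = pearson([sanitation[m] for m in months], [rodent[m] for m in months])
```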

Next, I created a version of a density map for the entire City that overlays rodent sightings and sanitation requests. I attempted to make separate maps of requests pre-COVID and post-COVID in order to see a) where these requests are being made, b) if there is a geographic correlation between sanitation and rodent requests, and c) if the locations or volume of these requests changed after COVID. Unfortunately, I realized at the last minute that I was having date filtering issues with my two sets of data, so I opted for a single map of the requests over the entire date range. The biggest takeaway here is that sanitation requests seem to be more evenly distributed across the city, whereas the rodent sightings seem to be more concentrated in trouble areas.

To help further visualize the geographic trends and make them easier to read, I made a set of four heat maps broken down by neighborhood to see which neighborhoods had the highest volume of requests in each group, pre- and post-COVID. Here we can more easily see that Central Brooklyn, for instance, had a high volume of rodent sightings and sanitation requests both before and since the COVID shutdowns. So there does seem to be some geographic correlation between these requests, and (aside from a general trend upward since COVID as shown in the line graphs) not much geographic change in the trouble areas over time.

Working with the Data: Design, Limitations, and Beyond 
As mentioned earlier, much of the data cleaning I did was to home in on the specific requests that seemed most relevant to this question. Once I had good data sets and was able to begin analyzing them, I paired them with additional geographic information on neighborhoods, and added date range filters to further isolate the information of interest. In terms of design, I picked the color green to represent the DSNY, as that agency is already identified by that color. I first picked a subdued red to represent rodents, both as a sign of mild threat and to stand out clearly against the complementary green. But after being advised that no one wants to think about rodents and Christmas at the same time (The Nutcracker, anyone?), I switched to a purple tone that would still be easily visible against the green. I also tried to keep my visualizations as simple as possible to make them easier to understand, with little to no unnecessary visual or textual information.

There are some limitations on what this data can show. For instance, it is unclear whether the relationship shown by the line graph reflects a correlation between these two particular sets of requests, or simply an overall trend in the volume of all 311 requests. The steep drop-off in requests for both categories in July 2019 seems to suggest it could, in some part at least, be the latter. Another limitation I found was that because I broke my heat maps into separate maps, the saturation of color is consistent within each of the four maps, but not across them. That is to say, the color of each map is based on the percentage of calls for that map alone. So a neighborhood may appear darker on one map than it would on another if the total volume of calls for that time period or request group was lower. The data itself is also limited. Just because trash is piling up doesn’t mean a request is being dialed in, and similarly, if every New Yorker who saw a rodent called 311, it seems like the phone would never stop ringing.
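The color-scale limitation can be illustrated with a short Python sketch. It compares scaling each map on its own against scaling all maps on one shared range. The neighborhood counts are made up, the function names are hypothetical, and dividing by the maximum is a simplifying stand-in for however the mapping tool actually assigns color.

```python
def per_map_scale(counts):
    """Scale one map's counts to 0..1 using that map's own maximum."""
    top = max(counts.values())
    return {hood: n / top for hood, n in counts.items()}


def shared_scale(maps):
    """Scale every map's counts to 0..1 using one maximum across all maps."""
    top = max(n for counts in maps.values() for n in counts.values())
    return {
        name: {hood: n / top for hood, n in counts.items()}
        for name, counts in maps.items()
    }


# Made-up counts, for illustration only: volume doubles after COVID.
maps = {
    "rodent_pre": {"Central Brooklyn": 400, "Midtown": 100},
    "rodent_post": {"Central Brooklyn": 800, "Midtown": 200},
}

# Scaled separately, the two maps look identical despite the doubling.
separate = {name: per_map_scale(counts) for name, counts in maps.items()}

# Scaled together, the pre-COVID map is visibly lighter.
together = shared_scale(maps)
```

A shared scale probably makes the most sense within one complaint group (pre versus post), since sanitation requests are so much more numerous that putting all four maps on one range would wash out the rodent maps.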

There are a few ways to take this data further. First, I would implement some controls by testing other variables to see if the relationship between sanitation requests and rodent sightings holds up. I would also like a more granular view of some of this data. For instance, it would be interesting to look at the specific times of requests in a specific month in a specific neighborhood to see if rodent sightings come before or after sanitation requests are made, or to see if there is a drop-off in rodent sightings after a sanitation request is “closed.” Looking at the micro level would be the best way to track these particular relationships. Additionally, the 311 data contains no sociological data, but it would be illuminating to see if there is a correlation between economic factors (like income or evictions) and the areas of the City being underserved by Sanitation or overrun by rodents.
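The before-or-after question could be approached with a lagged correlation: shift one series against the other by a number of periods and see which shift gives the strongest match. The following is a rough sketch of that idea, not an analysis I have run. The weekly counts are made up, the function names are hypothetical, and only non-negative lags (rodent sightings trailing sanitation requests) are handled.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(leading, trailing, lag):
    """Correlate leading[t] with trailing[t + lag] for lag >= 0.

    Both series are assumed to cover the same periods in the same order.
    """
    if lag < 0 or lag >= len(leading) - 1:
        raise ValueError("lag must be between 0 and len(series) - 2")
    xs = leading[: len(leading) - lag]
    ys = trailing[lag:]
    return pearson(xs, ys)


# Made-up weekly counts for one neighborhood; the rodent series is built
# to echo the sanitation series one week later.
sanitation_weekly = [5, 9, 4, 7, 3, 8]
rodent_weekly = [2, 5, 9, 4, 7, 3]

scores = {lag: lagged_correlation(sanitation_weekly, rodent_weekly, lag)
          for lag in range(3)}
best_lag = max(scores, key=scores.get)
```

With real data the pattern would be far noisier, and a strong score at some lag would suggest timing worth investigating rather than prove that one kind of request drives the other.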