Visualizing Chicago

A web application representing the Windy City

Introduction to the Software and Its Features

Welcome

Hello and welcome! My name is Dane DeSutter, and this is my fledgling effort to turn data about the Windy City into an interactive web application. The application is written with browser-based JavaScript technologies, so anyone can use it. Below I describe the interface and the data wrangling that went into building it.

About the Interface

The interface is laid out to feel comfortable for anyone who already uses a web browser to get work done. Let’s inspect the different parts of the interface, how they interact, and how you can best take advantage of them.

On the left, we have the map region, which shows a map of Chicago.

Map Pane

"The left pane shows a map of Chicago. Federal census data recognizes both the sides of Chicago as well as the community areas."

Chicago can be divided up along many criteria. Federal census data recognizes both the sides of Chicago and the community areas. The sides are simply clusters of the smaller community areas. In this map, each side (for example the North Side, South Side, West Side, and so on) is color-coded for quick visual inspection. A legend with the color conventions is provided to the left of the map.
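To give a rough sense of how this coloring could be drawn, here is a minimal d3 sketch. It assumes d3 v5 or later, an existing svg selection and width/height for the map pane, a hypothetical GeoJSON file whose features carry side and community properties, and a sideColor ordinal scale over the side names:

```js
// Draw the community areas, filled by the side each one belongs to (sketch only).
d3.json("chicago_community_areas.geojson").then(areas => {
  const projection = d3.geoMercator().fitSize([width, height], areas);
  const path = d3.geoPath().projection(projection);

  svg.selectAll("path")
      .data(areas.features)
      .enter().append("path")
      .attr("class", "community")
      .attr("d", path)
      .attr("fill", f => sideColor(f.properties.side))  // color-coded by side
      .attr("stroke-dasharray", "4,2");                 // dashed community-area borders
});
```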

Map Legend
Community Areas

Within each side of Chicago, there are also many community areas. These are visually separated by dashed borders and a mouse-hover animation.

Maps can sometimes get cluttered with too much information and become difficult to use, so we also help you find community area names with another set of mouse-hover animations. Note that each community area name appears both as a tooltip and in the larger callout.
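Here is a minimal sketch of how this hover behavior could be wired up, assuming d3 v5 (where d3.event is available) and the .community class used for the map paths in the earlier sketch; properties.community is a hypothetical name for the attribute that holds each area's name:

```js
// Show each community area's name in a tooltip <div> on hover (sketch only).
const tooltip = d3.select("body").append("div")
    .attr("class", "tooltip")
    .style("position", "absolute")
    .style("opacity", 0);

svg.selectAll(".community")
    .on("mouseover", function(d) {
      tooltip.style("opacity", 1)
             .text(d.properties.community);        // hypothetical name attribute
      d3.select(this).classed("highlighted", true); // hover animation via a CSS class
    })
    .on("mousemove", function() {
      tooltip.style("left", (d3.event.pageX + 10) + "px")
             .style("top",  (d3.event.pageY + 10) + "px");
    })
    .on("mouseout", function() {
      tooltip.style("opacity", 0);
      d3.select(this).classed("highlighted", false);
    });
```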

Tooltip

Selecting areas for which you would like to view data is as simple as clicking the name of the region in the area list. This list includes both the sides of Chicago and each community area. Find the name of the area you are interested in, click it in the list, and watch as data about it is displayed in three different formats: a pie chart, a bar chart, and a table.
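As one possible shape for the pie chart piece of this, here is a short d3 sketch (v4 or later). The race and count field names are hypothetical, and color stands in for a shared ordinal color scale like the one sketched below in the Data Pane discussion:

```js
// Render a pie chart of the selected area's breakdown (sketch only).
function drawPie(svg, data) {               // data: [{race: "...", count: ...}, ...]
  const radius = 100;
  const pie = d3.pie().value(d => d.count);
  const arc = d3.arc().innerRadius(0).outerRadius(radius);

  const g = svg.append("g")
      .attr("transform", "translate(" + radius + "," + radius + ")");

  g.selectAll("path")
      .data(pie(data))
      .enter().append("path")
      .attr("d", arc)
      .attr("fill", d => color(d.data.race)); // same colors as the bar chart and table
}
```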

Data Pane

We also employed a set of empirically validated color mappings (http://colorbrewer2.org/) for quickly distinguishing between pieces of qualitative data. These mappings aid the user in quickly connecting multiple representations of the same data set. The multiple formats become especially helpful when one representation has outlived its usefulness.
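A minimal sketch of this kind of mapping, assuming d3 v4 or later and one of ColorBrewer's qualitative palettes (the categories and the exact palette the application uses may differ):

```js
// One ordinal color scale, reused by the map, pie chart, bar chart, and table
// so that the same category always gets the same color.
const color = d3.scaleOrdinal()
    .domain(["White", "Black", "Asian", "Hispanic", "Other"])        // hypothetical categories
    .range(["#66c2a5", "#fc8d62", "#8da0cb", "#e78ac3", "#a6d854"]); // ColorBrewer "Set2"

// Example: color("Asian") returns "#8da0cb" in every representation.
```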

If you want to compare multiple regions in the city, simply select them in series from the drop-down list. Regions can also be easily removed by unchecking them in the Select Area pane.

Thank you for previewing this web application. We look forward to discovering Chicago with you!

Data Wrangling

Data reduction is an important aspect of any visualization project. It was important for me to really understand how I was going to parse the data file so I could visualize it effectively with d3.

My data wrangling process went down many avenues before I finally came to a solution that worked for me. The majority of this work centered on wrangling the census data. I started out by:

  1. Grabbing the full .XLSX data file from http://robparal.blogspot.com/2012/05/hard-to-find-census-data-on-chicago.html
  2. Then I removed individual census tracts from the data set by deleting their rows. An important decision in any data re-representation task is deciding how large the “units” within the data set should be. At the outset of this project, it was clear that the task was to look for trends at the community level. As such, it was important to keep the data that is relevant at the community level, while deleting (or ignoring) the data that is not.
  3. Data can also often be opaque to anyone outside the organization that originally collected it. In order to orient myself to the data, I had to find what I thought were logical places to verify it against an external reference. In the “TT” column (which I suspected stood for “total” based on its position within the document), I noticed that Chicago had a reported population of 2,695,598. I verified this against Google and Wolfram Alpha, both of which matched this figure. This indicated to me that my interpretation of the data was correct. I then culled the data set down further, removing columns that had no clear purpose for my visualization.
  4. Then I had to determine which of the race identifier columns comprised the reported totals. This became really important as a validation step, since my original reduced data set did not actually reproduce the proper totals relative to the city-wide data (a sketch of this check appears after this list).
  5. I then went down my first wrong avenue: I tried turning the data into a JSON format. I found this attribute-based format easy to understand organizationally, and I used an XLSX-to-JSON parser called “Mr. Data Converter”, located at https://shancarter.github.io/mr-data-converter/. However, this path was a dead end, since I had zero success extracting the data meaningfully from the JSON to create charts. I spent a lot of time trying to selectively extract data with d3 methods, but always ran into problems with incorrect references. After a few days of stress, I decided that the appeal of the JSON format had worn off and returned to CSV.
  6. Turning my data into a CSV format was quick and easy, but I again ran into the problem of how to handle one CSV file with many columns and then selectively choose data from within it to render in d3. This led me to adopt a new approach to data reduction.
  7. I then wrote a script in Java (a language with which I have much more familiarity and success) that allowed me to take one data file and split it up, per community area, into distinct CSV files that I knew would work with my existing d3 graphics (a sketch of this splitting step appears after this list). This process was very arduous, but a necessary step along the way. Once I had written the script, I was able to focus on building the application to work with the folder hierarchy that I had established on my web server.
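To make the validation in step 4 concrete, here is a small sketch of the kind of check involved. Aside from “TT”, the column names and the area-name column are hypothetical placeholders, and the sketch assumes d3 v5’s promise-based d3.csv:

```js
// Hypothetical check: do the candidate race columns sum to the reported total ("TT")?
const raceColumns = ["WHITE", "BLACK", "ASIAN", "HISP", "OTHER"]; // hypothetical column names

function rowMatchesTotal(row) {
  const sum = raceColumns.reduce((acc, col) => acc + (+row[col] || 0), 0);
  return sum === +row.TT; // true when the columns reproduce the reported total
}

// Example: validate the city-wide row, whose TT should be 2,695,598.
d3.csv("census.csv").then(rows => {
  const city = rows.find(r => r.GEOG === "Chicago"); // hypothetical area-name column
  console.log("Columns reproduce the reported total:", rowMatchesTotal(city));
});
```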
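And here is a rough sketch of the splitting idea from step 7. My original script was written in Java; the version below expresses the same idea in Node.js so that all of the examples here stay in JavaScript, and the GEOG column name is again a hypothetical placeholder:

```js
// Split one master CSV into a separate CSV file per community area.
// Sketch only: assumes a simple CSV with no quoted commas.
const fs = require("fs");

const lines = fs.readFileSync("census.csv", "utf8").trim().split("\n");
const header = lines[0];
const areaIndex = header.split(",").indexOf("GEOG"); // hypothetical area-name column

const byArea = {};
for (const line of lines.slice(1)) {
  const area = line.split(",")[areaIndex];
  (byArea[area] = byArea[area] || []).push(line);
}

fs.mkdirSync("data", { recursive: true });
for (const [area, rows] of Object.entries(byArea)) {
  const fileName = "data/" + area.replace(/\W+/g, "_") + ".csv";
  fs.writeFileSync(fileName, [header, ...rows].join("\n"));
}
```

The application can then request an area’s file directly, for example d3.csv("data/Lincoln_Park.csv"), with no client-side filtering required.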