By F. Philip Barash
Last week, the Chicago Architecture Foundation opened “City of Big Data,” an exhibition about the relationship between information and design. Clustered around CAF’s city model, the exhibition is dense, bright, flickering—as hard to pin down as its titular subject.
Big data has been the buzz of the tech community for years. Lately, it has been at the center of legal and ethical debates about data collection by private-sector companies like Google and by government agencies like the NSA. But what, exactly, does big data tell us about architecture? Does it give us insight into the urban condition? Does it make places more livable, more equitable, more fun? Does it limn the contours of future cities? Maybe. Maybe not.
But if anyone knows, it’s John Tolva. Tolva is equal parts technologist and humanist. He was the technology czar for the City of Chicago, where he advocated for open data and transparency. At IBM, he helped governments implement civic innovations. These days, as head of PositivEnergy Practice, an urban design offshoot of Adrian Smith + Gordon Gill Architecture, Tolva is leading the design of smart cities that consume less energy than they produce. And Tolva served as the lead advisor on the CAF “City of Big Data” exhibition.
I sat down with Tolva in his office on LaSalle Street to find out what the big deal is with big data. What follows is an edited transcript of his comments.
Ecosystem of data
Like anything, “big data” is a term whose use is in inverse proportion to its usefulness. There has always been lots of data, and cities have always generated lots of it.
The reason it is on the radar now is twofold. One is that we actually have the tools to process it: machine learning and statistical analytics help us make sense of data. Big data is no more useful than a single piece if you don’t know how to use it. Sure, you could download a spreadsheet of Chicago crime data if you wanted, but that’s 4,000,000 rows. Now, you can hook into it and filter it computationally, by location, by time.
The other reason is that governments, and other actors in cities, are opening data up, so it’s available. The only thing governments have more of than they used to is data. They use it internally for operational efficiency. But the really exciting thing is what’s done with government data externally. The whole ecosystem of apps is built essentially on top of a platform of data that cities give off. And I should note that it’s not just city government. Building owners in the city of Chicago must now publish their data. Food trucks must publish their locations. And everybody has access to that.
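To make the earlier point about filtering "by location, by time" concrete, here is a minimal Python sketch. It is only an illustration: the record layout, the field names, the bounding-box coordinates and the three sample rows are all invented, and a real city data set will have its own columns and far more rows.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Incident:
    # Hypothetical record layout, not the schema of any real data set.
    category: str
    latitude: float
    longitude: float
    occurred_at: datetime


def filter_incidents(incidents, south, north, west, east, start, end):
    """Keep incidents inside a bounding box and a half-open time window [start, end)."""
    return [
        row
        for row in incidents
        if south <= row.latitude <= north
        and west <= row.longitude <= east
        and start <= row.occurred_at < end
    ]


# Made-up sample rows, for illustration only.
SAMPLE = [
    Incident("theft", 41.8827, -87.6320, datetime(2014, 5, 1, 14, 30)),
    Incident("burglary", 41.7943, -87.5907, datetime(2014, 5, 1, 2, 15)),
    Incident("theft", 41.8790, -87.6290, datetime(2014, 6, 3, 9, 0)),
]

# An arbitrary box and a one-month window; both are assumptions of this sketch.
selected = filter_incidents(
    SAMPLE,
    south=41.87, north=41.89, west=-87.64, east=-87.62,
    start=datetime(2014, 5, 1), end=datetime(2014, 6, 1),
)
print(len(selected), "of", len(SAMPLE), "rows match")
```

The same two conditions, a box on the map and a window on the calendar, apply whether the input is three rows or four million; only the tooling behind the filter changes.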
When did data become so big?
The financial sector is where big data lived before it became a thing. If you think about financial data, finding patterns, shaving picoseconds off a trade, and looking at how best to do that, is where it really matters. In finance, you’ve got a critical mass of quants who are starting to see that it’s applicable in other fields.
It was in other disciplines too: in astronomy, there is a lot of big data. But the “big” isn’t just volume—it’s also the speed. You have a lot of little things that are generating little data very, very quickly. This Fitbit on my wrist, and on millions of other people’s bodies, is generating little data. Dealing with that is a big data problem, but not in the sense of 4,000,000 rows of crime data at once. The little pieces aren’t actually all that important. How they roll up is important.
How is big data used in urban design?
Big data applies to many fields. But it’s interesting to the city, and design, because we now have the ability to look at the unintended consequences, the cause and effect, of interventions in the built environment going back many years.
For instance: rerouting a street or taking out two lanes to put in a bike path. You can look at the economic data in that region. You can look more macroscopically at public health. (It hasn’t been done yet, but it would be fascinating to look at public health versus Divvy ridership.) Then you can start to stitch these sets together to model the city. Not a 3D model, which is what architects and engineers are used to, but a data model that says something.
You have to be careful: everybody knows that correlation is not causation. And that’s why big data isn’t the only input. We’re not looking at creating SimCity—we’re looking at informing design decisions on data.
You’ve seen data used in architecture in the crazy shapes, and that’s cool. But that’s structural and aesthetic. There’s an opportunity for that industry to take it to the next level. We’ve always been informed by calculations like the wind shear at the twentieth and fortieth and sixtieth floors. Now, it’s treating a building—or it could be a bridge or the Bloomingdale Trail—not as an aesthetic creation, but as an actor in the urban environment. What does it do to pedestrianism? How might it change the socioeconomics? What is its energy usage?
Buildings are not self-contained units. They change things around them. What we’re trying to do is use data to understand how that happens.
So big data isn’t replacing traditional tools of urban design?
A doctor is not replaced by an echocardiogram machine: he uses it to make the diagnosis. I think of big data as being the vital signs of the city.
It is both the infrastructure and—what we learned in the past couple of years—it’s how people use the city. The first wave of smart cities was led by big IT companies, Siemens and IBM, that treated cities as the sum of their instrumented infrastructure. But we know that cities are really the sum of the ways that human beings use the city. It’s not about asking what is the traction coefficient on an icy bridge—although that’s useful—it’s how that bridge is being used.
Social media is one set of big data. Twitter, geolocated to Chicago, tells you how the city is being used. The fastest way to find out what’s going on on the CTA is to go to Twitter because what you see is a city full of human sensors. The thing about cities is, that’s always happened; cities always had a density of communication and feedback loops. But the change is that those feedback loops are quicker now.
Is another change that we are now storing that information? Does a tweet ever disappear?
Up until last year, it was very difficult to go back to early days of Twitter and find tweets. This speaks of the need for tools to make sense of this data. Otherwise, you’re constantly living in the present. The real payoff in big data, as regards design, is looking historically at trends and patterns. We can know that if we situate a building like this, it can have adverse consequences on the use of public space around it. That’s half of it: using data as input to design. But the other piece of this is cities themselves—the infrastructure and the public way—are becoming part of the Internet.
This has been your career-long project: cities as platform for computing.
One version of that is a software platform that’s publishing data. But then there’s a physical platform: you’ve got all the Divvy bikes and those are network endpoints. When they hit the Divvy dock, they are on the Internet. That’s thousands of sensors, showing the city where people want to bike. The ride data helps inform how the city shapes its streets, where it decides to put bike lanes and the docks themselves.
Other cities have used public bikes as actual physical sensor platforms. They put environmental sensors in the wheel well. Designing interventions to combat the urban heat island effect, for example, requires knowing temperatures, and other things, at sub-block granularity. You’re never going to have enough stationary sensors to get that. But if you put them on bikes…
The Chicago convergence.
As the world develops smart grids, we as designers need to be thinking about designing for an Internet-enabled city. If you could visualize the public way, and all the wi-fi clouds, and surveillance cameras, and Internet-enabled trash cans, and Divvy bikes, it would look like the Wild West. It would look like those nineteenth-century photographs of intersections in Chicago.
I love those. Eighteen trolley lines competing on the same corner.
What mitigates the eighteen trolley lines in this case is not consolidating them into one, but having them all work on the same standard. “Interoperability” is the word we use. The bus shelters, most of them Internet-enabled, should be talking to the Divvy docks, machine to machine. This is easy but it requires computer code, development. The more standardized we can be about how physical things report their status, the more interoperable the city will become. City government plays a role in that in the same way city government makes sure that all the street signage is the same. Can you imagine if every sign was in a different font and different color? That’s sort of what’s going on here.
Chicago is still the place for the built environment to be designed. The firms here are focusing on data as another tool in the next evolution of urban design. There is this convergence of Internet startups and a history of smart urbanism, and that’s why I think Chicago is the locus of this new approach to urban design.
Design writ large.
When I was at the city, I was once accused of practicing democracy by spreadsheet. The implication was: how is having data any better than not having anything at all? Yes, we have this data, but most people can’t decode it. So what’s needed are cross-disciplinary translators. You don’t go to school for that, but for designers—I mean designers writ large—that’s the way they think.
Human beings only have a few tools, cognitively, to deal with information. We cannot process 4,000,000 rows in our heads. We have, over the course of time, invented slide rules and calculators and databases. But we’re getting to the point where the consequences of data—what it contains, what it tells you—are more relevant to more people. Something in that big data is relevant to your block, your life. And the only real tool we have to make sense of that is design, smart design. A database expert isn’t going to solve that “last mile” of interpretation. But a designer will. Design encompasses visualization and it encompasses storytelling, which is kind of time-based design. And it’s synthesizing multiple, disparate things. The best designers can basically speak more than one language.
Information designers, you mean, not architectural or urban designers?
The data is available but what is lacking—and this is an opportunity for designers—is that translational layer, middleware. Conceptually, it sits on top of the data but makes it actionable by non-statisticians. Mostly, middleware has come in the form of apps.
Think of all the transit apps: that data exists online and you can read it as a series of times and locations. But no one in their right mind would do that. So think of an app as that translational layer. What doesn’t exist is a higher order of tool. Let’s stick with transit: really, people don’t care about who runs the buses or who runs Divvys. And yet, I’ve got a bus app, a train app, a Divvy app and even a pedestrian app. A higher order tool would take all of those and just say: I am here and I want to get to there. It would know the weather, it would know my calendar, it would also know what I’m wearing, and basically plot a holistic course.
What if you’re trying to figure out where to take your community? You’re going to need more than transit data. You need to be able to look at causation between policies and outcomes. There is no good tool that would allow a pastor on the South Side to say: I want to see foreclosures in this ward and around my church, versus foreclosures around my friend who runs a church in Brooklyn. All that data is there, but there’s no middleware, no visualization service that points at two or three data sets and normalizes them. That’s a real opportunity and that’s what designers are really good at.
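One way to picture what "normalizing" two data sets could mean in the foreclosure comparison above is to turn raw counts into rates, so that a large area and a small one can be read side by side. The Python sketch below assumes that interpretation; it is not a description of any existing tool. The area names, counts and household totals are invented placeholders, not real figures for Chicago or Brooklyn.

```python
def rate_per_thousand(count, households):
    """Convert a raw count into a rate per 1,000 households."""
    if households <= 0:
        raise ValueError("households must be positive")
    return 1000 * count / households


def compare_areas(areas):
    """Map each area name to its rate, rounded to one decimal place."""
    return {
        name: round(rate_per_thousand(count, households), 1)
        for name, (count, households) in areas.items()
    }


# Placeholder inputs: (foreclosure count, number of households). All made up.
AREAS = {
    "area_a": (120, 8000),
    "area_b": (90, 4500),
}

rates = compare_areas(AREAS)
for name, rate in rates.items():
    print(f"{name}: {rate} foreclosures per 1,000 households")
```

In this invented example area_a has more foreclosures in absolute terms, yet area_b has the higher rate. That reversal is the kind of thing a raw spreadsheet hides and a translational layer could surface.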
Designing for the unknown.
The interesting thing about data for a designer is that data for one purpose can be used for something that was never intended. The city of Chicago, for instance, is proactively baiting for rats. It used to be that you’d call 311 and say, “I just saw a rat in my alley,” and then the city would go and put the trap down, and maybe get it. Now, they’re looking at data they already have, like reports of high weeds, a vacant lot, restaurant violations for dumping food. The probability is high that there is a rat there. But the data was never intended for that, just like a Divvy bike wasn’t intended to be an environmental sensor.
Designers of technology need to be able to let go and design urban technology to be used in ways that weren’t intended. At a certain low-level functionality, yes: this button must work. That’s fine, but at a higher level, be okay with opening up, using open source, involving the community.
Despite all the data, there may still be opportunities for surprise and serendipity in cities?
There are apps, actually, that attempt to reinject serendipity by altering your route. But my kids will never experience the city unmediated by some layer of data. It’s unlikely that they’ll get lost. It’s even less likely that they’ll want to get lost. The downside of the map that knows where you are is that you never just set off, and find stuff.
A little bit of friction in the gears is what differentiates a city street from a mall. That’s why people in cities want to live there, and not in artificial constructs. You know the difference between walking down Main Street in the Magic Kingdom and walking down LaSalle Street, even though they look very similar. Old cities that have legacy infrastructure, and legacy data, are richer in possibilities than a new city that’s not been used. The beauty of the city, in many ways, is that it’s shaped by people. We don’t want that to go away.
Pictures at an exhibition.
The Chicago Architecture Foundation exhibition is about two things. One is: how is the physical environment shaped by the infrastructure of data? What are the physical effects? Literally, the tops of buildings with wooden water tanks are now cell phone repeater platforms. Old buildings—publishing buildings, ironically—are reused as data centers. The Internet is a physical thing. I heard a friend of mine once say “the Internet is real and it is on Cermak Road.” That’s one half. The other half is: what are these vital signs that are being given off and what do they tell us about the design of the built environment?
The exhibit is split between those two pieces that are inverses of one another: how does data inform the design of the future of cities, and how is the physical data infrastructure—fiber, cell towers—changing the shape of the city.
The built environment is more than just physical things. The physical things in it are impacted by information. The exhibition is trying to have a conversation at different resolutions—there is the city, there’s your block, there’s you and your data trail. I guess you could say that about any exhibition at CAF, but I hope it changes the way people look at their city.
Any highlights?
I’m very proud of the maps that Jane Addams and the Hull House put together. They look remarkably like data visualizations from today. But they were hand-drawn and colored in house by house—Italians live here, Irish live here, Jewish people live here—so that they could better design social services. Those maps still exist.
The projection on the city model—difficult though that is because you have to account for one building occluding another—is magical. The model becomes a bumpy screen. Now, anyone can give CAF geodata and it can be projected. At the opening, there was a guy who said he had a projection of the spread of the Chicago fire, how it jumped back and forth across the river. That would be awesome on the model. We’re thinking of having a hackathon—or more like a mapathon where people just bring their data.
But people were really drawn to a gigantic, static, printed map. Fixated on it. It’s big data, but all everybody wanted was to find their house. That’s a human instinct. Think about what design does for big data. Something in that map is relevant to everybody. It’s up to the designer to make that intelligible. That’s what design does for big data: it orients it.
Are you optimistic about the way that big data is changing the urban—the human—condition?
Only to the degree that it improves our lives. Like any technology, it’s morally neutral—it’s about how it gets put to use. But I wouldn’t say that this technology is merely a tool. It may be morally neutral, but it can impact policy. I think that performance and accountability that comes with open data is good in and of itself.
Big data can be used to obfuscate. Big data can also tell you things that you don’t want to know. It’s like genealogy. Be careful how far you dig, because you probably will find a bastard in there somewhere.
Literal or figurative bastard?
In both senses.