Monday, May 23, 2016

Big Data And The City

The 2015 Big Data Expo
Guiyang, China
Reuters Staff/Reuters
Hello Everyone:

Time to start the blogosphere week. Today we start with the relationship between data and cities.  Like any relationship, it is a complex one.  In his recent CityLab article, "The Complex Relationship Between Data and Cities," Richard Florida discusses the latest advances and the challenges ahead.  Mr. Florida writes, "There's been no shortage of hype about the relationship between cities and data, especially so-called big data."  For a majority of tech companies, cities, and an increasing number of urbanists, data offers the opportunity to tackle a myriad of urban issues through predictive policing, improved traffic flow, and greater energy efficiency.  Data has an even greater potential role to play in helping policymakers and researchers "...understand how cities and neighborhoods grow and evolve-but only if done right."

Quantifying the Livable City
The legitimately exciting use of new data

Social media has become a rich source for researchers looking to mine data in order to develop a new understanding of cities and urban change.  Mr. Florida cites the work of sociologists Robert Sampson and Jacquelyn Hwang, who used Street View images to study the role of race in gentrification and neighborhood transformation.  A similar study from the U.K. Spatial Economics Research Centre made use of geo-tagged pictures on Flickr to assess levels of urbanity in London and Berlin.  Mobility data mined from the ride-hailing apps Uber and Lyft have been used in several studies, chronicled by Mr. Florida's colleague Laura Bliss (May 2016).  Former CityLab writer Eric Jaffe has gleaned information on housing price trends across neighborhoods, cities, and metropolitan areas from Zillow and Trulia.

Yelp review data has been especially helpful to researchers studying gentrification and asymmetrical urban consumption trends.  Richard Florida reports, "One study used Yelp reviews to shed light on the connection between gentrification and race in Brooklyn.  Another NBER study employed Yelp data to find out how ethnic and racial segregation affects consumption levels in New York City."

The social media site Twitter has yielded data used to graph regional preferences and behavior patterns.  Mr. Florida cites a study from the Oxford Internet Institute that "...mapped the flow of online content and ideas across cultures."  Even blogs have mined such data; to wit, the mapping blog Floating Sheep made ample use of data from Twitter, Google, and Wikipedia to chart everything from beer and pizza to marijuana, bowling, and (strangely) strip clubs.  Mr. Florida's own team has used information from MySpace (remember that site?) to track the main centers for popular music genres throughout the United States and around the world.

Richard Florida reports, "More recently, a team of Italian researchers combined data from Foursquare and OpenStreetMap..., to test Jane Jacobs' theories of urban vitality and diversity in six Italian cities."  Their research confirmed most of her main insights about the necessity of short blocks, diverse land uses, walkability, dense concentrations of workers, and urban public space.

Further, satellite data holds the potential for gathering systematic and comparable information across global cities, where little, if any, is currently available.  Mr. Florida notes that several studies, including his own, "...have used satellite data to get at the economic output of cities and metros around the world."  Additionally, a 2012 study in the American Economic Review used light emissions captured by satellites as stand-ins for the spatial organization and economic size of cities around the world.  Mr. Florida cautions, "While this data is subject to considerable limits, it provides at least rough estimates of the overall size and economic scale of cities across the world."

"Big Data and Informatics"
Accurately characterizing "big data"

The words "big data" sound rather ominous, but let us be accurate about what exactly "big data" is.  To begin, "Not all data from new sources qualifies as 'big data,' which-as its name implies-refers to truly massive amounts of information."  Max Nathan of the London School of Economics groups real big data into three important categories: internet data, such as social media sites and other commercial data; data collected by governments, such as municipalities and the Census Bureau; and related data.  He goes on to cite the example of a 2014 study by NESTA, which incorporated data from the firm Growth Intelligence to "map patterns of information and technology in the U.K."  Finally, the American Journal of Sociology will be publishing a study that used information from millions of 3-1-1 service calls to study neighborhood conflicts among residents in culturally diverse communities.

"Smart cities, big data"
According to Mr. Nathan, "big data can be thought of in terms of the 'four Vs': variety, volume (millions or billions of observations), velocity (real-time data), and veracity (raw data)."  Actual big data frequently necessitates analytic methods like machine learning to take in and derive meaning from large caches of information.  For example, the ongoing Livehoods Project in the School of Computer Science at Carnegie Mellon University uses machine learning to examine 18 million check-ins on Foursquare to discover the structure and nature of eight different cities.  When used properly, big and new data analytics can aid researchers in uncovering urban structures and patterns that traditional data and methodology may not reveal on their own.
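For the curious, here is a toy sketch of the general idea behind mining check-ins for urban structure.  It is not the Carnegie Mellon team's method; as a stand-in, it simply links venues whose visitors overlap (cosine similarity above a threshold) and groups the linked venues together.  The check-in data, the venue and user names, the function names, and the 0.6 threshold are all made up for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Made-up (user, venue) check-ins; purely illustrative, not real data.
CHECKINS = [
    ("ana", "cafe_a"), ("ana", "bar_a"), ("ben", "cafe_a"),
    ("ben", "bar_a"), ("ben", "cafe_a"),
    ("cy", "gym_b"), ("cy", "deli_b"), ("dee", "gym_b"),
    ("dee", "deli_b"), ("ana", "deli_b"),
]


def venue_vectors(checkins):
    """Count how many times each user checked in at each venue."""
    vectors = defaultdict(lambda: defaultdict(int))
    for user, venue in checkins:
        vectors[venue][user] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two {user: count} dictionaries."""
    dot = sum(a[user] * b[user] for user in a if user in b)
    norm_a = sqrt(sum(count * count for count in a.values()))
    norm_b = sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_venues(checkins, threshold=0.6):
    """Group venues that are linked by similar sets of visitors."""
    vectors = venue_vectors(checkins)
    parent = {venue: venue for venue in vectors}

    def find(venue):
        # Follow parent links to the representative of the group.
        while parent[venue] != venue:
            parent[venue] = parent[parent[venue]]
            venue = parent[venue]
        return venue

    # Merge the groups of any two venues that are similar enough.
    for a, b in combinations(sorted(vectors), 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for venue in vectors:
        groups[find(venue)].add(venue)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    for group in group_venues(CHECKINS):
        print(group)
```

On this toy data the cafe and bar end up in one group and the gym and deli in another, because each pair shares most of its visitors.  Real projects work with millions of check-ins and typically add geographic distance and more sophisticated clustering.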

IES Faculty Introduction to Big Data
Richard Florida cites a recent NBER study conducted by Harvard and MIT researchers, which used computer vision to better comprehend the geographic differences in income and housing prices.  Mr. Florida writes, "Although the paper covers plenty of ground, perhaps the most interesting section involved the use of Google Street View to predict income levels and housing prices in Boston and New York between 2007 and 2014."  The paper connected 12,200 images to income and home values from the 2006-2011 American Community Survey.  "It then examines the extent to which positive physical attributes shown in these images...attract more affluent residents and predict incomes and housing prices."

"The promise of big data for cities"
The study found that

images can predict income at the block level far better than race or education does.

The study observed that the main purpose of big data is to highlight the role of smaller geographical locations in our urban economies, which are more difficult to capture through traditional Census information.  The authors concluded that

...big data offers some hope that Google Street Views and similar predictors will enable us to better understand patterns of wealth and poverty worldwide.

Problems and limitations

Big Data
Big data does offer the promise of advancing our knowledge about cities.  However, a growing number of scholars urge caution.  A 2014 workshop that brought together a group of leading urban social scientists and data users identified six key subjects involving big data, spanning data quality and compatibility, use of new analytic methods, and privacy and security matters.  The workshop summary noted,

Developing theory to go with new methods and data is critical, and often sidelined.  Engineering and control theory (or big data 'without theory') work well when there is a measurable outcome, a simple policy to correct for it, and fast enough reaction time that the correction can be implemented while it is still appropriate.  In cities, this is the process used to optimize service delivery.  But this theory does not work well for complex systems with long-term horizons, like most social systems.

In essence, big data and new analytics are only as good as the questions we ask of them and the theories we generate to make sense of the answers.  No matter how powerful they may be, new data sources and analytic strategies are no substitute for human intuition and reasoning about the urban environment.  The power of these tools lies in how they are used to test and further our insights into innovative urban theory.  It is a little scary to think that a random post on a social media site can be used by researchers to study a specific urban environment right down to the minute detail.  However, if we want to make our cities better and more efficient places to work and live, these random posts can help the cause.
