New Yorker or Angeleno? Your tweets reveal which city you belong to

Tweeting from NYC? There’s a good chance you’re talking about art. LA? More likely healthcare.

Define ‘close’ and ‘same’

Our study tackled two problems: There’s no simple definition of “close together,” and it’s difficult to say whether two tweets are about the same topic. We combined several definitions of “close together,” ranging from people located in the same city to the distance in miles between their coordinates, using a common formula from spatial sciences.
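The article does not name the formula. One widely used choice for the distance between two coordinate pairs is the haversine great-circle distance, so the sketch below assumes that formula. The function name, the Earth-radius constant and the city-center coordinates are illustrative assumptions, not details from the study.

```python
import math

# Mean Earth radius in miles (an approximation; the Earth is not a perfect sphere).
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Approximate city-center coordinates, used only for illustration.
NYC = (40.7128, -74.0060)
LA = (34.0522, -118.2437)

nyc_to_la = haversine_miles(*NYC, *LA)
print(f"NYC to LA: {nyc_to_la:.0f} miles")
```

The same function works for two tweets a few blocks apart, which is the scale that matters when asking whether two people within one city are “close together.”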

It’s more difficult to determine whether two tweets are talking about similar things. Looking for common hashtags might suffice, but unfortunately many people do not use hashtags or use different hashtags when talking about the same thing. To overcome this problem, we used state-of-the-art natural language processing technology. Algorithms developed in this field read and interpret sentences in a manner similar to the way humans do, and they are able to deal with nuance.

We used this technology to group tweets into clusters of topics. We then studied whether tweets falling inside the same cluster were also from people who were close together based on their GPS-enabled tweets. This allowed us to determine, for example, that clusters containing art-related words and phrases tended to arise more often in New York than LA.
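To make the second step concrete, here is a minimal sketch of how one might check where each topic cluster’s tweets come from once every tweet already has a cluster label. It is not the study’s pipeline: the cluster labels, the function name and the six toy tweets are all hypothetical, and the clustering itself is assumed to have happened upstream.

```python
from collections import Counter, defaultdict

# Hypothetical toy input: one (topic_cluster, city) pair per tweet.
# Both the labels and the counts are invented for illustration only.
tweets = [
    ("art", "NYC"), ("art", "NYC"), ("art", "LA"),
    ("healthcare", "LA"), ("healthcare", "LA"), ("healthcare", "NYC"),
]


def city_share_by_cluster(labeled_tweets):
    """For each cluster, return the fraction of its tweets that come from each city."""
    counts = defaultdict(Counter)
    for cluster, city in labeled_tweets:
        counts[cluster][city] += 1
    return {
        cluster: {city: n / sum(by_city.values()) for city, n in by_city.items()}
        for cluster, by_city in counts.items()
    }


shares = city_share_by_cluster(tweets)
for cluster, by_city in shares.items():
    print(cluster, by_city)
```

Raw shares like these ignore differences in population and overall tweet volume between cities, so a fuller comparison would normalize the counts first, for instance per capita.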

Health and wealth versus art and representing

Even before we looked at who tweets about what, we found tweeting across New York City to be more evenly spread, while in LA, more tweeting happens in wealthier areas, including Calabasas (home to Kim Kardashian), Palos Verdes, West Hollywood and the coastal areas.

We also found that New Yorkers referred to themselves and their city far more often than Angelenos did. On a per capita basis, New Yorkers like to talk about art, while Angelenos like to talk about healthcare and hospitality.

LA generates more tweets than New York throughout the day, despite having a smaller population, but from 8 p.m. to 5 a.m. local time, the two have comparable numbers of tweets. Tweeting in New York City rises sharply from 8 p.m. to a peak at 9 p.m., whereas tweeting in LA rises steadily from 2 p.m. to a peak at 7 p.m.

Computational social science

Our methods are a case study in the growing field of computational social science, which aims to find insights in unique, often large, data sets using artificial intelligence models and algorithms. In contrast, traditional social science tends to rely on surveys and polls to quantify public perception about an issue. Though surveys have some distinct statistical advantages, they can be expensive and time-consuming to use for collecting quality data with good response rates.

For example, Gallup releases new survey data every few months and currently charges US$30,000 for academic licenses. Decades ago, researchers found that monetary incentives increase response rates significantly. Even today, online surveys are often accompanied by lottery-based promises of receiving an Amazon gift card. Researchers are working on combining the benefits of traditional and computational social science.

Zooming into our data, we uncovered some fascinating trends that we hope future research will explore. We found, for example, that on a per capita basis, as crime increases, so do tweets, at least at the level of ZIP codes. Why do high-crime areas tweet more? We don’t know yet, but the trend is consistent across both New York City and LA.
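A relationship like this can be checked by dividing each ZIP code’s tweet and crime counts by its population and then measuring how the two rates move together. The sketch below uses a Pearson correlation as one plausible measure; the article does not say which statistic was used, and the rows are made-up placeholders rather than data from the study.

```python
import math

# Made-up placeholder rows: (zip_code, population, tweet_count, crime_count).
zip_rows = [
    ("00001", 10000, 500, 50),
    ("00002", 20000, 1600, 140),
    ("00003", 5000, 150, 20),
    ("00004", 8000, 720, 64),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Convert raw counts to per capita rates before comparing ZIP codes.
tweets_per_capita = [tweets / pop for _, pop, tweets, _ in zip_rows]
crimes_per_capita = [crimes / pop for _, pop, _, crimes in zip_rows]

r = pearson(tweets_per_capita, crimes_per_capita)
print(f"Correlation between per capita tweets and crime: {r:.2f}")
```

A positive value only says the two rates rise together across ZIP codes. It does not explain why, which is the open question raised above.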

Tweeting, place and COVID-19

Studying tweeting behavior by location could also be useful for understanding disparate outcomes of large-scale events. For example, our Twitter analysis could help shed light on how the COVID-19 pandemic has affected people in different places.

New York City was hit hard by COVID-19 early on, showing that even major cities were affected in different ways by this terrible pandemic. New reporting is now showing that even within cities, socioeconomically disadvantaged communities were disproportionately burdened.

Recently, we released a Twitter data set covering 10 of the largest metropolitan areas in the United States to further study such disparities using computational social science. We are already using our methods across all of these cities to better understand how COVID-19 has affected certain groups, and the levels of expressed vaccine hesitancy among these groups.

Eventually, we hope to use our methods with a large set of international metropolises to study urban behavior.

Article by Mayank Kejriwal, Research Assistant Professor of Industrial & Systems Engineering, University of Southern California

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Story by The Conversation

An independent news and commentary website produced by academics and journalists.
