Social Media Analysis for Transit Assessment

Project ID: CTEDD 018-08

Author(s): Won Hwa Kim, University of Texas at Arlington

Co-Author(s): Kate (Kyung) Hyun, University of Texas at Arlington & Ge (Gordon) Zhang, Georgia Institute of Technology

CTEDD Funding Year: 2018 General RFP

Project Status: Complete

UTC Funding: $99,823.17

End Date: December 1, 2019


Personal opinions, attitudes, and beliefs significantly influence decision-making for public transportation services. Stakeholders and transportation planners have therefore been collecting information on public transit service and performance to assess service quality and management strategies.

In this context, social network services (SNS) can be regarded as a large but unorganized database in which individuals exchange event-based attitudes and sentiments (i.e., experiences from individual transportation activities). This information often triggers a chain effect that encourages others to react to the message (e.g., a single post on Twitter is visible to those connected to the poster and propagates recursively beyond them). While these posts reflect users’ comprehensive experience of transportation service quality and performance, it is extremely difficult to derive meaningful information from them manually because the data are large, arbitrary, and complex.

This study will employ recent advances in artificial intelligence (AI) to analyze large and complex data. Using big data collected from social media such as Twitter, we propose to

  1. capture transit riders’ perceptions and sentiment in response to changes in the transit system across various temporal and spatial spans;
  2. evaluate transit service, including its efficiency, equity, and reliability; and
  3. implement a web-based interactive platform with real-time data streaming and a GIS map system.

To achieve the objectives above, Twitter data containing transit-related text will be collected from densely populated cities with operating transit systems (e.g., Los Angeles, New York, Atlanta, and Dallas-Fort Worth) and analyzed using state-of-the-art machine learning (ML) techniques. We expect the proposed study to produce feedback for policy makers who explore communication and information technology to create strategies and who employ big data to improve system efficiency and transit ridership.