Extracting City Traffic Events from Social Streams

ACM Transactions on Intelligent Systems and Technology - Volume 6, Issue 4 - Pages 1-27 - 2015
Pramod Anantharam¹, Payam Barnaghi², Krishnaprasad Thirunarayan¹, Amit Sheth¹
¹ Wright State University, OH
² University of Surrey, Guildford, UK

Abstract

Cities are composed of complex systems with physical, cyber, and social components. Existing work on extracting and understanding city events relies mainly on technology-enabled infrastructure to observe and record events. In this work, we propose an approach that leverages citizen observations of various city systems and services, such as traffic, public transport, water supply, weather, sewage, and public safety, as a source of city events. We investigate the feasibility of using such textual streams to extract city events from annotated text. We formalize the annotation of social streams such as microblogs as a sequence labeling problem and present a novel process for creating the training data for sequence labeling models. This automatic process utilizes instance-level domain knowledge (e.g., locations in a city, possible event terms). We compare the automated annotation process to a state-of-the-art tool that requires manually created training data and show that it achieves comparable performance on annotation tasks. We then present an aggregation algorithm for extracting events from the annotated text. We carry out a comprehensive evaluation of event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over 4 months from the San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social streams for extracting city events.
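To make the sequence labeling formulation concrete, the following is a minimal sketch of how instance-level domain knowledge could be used to label tweet tokens automatically. It is an illustration under stated assumptions, not the authors' implementation: the BIO tagging scheme, the `bio_tag` function, the greedy longest-match strategy, and the tiny `LOCATION` and `EVENT` phrase lists are all hypothetical stand-ins for the city-specific knowledge the abstract mentions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical gazetteers standing in for instance-level domain knowledge.
# Each phrase is a tuple of lowercase tokens.
GAZETTEERS: Dict[str, Set[Tuple[str, ...]]] = {
    "LOCATION": {("bay", "bridge"), ("market", "street")},
    "EVENT": {("accident",), ("traffic", "jam"), ("road", "closure")},
}


def bio_tag(
    tokens: List[str],
    gazetteers: Dict[str, Set[Tuple[str, ...]]] = GAZETTEERS,
) -> List[Tuple[str, str]]:
    """Label tokens with B-/I-/O tags by greedy longest gazetteer match."""
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        best_len, best_label = 0, None
        # Find the longest known phrase that starts at position i.
        for label, phrases in gazetteers.items():
            for phrase in phrases:
                n = len(phrase)
                if n > best_len and tuple(lowered[i:i + n]) == phrase:
                    best_len, best_label = n, label
        if best_label is None:
            i += 1  # No phrase starts here; the token stays "O".
            continue
        tags[i] = "B-" + best_label
        for j in range(i + 1, i + best_len):
            tags[j] = "I-" + best_label
        i += best_len
    return list(zip(tokens, tags))


if __name__ == "__main__":
    tweet = "Accident on Bay Bridge causing traffic jam".split()
    for token, tag in bio_tag(tweet):
        print(f"{token}\t{tag}")
```

Token and tag pairs produced this way could serve as training data for a sequence labeling model, avoiding manual annotation; how the paper actually builds and uses its training data is described in the full text rather than in this abstract.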
