Extracting City Traffic Events from Social Streams
Abstract
Cities are composed of complex systems with physical, cyber, and social components. Current work on extracting and understanding city events relies mainly on technology-enabled infrastructure to observe and record events. In this work, we propose an approach that leverages citizen observations of various city systems and services, such as traffic, public transport, water supply, weather, sewage, and public safety, as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel process for creating the training data used to train sequence labeling models. This automatic process utilizes instance-level domain knowledge (e.g., locations in a city, possible event terms). We compare this automated annotation process to a state-of-the-art tool that needs manually created training data and show that it has comparable performance in annotation tasks. An aggregation algorithm is then presented for event extraction from annotated text. We carry out a comprehensive evaluation of the event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over four months from the San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social streams for extracting city events.
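To make the sequence labeling formulation concrete, the following is a minimal sketch of how instance-level domain knowledge could be turned into token-level training labels. It is an illustration under stated assumptions, not the authors' implementation: the BIO tag scheme, the greedy longest-match strategy, the tiny `LOCATIONS` and `EVENT_TERMS` dictionaries, and the function names `bio_annotate` and `extract_spans` are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Hypothetical instance-level knowledge. These tiny dictionaries are
# illustrative only and are not taken from the paper.
LOCATIONS = {("bay", "bridge"), ("market", "street"), ("i-880",)}
EVENT_TERMS = {("accident",), ("road", "closure"), ("traffic", "jam")}


def bio_annotate(tokens: List[str]) -> List[str]:
    """Assign B-/I-/O labels by greedy longest dictionary match (assumed scheme)."""
    lowered = [t.lower() for t in tokens]
    labels = ["O"] * len(tokens)
    lexicon = [(phrase, "LOC") for phrase in LOCATIONS]
    lexicon += [(phrase, "EVENT") for phrase in EVENT_TERMS]
    # Try longer phrases first so multi-word entries win over single words.
    lexicon.sort(key=lambda item: len(item[0]), reverse=True)
    i = 0
    while i < len(tokens):
        for phrase, tag in lexicon:
            n = len(phrase)
            if tuple(lowered[i:i + n]) == phrase:
                labels[i] = "B-" + tag
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + tag
                i += n
                break
        else:
            # No dictionary entry starts at this token; leave it as "O".
            i += 1
    return labels


def extract_spans(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Collect (tag, phrase) pairs from a BIO-labeled token sequence."""
    spans: List[Tuple[str, str]] = []
    tag, words = None, []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            if words:
                spans.append((tag, " ".join(words)))
            tag, words = label[2:], [token]
        elif label.startswith("I-") and words:
            words.append(token)
        else:
            if words:
                spans.append((tag, " ".join(words)))
            tag, words = None, []
    if words:
        spans.append((tag, " ".join(words)))
    return spans


tweet = "Accident on Bay Bridge causing traffic jam".split()
tags = bio_annotate(tweet)
print(list(zip(tweet, tags)))
print(extract_spans(tweet, tags))
```

Labels produced this way could serve as automatically created training data for a sequence labeling model, and the extracted event and location spans are the kind of annotated units that an aggregation step would then group into city events. How the paper actually builds its dictionaries and aggregates annotations is not specified in this abstract.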