Detecting Traffic Event from Social Media Texts with Concatenated Word Embeddings

Zeynep OZER, Ilyas OZER

Abstract


In recent years, efforts to detect traffic events from social media platforms have accelerated because of their extensive coverage and low cost. Studies to date have converted tweets into numerical vectors using bag-of-words representations. However, bag-of-words representations ignore word order and suffer from several problems, such as sparsity. More recent studies have used supervised deep learning architectures with generic word embeddings obtained from sources such as Wikipedia. Word embeddings trained on such formally written corpora represent the general meanings of words well, but they are limited both in coping with the noise in user-generated text and in representing domain-specific meanings of words. To overcome these problems, this study creates a domain-specific word embedding for the traffic domain from approximately 1.5 million tweets and concatenates it with a generic word embedding. In addition, two datasets were created, with 2 and 8 classes, respectively. The concatenated word embedding was then tested on these datasets using convolutional neural network (CNN) and long short-term memory (LSTM) architectures. Experimental results show that the proposed approach provides a significant improvement over state-of-the-art methods on the generated datasets.
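As a rough illustration of what concatenating a generic and a domain-specific word embedding can look like, the sketch below joins two toy lookup tables token by token. The vocabularies, vector values, dimensions, and the zero-vector fallback for out-of-vocabulary tokens are all assumptions made for this example; the abstract does not specify how the authors handle these details.

```python
# Toy lookup tables with made-up values; real embeddings would be trained
# on a generic corpus and on traffic-related tweets, respectively.
GENERIC = {"road": [0.1, 0.2, 0.3], "jam": [0.4, 0.5, 0.6]}
DOMAIN = {"jam": [0.9, 0.8], "accdnt": [0.7, 0.6]}
GENERIC_DIM, DOMAIN_DIM = 3, 2  # hypothetical dimensions


def lookup(table, token, dim):
    """Return the token's vector, or a zero vector if it is out of vocabulary.

    The zero-vector fallback is an assumption of this sketch.
    """
    return table.get(token, [0.0] * dim)


def concat_embedding(token):
    """Concatenate the generic and domain-specific vectors of one token."""
    return lookup(GENERIC, token, GENERIC_DIM) + lookup(DOMAIN, token, DOMAIN_DIM)


def embed_tweet(tokens):
    """Map a tokenized tweet to a list of concatenated vectors, one per token."""
    return [concat_embedding(t) for t in tokens]


matrix = embed_tweet(["road", "jam", "accdnt"])
print(len(matrix), len(matrix[0]))  # 3 tokens, each with 3 + 2 = 5 dimensions
```

In this sketch, a noisy token such as "accdnt" that is missing from the generic table still receives a non-zero representation from the domain-specific half, while a token missing from the domain table keeps its generic half. The resulting per-token vectors are the kind of input a CNN or LSTM classifier would consume.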

Keywords: Traffic event detection, Domain-specific word embedding, Twitter, Deep learning




PROCEEDING INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND BUSINESS

Managed by: Lembaga Penelitian dan Pengabdian kepada Masyarakat (LPPM)

Publisher: Institut Informatika dan Bisnis Darmajaya
Address: Jl. Z.A. Pagar Alam No. 93, Gedong Meneng, Bandar Lampung, Lampung
Website: jurnal.darmajaya.ac.id

Email: ProceedingICITB@darmajaya.ac.id


 


IC-BITERA is licensed under a Creative Commons Attribution 4.0 International License.