Authors: Luyu Wang*, University of Florida, Andrei P. Kirilenko, University of Florida
Topics: Tourism Geography
Keywords: Sentiment Analysis, Machine Learning, Lexicons, Social Media, TripAdvisor
Session Type: Virtual Paper
Start / End Time: 11:10 AM / 12:25 PM
Room: Virtual 28
As social media has become prevalent, billions of people worldwide can interact and share information. Accordingly, tourists increasingly express their experiences, feelings, and opinions about travel-related topics on social media platforms. In particular, TripAdvisor, the world’s leading travel website, has strong community engagement, with 859 million reviews in 2019 (statista.com), providing rich user-generated data for analyzing opinions and detecting emotions. Interest in analyzing user-generated content in tourism is growing, and automated sentiment analysis is widely used to detect and extract polarity from unstructured text. Sentiment analysis is well established in hospitality marketing and customer service, but tourists’ feedback on national parks has received limited attention. Sentiment analysis approaches fall into two categories: machine learning-based and lexicon-based. Tourism-related studies have compared different machine learning algorithms and tested various lexicons, but few have compared the performance of the two types of approaches against each other. This study investigates the classification performance of both approaches in terms of four commonly used indices: accuracy, precision, recall, and F1. Two machine learning methods (Naïve Bayes and Support Vector Machine classifiers) and three popular sentiment lexicons (Valence Aware Dictionary for Sentiment Reasoning (VADER), SentiWordNet, and the NRC Emotion Lexicon) are evaluated on a corpus of TripAdvisor reviews of Yellowstone National Park. The study thus provides an assessment of the two types of sentiment analysis techniques for TripAdvisor sentiment classification. The results indicate that the machine learning methods classified sentiment better than the lexicon-based methods.
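The four performance indices named in the abstract can be illustrated with a minimal sketch. It assumes a binary positive/negative labeling, which the abstract does not specify; the function name `classification_metrics`, the labels "pos" and "neg", and the toy data are hypothetical and are not taken from the study.

```python
def classification_metrics(y_true, y_pred, positive="pos"):
    """Return accuracy, precision, recall, and F1 for one positive class.

    Assumes binary labels; every label other than `positive` counts as negative.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts for the positive class.
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (not data from the study).
gold = ["pos", "pos", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "pos"]
metrics = classification_metrics(gold, predicted)
```

Under these assumptions, the same function could score the output of either a machine learning classifier or a lexicon-based one against the same reference labels, which is what makes the two types of approaches directly comparable.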