Social media is an attractive campaign tool for politicians because it costs less than conventional campaigning. The 2024 Indonesian Presidential Election has drawn public attention, especially among social media users, and Twitter, one of the most widely used social media platforms in Indonesia, functions as an effective campaign forum. The problem that arises is how to automatically collect social media data related to discussions of the presidential candidates and draw conclusions from it, a task that is impractical to do manually. Sentiment analysis is one approach that can be used to analyze the available data and draw such conclusions. This study aims to obtain sentiment results from the most recent data, identify the best model using the Naive Bayes method, and analyze whether presidential election results can be predicted from sentiment. Data was collected shortly after the registration of presidential and vice-presidential candidates in November 2023; at that time, candidate numbers had not yet been assigned by the election organizers. A total of 11,569 records were obtained and labeled using the Valence Aware Dictionary for Sentiment Reasoning (VADER) library. After duplicate tweets were removed, the data was reduced to 4,893 records, with 1,631 records for each candidate pair. The sentiment classification model was built using the Naive Bayes method with Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction. The highest percentage of positive sentiment was found in the Ganjar Pranowo - Mahfud MD data at 69.16%, and the highest percentage of negative sentiment was found in the Prabowo Subianto - Gibran Rakabuming Raka data at 52.12%. Common words in positive sentiment for Ganjar Pranowo - Mahfud MD include "strong," "corruption," "support," "reward," and others.
Meanwhile, frequently appearing words in negative sentiment for Prabowo Subianto - Gibran Rakabuming Raka include "child," "eldest," "mk," "young," and others. Using the Naive Bayes method on the entire dataset, this research achieved an average accuracy of 76.67%, which suggests the approach is reasonably reliable for similar cases.
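To make the classification step concrete, the following is a minimal sketch of TF-IDF feature extraction followed by multinomial Naive Bayes, written from scratch with only the Python standard library. It is not the study's implementation: the function names, the smoothed IDF formula, the Laplace smoothing value, the whitespace tokenizer, and the toy sentences are all assumptions made for illustration. The helper `label_from_compound` uses the commonly cited VADER cut-offs of 0.05 and -0.05 on the compound score; whether the study used these thresholds is not stated.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplistic whitespace tokenizer (assumption; real tweets need cleaning).
    return text.lower().split()


def label_from_compound(compound, threshold=0.05):
    # Map a VADER-style compound score to a label.
    # The 0.05 cut-off is the conventional default, assumed here.
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(doc, idf):
    # Raw term count times IDF; terms unseen during fitting are ignored.
    counts = Counter(tokenize(doc))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def train_nb(docs, labels, alpha=1.0):
    # Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing.
    idf = fit_idf(docs)
    weights = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        weights[label].update(tfidf(doc, idf))
    vocab = list(idf)
    model = {"idf": idf, "log_prior": {}, "log_like": {}}
    for label, n_docs in Counter(labels).items():
        total = sum(weights[label].values()) + alpha * len(vocab)
        model["log_prior"][label] = math.log(n_docs / len(docs))
        model["log_like"][label] = {
            t: math.log((weights[label][t] + alpha) / total) for t in vocab
        }
    return model


def predict(model, doc):
    # Pick the class with the highest log posterior score.
    feats = tfidf(doc, model["idf"])
    scores = {
        label: prior + sum(w * model["log_like"][label][t] for t, w in feats.items())
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


# Made-up toy sentences, not taken from the study's dataset.
toy_docs = [
    "strong support for the candidate",
    "great support and strong reward",
    "bad young candidate",
    "bad decision too young",
]
toy_labels = ["positive", "positive", "negative", "negative"]

model = train_nb(toy_docs, toy_labels)
print(predict(model, "strong support"))
print(predict(model, "too young bad"))
```

In practice, a library pipeline such as scikit-learn's `TfidfVectorizer` combined with `MultinomialNB` would typically replace the hand-written functions, and accuracy would be estimated on held-out data rather than on the training sentences.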