Jurnal SISFOKOM (Sistem Informasi dan Komputer), Volume 14, Nomor 03, PP 422-428

Comparison of Classification Algorithms with Bag of Words Feature in Sentiment Analysis

Fenilinas Adi Artanto
Faculty of Engineering and Computer Science, Informatic
University of Muhammadiyah Pekajangan Pekalongan
Pekalongan, Indonesia
fenilinasadi@gmail.

Abstract: The rapid growth of digital culture, especially on social media platforms, has led to the emergence of unique viral phenomena characterized by unconventional humor and deliberately broken logic, such as the Italian brainrot anomaly. Although sentiment analysis has been studied extensively, few studies focus on cultural sentiment such as the humor of the Italian brainrot anomaly. This study analyzes user sentiment toward the game "Hantu Tung Tung Tung Sahur 3D," an Italian brainrot anomaly application that went viral among young people on the Google Play Store during the month of Ramadan. User reviews were collected through web scraping, and data preprocessing involved tokenization, stopword removal, lowercasing, stemming, and filtering to prepare the text for analysis. Feature extraction was performed using the Bag of Words method. The study compares the performance of four widely used classification algorithms (Support Vector Machine (SVM), Naïve Bayes, Decision Tree (C4.5), and Random Forest) implemented in Orange Data Mining software, with evaluation based on K-Fold Cross Validation. The novelty of this study lies in its focus on sentiment analysis in a unique and culturally viral digital context, as well as a comparative evaluation of classification algorithms on this specific dataset. The results show that the Random Forest algorithm achieves the highest Area Under the Curve (AUC) score of 0.529, outperforming Naïve Bayes, SVM, and Decision Tree.
These findings provide new insights into the suitability of ensemble methods such as Random Forest for sentiment analysis of specific digital phenomena, highlighting their potential for more reliable sentiment classification in similar contexts.

Keywords: Sentiment Analysis, Anomaly Trend, Bag of Words, Google Play Store

I. INTRODUCTION

The development of technology and the internet, especially social media, has created a unique and dynamic digital culture. One emerging phenomenon is the anomaly trend, a form of absurd humor and illogical narrative that is popular among children and teenagers. One viral anomaly trend is "Tung Tung Tung Sahur", which has its roots in the Italian Brainrot internet subculture. The term refers to content with broken logic, chaotic visuals, and dramatic sound effects that is deliberately made illogical but entertaining. The trend spread quickly through TikTok, YouTube Shorts, and Instagram Reels, attracting the attention of Generation Z in particular, who favor fast and absurd entertainment.

The phenomenon was then adapted into a mobile game, "Hantu Tung Tung Tung Sahur 3D", developed by LemauDev. The game offers unusual gameplay in which players must avoid a ghost anomaly that chases them because they did not wake up for sahur. With visuals that are scary but funny and a story inspired by Brainrot logic, the game went viral in Google Play Store downloads, especially during the month of Ramadan.

The popularity of this game not only reflects the developer's success in responding to viral trends, but also generates thousands of user reviews. These reviews reflect users' perceptions, satisfaction, criticism, and participation in the viral culture ecosystem. Analyzing user reviews is therefore important for understanding how the public responds emotionally to this application: whether users feel entertained, disappointed, or scared, or even express negative sentiment toward the viral trend itself.
One method for understanding reviews from application users is sentiment analysis, which classifies text data by its emotional polarity (positive, neutral, or negative). Text classification in sentiment analysis belongs to data mining, the practice of extracting information from a large amount of collected data. Within data mining, classification maps data into predetermined groups or classes. Common classification techniques include decision trees, support vector machines, neural networks, Naïve Bayes, random forests, and rule-based classifiers.

Review data in text form centers on opinions or views expressed as sentences, which makes it well suited to sentiment analysis. In Artanto's research, sentiment analysis of Twitter posts about KPPS members was carried out using Support Vector Machine (SVM). The SVM algorithm obtained 70% accuracy, and 99% of the public gave positive opinions when discussing KPPS members on Twitter. In the research of Sari & Wibowo, the Naïve Bayes algorithm was applied to reviews from JD. online store customers. Naïve Bayes reached an accuracy of 96.44%, with 300 reviews giving positive sentiment, 294 giving negative sentiment, and 288 giving neutral sentiment.

p-ISSN 2301-7988, e-ISSN 2581-0588 DOI: 10.32736/sisfokom. Copyright ©2025 Submitted: June 14, 2025. Revised: June 25, 2025. Accepted: July 1, 2025. Published: July 28, 2025
Meanwhile, in the research of Fatkhudin et al., the Decision Tree algorithm was used to analyze Twitter sentiment about the use of Artificial Intelligence for theses. The Decision Tree reached an accuracy of only 66%, with 84.4% negative sentiment and 6% positive sentiment. In another study by Artanto, the Random Forest algorithm was used to classify Twitter reviews about E-Stamps. Random Forest reached an accuracy of 70.1%, with 86% neutral sentiment, 9% negative sentiment, and 5% positive sentiment.

To improve text feature extraction in sentiment analysis, the Bag of Words (BoW) approach can be used. The BoW model is widely used with good results in language modeling and document classification because it is simple and flexible for specific text data.

In this context, this study aims to:
• Analyze user sentiment towards the "Hantu Tung Tung Tung Sahur 3D" application, which represents anomalous and viral digital trends.
• Compare the performance of four popular classification algorithms (Support Vector Machine, Naïve Bayes, Decision Tree C4.5, and Random Forest) in processing review data that contains typically absurd and emotional narratives.
• Examine how the review patterns of this application differ from those of other applications, especially in language style, emotional intensity, and rapid viral dynamics.
This case study was selected because the application is a concrete representation of a viral digital culture phenomenon that has not been widely studied scientifically through text mining and sentiment analysis. By using the Bag of Words method for feature extraction and K-Fold Cross Validation for evaluation, this study not only makes a technical contribution by comparing classification algorithms, but also broadens the understanding of how absurd digital culture can be analyzed scientifically using artificial intelligence and data mining.

II. RESEARCH METHODOLOGY

This study uses a quantitative approach that compares the performance of classification algorithms in the sentiment analysis of application reviews with Bag of Words feature extraction. The research process is divided into several stages, as shown in Figure 1:

Figure 1. Research Flow

Data Collection

The data used in this study were obtained from user reviews of the "Hantu Tung Tung Tung Sahur 3D" application available on the Google Play Store. Data retrieval was carried out by web scraping with the Play Store scraper tool, a Python library, on Google Colab, with the syntax:

    !conda install -y gdown
    !pip install google-play-scraper
    !pip install PySastrawi

    import nltk
    from google_play_scraper import Sort, reviews

    tungtung, continuation_token = reviews(
        'net. hantutungtungtungtung3D',  # package ID as printed in the source
        lang='id',         # review language: Indonesian
        country='id',      # store country: Indonesia
        sort=Sort.NEWEST,  # newest reviews first
        count=1.           # value truncated in the source
    )

The initial data collected amounted to 1,905 reviews, taken with the parameters lang='id' and country='id' to ensure that all reviews are in Indonesian. The data obtained are later loaded into Orange Data Mining to test the algorithms for sentiment analysis.
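As an illustration of how the scraped reviews might be handed over to Orange Data Mining, the following sketch writes review records to CSV, a format Orange can import. It is a minimal example under stated assumptions, not the export step used in the study: the reviews_to_csv helper and the sample records are hypothetical, and the field names reviewId, content, and score are assumed to match the dictionaries returned by google-play-scraper.

```python
import csv
import io

# Hypothetical records shaped like the dictionaries assumed to be returned
# by the scraper; the texts and scores are made up for illustration.
sample_reviews = [
    {"reviewId": "r1", "content": "game bagus, seru banget", "score": 5},
    {"reviewId": "r2", "content": "serem tapi keren", "score": 4},
    {"reviewId": "r3", "content": "", "score": 1},
]


def reviews_to_csv(records, handle, fields=("reviewId", "content", "score")):
    """Write the selected fields of each non-empty review to a CSV handle."""
    writer = csv.DictWriter(handle, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    written = 0
    for record in records:
        # Skip reviews without text; they carry no sentiment to classify.
        if not str(record.get("content") or "").strip():
            continue
        writer.writerow({name: record.get(name, "") for name in fields})
        written += 1
    return written


# Write to an in-memory buffer; a real run would pass an open file instead.
buffer = io.StringIO()
count = reviews_to_csv(sample_reviews, buffer)
csv_text = buffer.getvalue()
print(count)
print(csv_text)
```

Passing a file opened with open("reviews.csv", "w", newline="", encoding="utf-8") in place of the buffer would produce a file on disk; the file name is likewise only an example.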
Data Preprocessing

Before classification is carried out, the review data go through a preprocessing stage to make them suitable for processing. This stage includes:
• Tokenization: review data in the form of sentences are broken down into words.
• Stopword Removal: stop words are removed. These are common words considered to carry little useful information in text analysis because they occur so frequently.
• Lowercasing: all letters are changed to lowercase.
• Stemming/Lemmatization: words are converted into their basic or root form. This reduces the variation among words that share the same root, which makes the analysis easier.
• Filtering: symbols, numbers, and punctuation are removed.
• Normalization: non-standard words, slang, and typical Brainrot terms (e.g., "tung tung", "sahur 3D", "brainrot", "ngakak", "random banget") are normalized so that they remain represented in the analysis. This list of terms is arranged based on frequency of occurrence and cultural relevance in the dataset.

Data Labeling

The data labeling process is done automatically using a multilingual sentiment analysis model that supports Bahasa Indonesia, so it can capture the nuances of sentiment in the context of local culture and language. This model classifies each review into three sentiment classes (positive, negative, and neutral) based on the content and tone of the sentence.
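The preprocessing steps above, together with the Bag of Words counting that the next stage applies to their output, can be sketched in plain Python as follows. This is an illustrative sketch, not the Orange Data Mining pipeline used in the study: the STOPWORDS set, the SLANG mapping, and the two sample reviews are hypothetical, and stemming is left out because it needs an external stemmer such as the Sastrawi package installed during data collection.

```python
import re
from collections import Counter

# Hypothetical resources for illustration only; the study's actual stopword
# list and normalization dictionary are not given in the text.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}
SLANG = {"bgt": "banget", "gk": "tidak", "gak": "tidak"}


def preprocess(review):
    """Lowercase, filter, tokenize, normalize slang, and drop stopwords."""
    text = review.lower()                              # lowercasing
    text = re.sub(r"[^a-z\s]", " ", text)              # filtering: symbols, digits, punctuation
    tokens = text.split()                              # tokenization on whitespace
    tokens = [SLANG.get(tok, tok) for tok in tokens]   # normalization of slang
    return [tok for tok in tokens if tok not in STOPWORDS]  # stopword removal


def bag_of_words(token_lists):
    """Build a sorted vocabulary and one word-count vector per document."""
    vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Made-up sample reviews in the style of the dataset.
reviews = ["Game ini seru bgt!!!", "Seru dan serem, game bagus 10/10"]
tokens = [preprocess(r) for r in reviews]
vocabulary, vectors = bag_of_words(tokens)

print(vocabulary)  # ['bagus', 'banget', 'game', 'serem', 'seru']
print(vectors)     # [[0, 1, 1, 0, 1], [1, 0, 1, 1, 1]]
```

Each vector position corresponds to one vocabulary word, so word order is discarded, which is the limitation of BoW that the paper itself points out.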
Feature Extraction

After the data are cleaned, the next step is to convert the text into numeric form using Bag of Words (BoW) feature extraction, which counts how often each word occurs and arranges the counts into feature vectors that machine learning algorithms can use. BoW is popular in NLP tasks such as text classification and sentiment analysis because it is a simple and efficient way to turn unstructured text into structured numeric data. However, BoW does not consider word order or deeper word meaning, which sometimes makes it less effective at capturing complex contexts.

Model Classification

This study compares four classification algorithms for sentiment analysis:
• Support Vector Machine (SVM): effective for high-dimensional data such as text and capable of handling non-linear decision boundaries; it generally performs well in text classification tasks such as sentiment analysis.
• Naïve Bayes: fast and efficient in training and prediction, suitable for text classification, and workable with small datasets.
• Decision Tree: easy to implement, able to handle categorical and numeric data, and not demanding in terms of preprocessing.
• Random Forest: an ensemble of many decision trees, which makes it more accurate and stable and reduces overfitting when handling large and complex data.

The four algorithms are implemented using the Orange Data Mining application version 3. 1, where each algorithm is configured and connected to the Test and Score component for evaluation with the K-Fold Cross Validation approach to ensure the reliability of the results.

Evaluation Metrics

To assess the performance of each algorithm, several standard classification metrics are used:
• Accuracy: the proportion of correct predictions out of all predictions.
• Precision: the proportion of predicted positives that are actually positive.
• Recall: the proportion of actual positives that the model captures.
• F1-Score: the harmonic mean of Precision and Recall.
• Confusion Matrix: a tabulation of the classification results showing the correct and incorrect predictions for each class.

Analysis and Conclusion

The evaluation results of the four algorithms are compared to see which algorithm is the most effective in analyzing application review sentiment.

III. RESULT AND DISCUSSION

Before classification, the review data collected through the Play Store scraper, totaling 1,905 entries, first went through a manual labeling process. Two independent annotators assessed each review and classified it into three sentiment categories: positive, negative, and neutral. Consistency between annotators was measured using Cohen's Kappa, which produced a value of 0.87 (very good category). Where the annotators disagreed, a discussion was held to reach a consensus. After data cleaning (removing duplicates, spam, and empty reviews), the final amount of data analyzed was 1,500 reviews. The data were then analyzed using the following Orange Data Mining design:

Figure 2. Design Orange Data Mining

Data Preprocessing

Data preprocessing is carried out in the Preprocess Text tool, which is configured as follows:

Figure 3. Preprocess Text (Orange Data Mining)

After Preprocess Text, the topic modeling obtained is:

Figure 4. Topic Modelling (Orange Data Mining)

Topic modeling displays the topic keywords found in the application reviews, namely: "game, bagus, seru, banget, serem, kasih, keren".

Figure 5. Sentiment Analysis (Orange Data Mining)

From Figure 5, the following sentiment distribution graph is obtained:

Figure 6. Graph Distributions Sentiment
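The evaluation metrics listed in the methodology and the Cohen's Kappa agreement reported for the two annotators can be illustrated with a small worked example. This is a sketch on made-up labels, not a recomputation of the study's results: the function names and all label lists below are hypothetical, and Orange Data Mining computes its own scores internally.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Proportion of predictions that match the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def per_class_scores(actual, predicted, label):
    """Precision, recall, and F1 with `label` treated as the positive class."""
    pairs = confusion_counts(actual, predicted)
    tp = pairs[(label, label)]
    fp = sum(n for (a, p), n in pairs.items() if p == label and a != label)
    fn = sum(n for (a, p), n in pairs.items() if a == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[l] * freq_b[l] for l in labels) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up classifier output for six reviews.
actual = ["pos", "pos", "pos", "pos", "neg", "neg"]
predicted = ["pos", "pos", "pos", "neg", "pos", "neg"]
acc = accuracy(actual, predicted)
pos_scores = per_class_scores(actual, predicted, "pos")
neg_scores = per_class_scores(actual, predicted, "neg")

# Made-up labels from two annotators for six reviews.
annotator_a = ["pos", "pos", "neg", "neu", "pos", "neg"]
annotator_b = ["pos", "pos", "neg", "neu", "neg", "neg"]
kappa = cohen_kappa(annotator_a, annotator_b)

print(acc, pos_scores, neg_scores, kappa)
```

In this toy example the majority class scores 0.75 on precision, recall, and F1 while the minority class scores only 0.5, which mirrors how class imbalance can pull the per-class scores apart even when overall accuracy looks acceptable.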
In the sentiment analysis tool, the Multilingual sentiment method is used with the language set to Indonesian, as shown in Figure 5. Figure 6 shows that the majority of application reviews have positive sentiment, at 85.61%, while reviews with negative sentiment account for about 14%. This dominance of positive sentiment indicates that the majority of users appreciate the unique concept and absurd humor presented by the application, even though it contains elements of fear or strangeness.

In addition, the output of Preprocess Text is used to produce the Word Cloud in Figure 7:

Figure 7. Word Cloud

The word cloud in Figure 7 shows that the most common words in the application reviews are "game, bagus, seru, banget, anomali, kasih, kaget". Together, the word cloud and topic modeling show dominant words such as "game", "good", "fun", "banget", "scary", "anomaly", and "shocked". These words represent the main characteristics of the application, namely a combination of entertainment, strangeness (anomaly), and surprise (shocked/scary). This pattern is consistent with the Brainrot phenomenon, in which absurd narratives and exaggerated expressions characterize the digital culture adopted by this application. The presence of the words "anomaly" and "scary" in the reviews also indicates that users accept the absurd and chaotic elements that are the main attractions of the application.

Feature Extraction

At the Feature Extraction stage, the Bag of Words tool is used with the options shown in Figure 8:

Figure 8. Bag of Words (Orange Data Mining)

Unlike the sentiment analysis performed after Preprocess Text, the classification test uses Tweet Profiler with the Ekman emotions option, because it examines the emotional expression of users when they review the "Hantu Tung Tung Tung Sahur 3D" application. Emotion analysis using Tweet Profiler (although originally developed for Twitter data) remains relevant because the model detects emotional expressions in short texts, which are also common in app reviews. The Tweet Profiler results for the emotions of the application users are shown in Figure 9.

Figure 9. Graph Figure Emotion

The detected emotions are anger 1%, disgust 0.1%, fear 13.65%, joy 29%, sadness 3.04%, and surprise 33. This shows that the majority of users of the "Hantu Tung Tung Tung Sahur 3D" application expressed joy when reviewing it. This strengthens the argument that the app succeeds in delivering an experience that is not only entertaining, but also surprising and a little suspenseful, a hallmark of Brainrot's narrative.

Model Classification

At the Model Classification stage, four algorithms were compared (Support Vector Machine (SVM), Naïve Bayes, Decision Tree, and Random Forest) using the Test and Score tool in Orange Data Mining. The results are shown in Table I.

Table I. Test and Score Result
Algorithms: SVM, Random Forest, Decision Tree, Naïve Bayes
Column headers recovered: AUC, Precision, Recall
Values in extraction order: 0.503, 0.266, 0.354, 0.529, 0.355, 0.476, 0.306, 0.491, 0.498, 0.504, 0.356, 0.010, 0.357, 0.018, 0.361, 0.229

Confusion Matrix, Support Vector Machine (SVM): (figure)

Confusion Matrix, Random Forest: (figure)

Previous studies by Artanto and by Sari & Wibowo have shown that algorithms such as SVM and Naïve Bayes can achieve high accuracy on conventional review data.
However, in the context of applications with absurd narratives and digital culture such as "Hantu Tung Tung Tung Sahur 3D", the performance of both algorithms decreased drastically. This confirms that the characteristics of language and cultural expression strongly affect the effectiveness of the model. This finding is a new contribution and also emphasizes the importance of developing models that are more sensitive to the context of digital culture.

Confusion Matrix, Decision Tree: (figure)

Confusion Matrix, Naïve Bayes: (figure)

The results show that Support Vector Machine (SVM) is not very effective on this data, possibly because of the complexity of the review data or class imbalance. Random Forest shows the most stable and best performance among the four algorithms, although its accuracy is still moderate. Although the AUC of Decision Tree is low, its F1-score is almost the same as that of Random Forest, which suggests that Decision Tree can be an alternative if its parameters are tuned. Naïve Bayes performs poorly, possibly because its assumption of independence between features is not met or because the sentiment labels are imbalanced. In summary, Random Forest is the best algorithm in this experiment based on F1, Precision, and Recall. SVM and Decision Tree have moderate performance that can still be improved. Naïve Bayes performs very poorly and is not suitable for this review data without further adaptation.

In the confusion matrix, Random Forest shows the highest number of correct predictions for the positive class, while SVM and Naïve Bayes tend to make many prediction errors in the minority class. This indicates that ensemble models such as Random Forest adapt better to the variations in language style and cultural expression that are typical of this application.

This study shows that sentiment analysis of applications with absurd cultural contexts requires special attention to preprocessing and algorithm selection. Ensemble models such as Random Forest are better at handling data with variations in language and cultural expression. Visualizations such as word clouds and emotion analysis can enrich the understanding of user interaction patterns in viral digital culture.

IV. CONCLUSION

Based on a sentiment analysis of 1,905 user reviews for the "Hantu Tung Tung Tung Sahur 3D" app, the study found that the Bag of Words (BoW) feature extraction method is effective in converting unstructured text into numerical features suitable for machine learning algorithms. Among the four classification algorithms tested (Support Vector Machine (SVM), Naïve Bayes, Decision Tree, and Random Forest), Random Forest achieved the highest F1 score and recall, as well as the highest AUC value. However, these performance metrics remain relatively low and are only slightly better than random classification, which indicates that the classification task is challenging given the nature of the data.

The modest performance across all models can be attributed to several factors: class imbalance in the sentiment labels, the complexity of the review texts, label mismatches, and the limits of the BoW representation, which does not capture semantic context or word order in Italian brainrot reviews. In particular, the Naïve Bayes algorithm performed the worst on F1 score, likely because its strong independence assumption does not hold in this [...]. Therefore, although Random Forest performed relatively better on this particular dataset, it is premature to generalize its superiority to all viral app review datasets, especially since this study focused on only one culturally unique Italian brainrot anomaly app.

Further research should explore more sophisticated feature extraction techniques such as word embeddings with Word2Vec,
GloVe, or TF-IDF weighting to better capture semantic nuances and reduce feature [...]. Additionally, deep learning models such as Long Short-Term Memory (LSTM) networks can improve the ability to model sequential dependencies and contextual information in short, unstructured texts such as user reviews.

In conclusion, this study highlights the potential and challenges of applying machine learning to sentiment analysis in the context of the viral digital cultural phenomenon Italian brainrot. It underlines the need for more sophisticated modeling approaches and careful consideration of data characteristics to improve classification accuracy.

REFERENCES