JURTEKSI (Jurnal Teknologi dan Sistem Informasi), Vol. X No 1, December 2023, pp. 59-66
DOI: https://doi. org/10. 33330/jurteksi.
Available online at http://jurnal. id/index. php/jurteksi
ISSN 2407-1811 (Print), ISSN 2550-0201 (Online)

SENTIMENT ANALYSIS OF PUBLIC OPINIONS TOWARDS TELKOM UNIVERSITY POST PANDEMIC

Anindya Prameswari Putri Djakaria1*, Oktariani Nurul Pratiwi1, Hanif Fakhrurroja1
1Information System, Telkom University
email: *anindyappdj@student.

Abstract: Twitter, as a social media platform, has grown rapidly as a means for people to express their opinions and thoughts on various topics, including education. The number of Twitter users surged to 10,645,000 in 2020, with a significant increase during the pandemic. Telkom University, as a private institution of higher education in Indonesia, has become one of the topics of discussion on Twitter. Users' opinions about Telkom University vary, ranging from positive to negative. To gain deeper insight into public views, sentiment analysis is needed. The analysis follows the Knowledge Discovery in Databases (KDD) process and uses the Naive Bayes classification algorithm. The evaluation results indicate that the best accuracy is achieved with an 80:20 data split, giving an accuracy of 82.05%, precision of 82.3%, recall of 82.05%, and an F1-score of about 82%. The Naive Bayes model demonstrates good performance for sentiment analysis of public views regarding Telkom University on Twitter.

Keywords: naive bayes; sentiment analysis; telkom university.

Abstrak (English translation): Twitter is growing rapidly as a medium through which people express their opinions and thoughts on any topic, including education. The number of Twitter users rose sharply to 10,645,000 in 2020 and kept rising during the pandemic. Telkom University, as a higher education institution, has become one of the education-related topics under discussion. The opinions about Telkom University expressed by Twitter users vary, both positive and negative. Sentiment analysis is needed to understand public views in more depth. This analysis uses the stages of Knowledge Discovery in Databases and the Naive Bayes classification algorithm. The evaluation results show that the best accuracy is achieved with an 80:20 data ratio, with an accuracy of 82.05%, precision of 82.3%, recall of 82.05%, and an F1-score of about 82%. The Naive Bayes classification model performs well for sentiment analysis of public views on Twitter regarding Telkom University.

Kata kunci (keywords): sentiment analysis; naive bayes; telkom university.

INTRODUCTION

Twitter has become the number one platform for people to express their feelings, opinions, and views, and to report real-time events through Live Tweets. Twitter is a social media platform for computer-based online communication. Various organizations and businesses are interested in Twitter data to determine people's opinions about their products and events. Twitter is also used to understand people's opinions about political events, movies, and more. The number of Twitter users increased significantly to 10,645,000 in 2020, the year the pandemic emerged, and then continued to rise, reaching roughly 18 million users in 2022. According to Permatasari, Twitter users increased by 34% in the second quarter of 2020, with the platform becoming a means for people to express themselves about their activities during the pandemic in Indonesia, including those related to education.

Education in Indonesia is divided into three types: academic education (bachelor's, master's, and doctoral degrees), professional/specialist education, and vocational education (diploma/applied bachelor's degree). Academic education focuses on the mastery and development of knowledge and technology. Generally, academic education is provided by universities and institutes.

Telkom University is one of the private higher education institutions in Indonesia, founded in 1994. Throughout its existence, Telkom University has earned various achievements. In 2022, the university obtained 20 awards. These achievements have become a topic of discussion among the university's stakeholders and the general public. Social media, especially Twitter, is no exception, with many users expressing their opinions about Telkom University. Given the variety of opinions, both positive and negative, expressed by Twitter users about Telkom University, the authors find it necessary to conduct sentiment analysis of the public's opinions about the university.

Sentiment analysis is a process conducted to determine the opinions, emotions, and attitudes reflected in text, usually classified into positive and negative opinions. This process is carried out to gather and examine public views on specific products or topics. To enhance the precision of sentiment analysis, machine learning methods such as the Naive Bayes classification algorithm are used, which can accelerate the automated evaluation of data. To perform sentiment analysis, several stages are required based on the Knowledge Discovery in Databases (KDD) method. These stages begin with data selection and preprocessing, followed by transformation, data mining, and evaluation.

The classification method used for sentiment analysis is Naive Bayes. This method is used to categorize or assess opinions or tendencies towards a particular issue or object, determining whether it falls under the positive or negative category. Several studies on sentiment analysis have used Naive Bayes because it is considered one of the methods that are easy to understand and still yield good results. One study noted that this method is often used because it requires only a small amount of training data to estimate the parameters needed in the classification process.

These issues form the foundation for this research, which aims to determine public sentiment towards Telkom University on Twitter using the Naive Bayes method.

METHOD

This research uses the text mining method. This method is employed because the data obtained is in the form of text. By using text mining, information can be derived from the data being used. This research uses Naive Bayes classification for sentiment analysis because it is considered one of the more understandable classification algorithms that can produce reasonably accurate classifications.

$$P(R \mid S) = \frac{P(S \mid R)\,P(R)}{P(S)}$$

Description:
S : Data with unknown class
R : Hypothesis that data S belongs to a specific class
P(R|S) : The probability of R given condition S
P(R) : The probability of R
P(S|R) : The probability of S given condition R
P(S) : The probability of S

Image 1 Problem-Solving Systematics

Image 1 illustrates the stages of the problem-solving systematics: data collection, text preprocessing, data splitting, term weighting, Naive Bayes classification, K-Fold cross validation, and evaluation using a confusion matrix.

In the data collection stage, data is obtained by scraping tweets related to Telkom University on Twitter using predefined keywords. Once the data is collected, preprocessing is performed, involving data cleaning, case folding, stemming, and manual data labeling. The data is divided into two labels, positive and negative. After the data is mapped to the labels, oversampling is conducted. In the next step, the data is split into training and testing data, and term weighting is applied using the Term Frequency-Inverse Document Frequency (TF-IDF) technique. Following that, the divided data is classified using the Naive Bayes algorithm. The subsequent stage ensures the reliability and validity of the data division, for which K-Fold cross validation is employed. Finally, the model is evaluated using the confusion matrix to examine its performance.

RESULT AND DISCUSSION

Data scraping is performed on Twitter by searching for tweets relevant to the research needs. In this study, keywords such as 'telkom university', 'telkom univ', 'universitas telkom', 'kuliah telkom', 'kuliah telyu', 'telyu', 'tel u', and 'tel-u' were used, with a date range from January 1, 2022, to July 16, 2023. The TwitterSearchScraper library is used with the Python programming language. The data extracted with the specified keywords totals 3026 tweets. Subsequently, the data is manually filtered to remove irrelevant and neutral-sentiment tweets. After this filtering process, 588 relevant tweets remain and are ready for use.

After the data was obtained, manual labeling was done by matching the content of each tweet with the appropriate sentiment label. Out of the total 588 data, 390 were labeled as positive sentiment and 198 as negative sentiment.

Image 2 Data Sentiment Bar Chart

After preprocessing and data labeling, oversampling is performed to increase the number of samples from the minority class (in this case, the negative sentiment class) by creating new examples or repeating some existing examples to balance it with the majority class (the positive sentiment class). The oversampling is done using the random oversampler method. As a result, the total usable data becomes 780, with 390 data each for positive and negative sentiment.

After the stages of data collection, preprocessing, data splitting, and term weighting, the data divided into training and testing sets is used to build the model with the Naive Bayes algorithm. The training data is used to train the model, with the weighting results and predefined labels given to the Naive Bayes algorithm, while the testing data is used to measure how accurately the classifier predicts. The accuracy comparison across three ratios of training and testing data is given in Table 1.

Table 1 Accuracy Comparisons (columns: Ratio, Accuracy; rows: 60:40, 70:30, 80:20)

Based on Table 1, the Naive Bayes classification model with the 80:20 ratio has the highest performance, with an accuracy of roughly 80%. With this ratio, 624 data are used for training and 156 data for testing.

Next, K-Fold cross-validation is performed using the scikit-learn library with the modules 'GaussianNB' and 'cross_val_score'. Using 5 folds, the data is divided into 5 subsets, and the model is trained and evaluated 5 times.
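The 5-fold procedure described above can be sketched in a few lines. The study itself reports using scikit-learn's 'GaussianNB' and 'cross_val_score' on TF-IDF weights; the sketch below is not that implementation. To stay dependency-free it swaps in a hand-written multinomial Naive Bayes over raw word counts with Laplace smoothing, and it runs on a handful of hypothetical toy tweets rather than the study's data. The function names (train_nb, predict_nb, k_fold_accuracy) are illustrative assumptions.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect class counts (for the prior P(R)) and per-class word counts (for P(S|R))."""
    priors = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return priors, word_counts, vocab


def predict_nb(model, text):
    """Pick the class R with the largest log P(R) + sum of log P(word | R).

    P(S) is the same for every class, so it is left out of the comparison.
    """
    priors, word_counts, vocab = model
    total_docs = sum(priors.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in priors.items():
        score = math.log(n_docs / total_docs)
        # Laplace smoothing: add 1 to every word count in this class.
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_accuracy(docs, labels, k=5, seed=0):
    """Shuffle once, cut into k folds, train on k-1 folds, test on the held-out fold."""
    indices = list(range(len(docs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        model = train_nb([docs[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_nb(model, docs[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Hypothetical toy tweets for illustration only; not data from the study.
docs = [
    "beasiswa telkom bagus",
    "kampus swasta bagus",
    "ingin kuliah di telkom",
    "beasiswa kampus keren",
    "telkom kampus favorit",
    "jalan ke kampus macet",
    "cuaca kampus panas",
    "macet parah dekat telkom",
    "panas sekali di kampus",
    "lulusan hanya teknisi wifi",
]
labels = ["positive"] * 5 + ["negative"] * 5

model = train_nb(docs, labels)
scores = k_fold_accuracy(docs, labels, k=5)
print("fold accuracies:", scores)
print("mean accuracy:", sum(scores) / len(scores))
print(predict_nb(model, "beasiswa bagus"))
```

Each of the five folds is held out exactly once, so every document contributes to exactly one test score, and the mean of the five scores is the cross-validation estimate. With a real corpus, the shuffled split and the averaging would stay the same; only the classifier and the features would change.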
Table 2 K-Fold Cross Validation results with K=5 (per-fold and average accuracy for the 60:40, 70:30, and 80:20 ratios)

Based on Table 2, the results from the five folds show that the 80:20 ratio produces the highest average cross-validation score.

By using the confusion matrix, the classification model can be evaluated. The evaluation is performed using the testing data.

Image 3 Confusion Matrix for 80:20

Based on Image 3, the confusion matrix shows that out of the total 156 testing data, 63 are classified as true positives, 11 as false positives, 63 as true negatives, and 19 as false negatives. From the confusion matrix, the average values of accuracy, precision, recall, and F1-score can be calculated.

Table 3 Evaluation Results (accuracy, precision, recall, and F1-score for the 60:40, 70:30, and 80:20 ratios)

Based on Table 3, the performance evaluation with the 80:20 ratio shows the highest accuracy, indicating that the Naive Bayes classification model with this ratio is considered good.

To display the results of the modeling implementation, WordCloud is used to show the words used in the collected tweets. The more frequently a word is used, the larger it appears in the WordCloud.

Image 4 Positive Sentiments WordCloud

Based on Image 4, the WordCloud for data labeled as positive sentiment, the words come from tweets that provide information about scholarships offered by Telkom University, Twitter users' admission through university selection, and their desire to study at Telkom University as a good private university.

Image 5 Negative Sentiments WordCloud

On the other hand, Image 5, the WordCloud for data labeled as negative sentiment, includes words found in tweets about congestion on the way to Telkom University and the hot weather in the university area.

Based on the 588 relevant data representing public opinions on Telkom University from Twitter, the public tends to have a fairly positive opinion about the university. Positive opinions are frequently found in tweets discussing scholarships and Telkom University's standing as a reputable private higher education institution. However, negative opinions about the university are also present, as seen in tweets assuming that the employment prospects for Telkom University graduates are limited to being technicians fixing WiFi networks. Therefore, to counter assumptions about job prospects, Telkom University could publish information about the diverse employment opportunities of its graduates. Additionally, Telkom University could periodically monitor Twitter to gauge public opinion for the purpose of institutional evaluation.

CONCLUSION

Based on the sentiment analysis of public opinions on Twitter about Telkom University using Naive Bayes classification, out of the total 588 data, 66.3% (390) showed positive sentiment, while 33.7% (198) showed negative sentiment. This result indicates that on Twitter, the public's opinion of Telkom University tends to be positive. The process involved data selection through data scraping, followed by data preprocessing. Subsequently, transformation was carried out, followed by Naive Bayes classification and the evaluation phase, which included K-Fold Cross Validation and the use of a confusion matrix to evaluate the model's performance. The evaluation results showed that the best accuracy was achieved with the 80:20 data ratio after oversampling, with a total of 780 data: 624 for training and 156 for testing. With that ratio, the evaluation showed an accuracy of 82.05%, precision of 82.3%, recall of 82.05%, and an F1-score of about 82%. Based on these findings, the model's average performance with this ratio exceeds 80%, suggesting that the Naive Bayes classification algorithm demonstrates strong performance.

BIBLIOGRAPHY

[...] "[...] Pendidikan Klasifikasi Pendidikan," Jurnal Dirosah Islamiyah, vol. 5, no. 3, pp. 754-762, 2023.
Ambarwati and D. Lieharyani, "Visualisasi Data Tweet di Sektor Pendidikan Tinggi Pada Saat Masa Pandemi," Building Informatics, Technology and Science (BITS), vol. 4, no. 1, pp. 116-123, 2022.
Cindo, Rini and Ermatita, "Literatur Review: Metode Klasifikasi Sentimen Analisis," Seminar Nasional Teknologi Komputer & Sains (SAINTEKS), pp. 66-70, 2019.
Singh, Singh and R. Singh, "Optimization analysis using machine learning [...]," Human-centric Computing Information Sciences, 2017.
Harahap and A. Nastuti, "Teknik Data Mining untuk Penentuan Paket Hemat Sembako dan Kebutuhan Harian dengan Menggunakan Algoritma FP-Growth (Studi Kasus di Ulfamart Lubuk Alung)," Informatika: Jurnal Ilmiah Fakultas Sains dan Teknologi, Universitas Labuhanbatu, vol. 7, no. 3, pp. 111-119, 2019.
Sudiantoro and E. Zuliarso, "Analisis Sentimen Twitter Menggunakan Text Mining dengan Algoritma Naive Bayes Classifier," Dinamika Informatika, vol. 10, no. 2, pp. 69-73, 2018.
Saleh, "Implementasi Metode Klasifikasi Naive Bayes dalam Memprediksi Besarnya Penggunaan Listrik Rumah Tangga," Citec Journal, vol. 2, no. [...]