Computer Science (CO-SCIENCE) Volume 6 Issue 1 January 2026 Accreditation Sinta 4 No.
SK : 230/E/KPT/2022
Analyzing Public Sentiment Toward Makanan Bergizi Gratis Program Using Machine Learning
Musriatun Napiah1*, Sujiliani Heristian2, Mugi Raharjo3, Rachmat Adi Purnama4
Informatics Department, Faculty of Engineering and Informatics, Universitas Bina Sarana Informatika, Jl. Kramat Raya No. RT. 2/RW. Senen, Kota Jakarta Pusat, Daerah Khusus Ibukota Jakarta, Indonesia
Informatics Department, Faculty of Engineering and Informatics, Universitas Bina Sarana Informatika, Jl. Banten No. Karangpawitan, Kec. Karawang Barat, Kabupaten Karawang, Jawa Barat
Informatics Department, Faculty of Information Technology, Universitas Nusa Mandiri, Jl. Jatiwaringin Raya No. 02 RT 08 RW 13 Kelurahan Cipinang Melayu, Kecamatan Makasar Jakarta Timur
e-mail: 1musriatun. mph@bsi. id, 2sujiliani. she@bsi. id, 4rachmat. rap@bsi. mou@nusamandiri.
(*) Corresponding Author
Article Info: Received: 17-10-2025 | Revised: 10-11-2025 | Accepted: 21-11-2025
Abstract - The Makanan Bergizi Gratis (MBG) program is a strategic initiative of the Indonesian government to improve the nutritional quality of schoolchildren.
This research seeks to examine public sentiment regarding the MBG program by leveraging 10,000 tweets obtained from Kaggle.
The method combines Natural Language Processing (NLP) and Machine Learning approaches.
Several algorithms, namely Logistic Regression, Support Vector Machine (SVM), Random Forest, Naive Bayes, XGBoost, and LightGBM, were tested to compare classification performance.
The dataset contains a collection of public reviews categorized into three sentiment classes: positive, negative, and neutral.
The analysis process includes text cleaning, tokenization, stopword removal, and stemming to obtain a cleaner text representation.
Text features were then extracted using the Term Frequency-Inverse Document Frequency (TF-IDF) method.
The results showed that the Logistic Regression model delivered the best performance, with an accuracy of 97% and an F1-score of 0.9552.
Sentiment analysis revealed 65% positive responses, 25% neutral, and 10% negative, with the dominant keywords being "nutrisi," "sehat," "anak sekolah," and "gratis."
The visualization of results, in the form of a Word Cloud and a bar chart, indicates that public opinion tends to be positive towards the implementation of the MBG program, particularly regarding improving the nutrition of schoolchildren.
This research is expected to provide input for policymakers in evaluating public perceptions of the implementation of food-based social programs.
Keywords: MBG, Machine Learning, Natural Language Processing
INTRODUCTION
The Makanan Bergizi Gratis (MBG) program is one of the Indonesian government's policies aimed at improving the quality of school nutrition.
Along with its implementation, public response to this program has spread widely on social media, especially X (Twitter).
This program is expected to help reduce malnutrition rates and support the achievement of the Sustainable Development Goals in the health and education sectors.
About 30% of children under the age of 5 in Indonesia have experienced stunting due to malnutrition (Hidayatullah, 2.
In 2024, the stunting rate in Indonesia was 14%.
Prompted by this concern, Prabowo, on becoming President of the Republic of Indonesia, sought to realize the idea of free meals provided to students throughout Indonesia.
Furthermore, the free meal idea was called Makan Bergizi Gratis (MBG) (Prasetyo, 2.
The MBG program launched by the government has a noble goal that is in line with the spirit of the fifth principle of Pancasila, namely "Social justice for all Indonesian people."
The government strives to meet the nutritional needs of all students equally, from remote 3T areas (underdeveloped, frontier, and outermost) to big cities, without regard to socioeconomic background.
Every student has the same right to enjoy nutritious food.
The government's steps reflect equitable social justice because all students are treated equally in pursuit of shared prosperity (Prasetyo, 2.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright 2026 The Author.
E-ISSN: 2774-9711 | P-ISSN: 2808-9065
Nutrition plays a crucial role in a child's development.
By meeting nutritional needs, such as carbohydrates as a source of energy, protein as a building block, and vitamins and minerals as regulators, a child's health will be maintained.
This will help prevent various diseases that can hinder their growth and development, which in turn can affect their intelligence (Andreas Halomoan Tambunan et al.
, 2.
However, public reception of this program has been mixed.
Some consider the MBG program very beneficial, while others criticize its effectiveness and implementation.
Therefore, sentiment analysis of these conversations can provide insight into the acceptance and emerging issues within the community.
Proper nutrition for school-age children is essential to ensure optimal growth and development.
This stage of life is a critical period marked by rapid physical and cognitive progress, making balanced nutrient intake highly important.
Sufficient nutrition helps maintain overall health, strengthens the immune system, and supports cognitive abilities
(Andreas Halomoan Tambunan et al.
, 2.
Therefore, this policy will serve as an indicator for differentiating how well cities and rural areas adapt to the Industrial Revolution 4.0
(Albaburrahim et al.
, 2.
Machine Learning (ML) holds transformative potential for addressing the energy challenges in the mining sector.
Supervised learning algorithms are commonly applied in predictive maintenance to enable early detection of equipment failures and minimize operational downtime (Sakiru Folarin Bello et al.
, 2.
Advances in Natural Language Processing (NLP) and machine learning technology enable automated sentiment analysis of text extracted from social media, news, and public comments.
Through this approach, public opinion can be categorized as positive, negative, or neutral.
This study aims to identify trends in public opinion on the Free Nutritious Meals (MBG) program and to measure the accuracy of various machine learning algorithms in classifying sentiment.
It offers a comprehensive comparison between machine learning algorithms and demonstrates the effectiveness of a transformer-based model (IndoBERT) in understanding sentiment context in the Indonesian language, thereby enriching local NLP studies and their application in AI-based policy evaluation.
The study thus contributes to both policy and science by presenting a data-driven analysis of public opinion on the MBG program, which can provide input for the government in evaluating public acceptance.
RESEARCH METHOD
In this study, the authors employed a quantitative approach based on machine learning.
The analysis was conducted experimentally by comparing several classification models on a dataset of public opinion on the MBG program.
Source: Research Result
Figure 1. Research Method
Data Collection
The data used in this research come from the publicly accessible kaggle.com dataset "Sentimen Publik Terhadap Makan Bergizi Gratis" (Public Sentiment Toward Free Nutritious Meals), which contains 10,000 tweets with keywords related to MBG.
The data include columns such as full_text (tweet text), favorite_count (retweet_count), and engagement metrics (username and location).
The data are stored in CSV format with two main attributes: text (the content of the comment or review) and sentiment (the positive, negative, or neutral sentiment category label).
http://jurnal. id/index. php/co-science
Pre-Processing
The stages carried out in this section are text cleaning (URL, mention, hashtag), tokenization and normalization, stopword removal and stemming, and finally feature extraction using TF-IDF and CountVectorizer to represent the text in a numeric form that can be processed by the classification algorithms.
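The cleaning, tokenization, and stopword-removal stages can be illustrated with a minimal sketch. It is not the authors' code: the regular expressions, the tiny `STOPWORDS` set, and the function names `clean_text` and `preprocess` are assumptions made for illustration, and stemming is left out because the stemmer configuration is not described in the text.

```python
import re

# Hypothetical, deliberately tiny Indonesian stopword list (an assumption for
# illustration; a real pipeline would load a complete list).
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk", "dengan"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_text(text: str) -> str:
    """Lowercase the tweet and strip URLs, mentions, hashtags, and non-letters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    # Collapse the runs of whitespace left behind by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


def preprocess(text: str) -> list[str]:
    """Clean the text, tokenize on whitespace, and drop stopwords (no stemming)."""
    tokens = clean_text(text).split()
    return [tok for tok in tokens if tok not in STOPWORDS]
```

Calling `preprocess` on a raw tweet returns the list of remaining tokens, which can then be passed to a stemmer and a vectorizer.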
Algorithm Implementation
Machine Learning
Machine learning (ML) has become a leading area of study within computer science and related disciplines, fostering significant progress across various other fields (Ezugwu et al.
, 2.
Machine Learning is a subfield of Artificial Intelligence that enables machines to learn independently from data without having to be programmed repeatedly by humans
(Alfarizi M.
Riziq Sirfatullah et al.
, 2.
It is a system that can learn to make decisions on its own, allowing the computer to become increasingly capable by learning from the data it has.
Based on the learning technique, supervised learning uses a labeled dataset (training data), while unsupervised learning draws conclusions from the dataset without labels
(Retnoningsih & Pramudita, 2.
Machine learning works when data are available for analysis, typically large data sets (Big Data), in order to find certain patterns.
Machine learning has three types, namely Supervised Learning, Unsupervised Learning, and Reinforcement Learning (R.
Zer et al.
, 2.
In this study, the system was given a training dataset in the form of desired input and output information, so that the system would learn based on existing data.
(R.
Zer et al.
, 2.
The system will look for patterns from the dataset, where these patterns will be used as a reference for subsequent data collection.
Source : (Napiah et al.
, 2.
Figure 2. Machine Learning
Logistic Regression
Logistic Regression is commonly applied because it is easy to implement and requires minimal computational resources (Chase et al.
, 2.
It is a machine learning classification algorithm used to predict the probability of a categorical dependent variable.
In Logistic Regression, the dependent variable is a binary variable containing data coded as 1 (Yes) or 0 (No) (Ramadhy & Sibaroni, 2.
Logistic Regression is the appropriate regression analysis to perform when the dependent variable is binary (two possibilities)
(Junifer Pangaribuan et al.
, 2.
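The binary case described above can be sketched as follows. This is an illustrative example rather than the study's implementation: the weights, bias, and feature values are hypothetical, and the three-class setting used in this study would normally need a multinomial or one-vs-rest extension that is not shown here.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping any real score to a probability in (0, 1)."""
    # Two branches keep the computation numerically stable for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights: list[float], bias: float, features: list[float]) -> float:
    """Estimate P(y = 1 | x) for a binary logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict(weights: list[float], bias: float, features: list[float],
            threshold: float = 0.5) -> int:
    """Return 1 (Yes) if the probability reaches the threshold, else 0 (No)."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0
```

With the hypothetical weights `[2.0, -1.0]` and a zero bias, the feature vector `[1.0, 1.0]` yields a score of 1.0, which the sigmoid maps to a probability above 0.5 and therefore to class 1.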
Source : (Learning , 2.
Figure 3. Logistic Regression
Random Forest
The Random Forest algorithm can be used to classify large data sets.
Unlike a single decision tree, the trees in a random forest are not pruned.
The advantage of random forests is that they combine multiple trees into a single model to perform classification and class prediction (Prasojo & Haryatmi.
Source : (Serokell, 2.
Figure 4. Random Forest
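The way a forest combines its trees can be sketched with two small helpers: bootstrap sampling of the training rows and majority voting over the trees' predictions. The sketch makes several assumptions: the three stump functions in `TREES` are hand-written stand-ins for trained trees, the feature vector is assumed to hold two illustrative TF-IDF weights, and tree training itself is omitted.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence

# A "tree" here is any function mapping a feature vector to a class label.
Tree = Callable[[Sequence[float]], str]


def bootstrap_sample(rows: Sequence, seed: int) -> List:
    """Draw len(rows) items with replacement, as done before fitting each tree."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in rows]


def forest_predict(trees: Sequence[Tree], x: Sequence[float]) -> str:
    """Return the label chosen by most trees (ties go to the first label seen)."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical stumps standing in for trained trees.
# x[0] and x[1] are assumed to be TF-IDF weights of two illustrative words.
TREES: List[Tree] = [
    lambda x: "positive" if x[0] > 0.2 else "negative",
    lambda x: "negative" if x[1] > 0.3 else "positive",
    lambda x: "positive" if x[0] > x[1] else "negative",
]
```

For the input `[0.5, 0.4]` two of the three stumps vote "positive", so `forest_predict(TREES, [0.5, 0.4])` returns "positive" even though one tree disagrees.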
SVM
Support Vector Machine (SVM) is a method used to find the best hyperplane that can separate two classes in an input space.
In the classification process, SVM uses training data to build a model, which is then used to predict the class of new, previously unseen data, also known as testing data (Valentino et al.
, 2.
SVM is a powerful and effective classification algorithm for both linear and non-linear classification problems.
Sentiment analysis is a technique used to identify, understand, and classify sentiments or opinions in text (Munandar et al.
, 2.
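For the linear two-class case, the separating hyperplane can be sketched as a weight vector `w` and an offset `b`. The example below only evaluates an already-fitted hyperplane; the values of `w` and `b` are hypothetical, and the optimization that finds them, as well as kernels for the non-linear case, are not shown.

```python
import math
from typing import Sequence


def decision_function(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score of x relative to the hyperplane w . x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Assign class +1 on or above the hyperplane and class -1 below it."""
    return 1 if decision_function(w, b, x) >= 0 else -1


def hinge_loss(w: Sequence[float], b: float, x: Sequence[float], y: int) -> float:
    """Hinge loss for a label y in {-1, +1}; zero once the margin is at least 1."""
    return max(0.0, 1.0 - y * decision_function(w, b, x))


def margin_width(w: Sequence[float]) -> float:
    """Distance between the two margin boundaries, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))
```

The "best" hyperplane in the SVM sense is the one that maximizes `margin_width` while keeping the total hinge loss on the training data small.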
XGBOOST
XGBoost is a development of the ensemble-based gradient tree boosting algorithm, capable of efficiently handling large-scale machine learning problems.
This method was chosen because it has various additional features that can speed up the computational process while reducing the risk of overfitting.
XGBoost can be applied to various types of problems such as classification, regression, and ranking.
This algorithm works by combining several decision trees (CART) to form a more robust model.
The main advantage of XGBoost lies in its ability to adapt to various conditions, which comes from a series of improvements to its predecessor algorithm.
(Yulianti et al.
, 2.
LightGBM
LightGBM is a popular method that has consistently been shown to solve various classification problems (Febriantoro et al.
, 2.
Thus, this method is able to produce more effective models in various classification tasks and regression studies, and achieve excellent detection rates.
Naive Bayes
The Naive Bayes (NB) classifier is a widely used supervised learning algorithm in Machine Learning (ML).
Its performance, however, depends on the strong assumption of conditional independence, an assumption that is frequently not met in real-world applications (Gohari et al.
, 2.
Naive Bayes is a machine learning algorithm that works based on probability calculations (Susana & Suarna, 2.
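The probability calculation and the independence assumption can be made concrete with a small multinomial Naive Bayes sketch. The variant (multinomial with Laplace smoothing `alpha`), the function names, and the toy documents in the usage note are assumptions for illustration; the text does not state which Naive Bayes variant was used in the experiments.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log likelihoods from tokenized docs."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for label, n_label in class_counts.items():
        model["log_prior"][label] = math.log(n_label / len(docs))
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_lik"][label] = {
            word: math.log((word_counts[label][word] + alpha) / total)
            for word in vocab
        }
    return model


def predict_nb(model, tokens):
    """Pick the class with the highest log posterior; unseen words are ignored."""
    scores = {}
    for label, log_prior in model["log_prior"].items():
        # Summing per-word log likelihoods is the conditional independence assumption.
        scores[label] = log_prior + sum(
            model["log_lik"][label][tok] for tok in tokens if tok in model["vocab"]
        )
    return max(scores, key=scores.get)
```

Trained on toy documents such as `["gizi", "sehat"]` labeled positive and `["kurang", "merata"]` labeled negative, the model assigns a new document to whichever class makes its words most probable.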
Training Process
Training is the process by which a machine learning or deep learning model learns from labeled data to recognize patterns and relationships between input and output.
The training data constitute 80% of the dataset.
Testing Process
The testing process is the stage of evaluating model performance using data never seen during training, to measure how well the model can generalize.
The testing data constitute 20% of the dataset.
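An 80/20 split of this kind can be sketched in a few lines. The sketch assumes a stratified split, meaning each sentiment class keeps the same proportion in both parts; the text does not say whether the split was stratified, and the function name `stratified_split` and the default seed are illustrative choices.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.2, seed=42):
    """Return (train_indices, test_indices), keeping class proportions in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out the requested share of every class for testing.
        n_test = round(len(indices) * test_ratio)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)
```

Fixing the seed makes the split repeatable, which matters when results from several runs are compared.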
Model Evaluation
Model evaluation is the practice of testing a trained machine learning model on new data to measure its effectiveness and identify its strong and weak points (Rainio et al.
, 2.
In the evaluation phase, an assessment is conducted to determine whether the data mining results have met the stated objectives.
Next, each cluster formed is profiled to determine its characteristics.
Furthermore, to assess its suitability to the interest pathway, further analysis is conducted by linking it to the interest attributes.
This is expected to yield useful information or patterns that can serve as a basis for future data updates.
The three main metrics at this stage are accuracy, F1-score, and the confusion matrix.
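The three metrics can be computed directly from true and predicted labels, as in the sketch below. It assumes the macro-averaged F1-score (the unweighted mean of per-class F1); the text does not specify which averaging was used, and the function names are illustrative.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes, in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true, pred in zip(y_true, y_pred):
        matrix[index[true]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Share of all samples that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def macro_f1(matrix):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp
        fn = sum(matrix[i]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because macro-F1 gives every class equal weight, a model that ignores a small class is penalized even when its overall accuracy is high.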
RESULTS AND DISCUSSION
Testing was carried out in Google Colab on the Kaggle dataset of 10,000 tweets expressing Twitter users' sentiment toward the Free Nutritious Meals program.
Data Pre-processing
The data, obtained from online and social media sources, contained public opinion regarding the Free Nutritious Meals (MBG) program.
Text preprocessing included cleaning irrelevant elements (URLs, numbers), tokenization, stopword removal, and stemming using NLTK.
The stemming quality was verified through manual inspection and word distribution analysis to ensure semantic consistency.
TF-IDF comparison and model performance evaluation confirmed that sentiment-bearing features were retained, indicating that preprocessing effectively standardized the text without losing essential information.
Source: Research Result
Figure 5. Text Preprocessing
Feature Extraction
After the data are cleaned, the text is converted into a numerical representation using TF-IDF (Term Frequency-Inverse Document Frequency).
This technique assesses the importance of a word in a document by combining its frequency within that document with its rarity across the entire corpus.
Words that appear frequently in one document but rarely in the others are weighted higher.
The results of this extraction produce a feature matrix that is used as input to the classification model.
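The weighting can be illustrated with a minimal sketch that uses the textbook form, tf(t, d) = count / document length and idf(t) = log(N / df(t)). This is an assumption: library implementations often apply smoothing and normalization, so their values differ from this sketch, and the function name `tf_idf` is illustrative.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

In a toy corpus where "program" occurs in every document, its weight is zero everywhere, while a word concentrated in a single document receives a high weight there.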
Model Training Results
In model testing, the dataset was divided into two parts, a training set of 80% and a testing set of 20%, and cross-validation was used during model training to tune hyperparameters and evaluate generalization.
To ensure robustness and reproducibility, the experiments were repeated with multiple random seeds (including 42 and 100), and the results showed stable accuracy and F1-score values across runs.
Several machine learning algorithms were applied to compare sentiment classification performance, including Logistic Regression, Support Vector Machine (SVM), Random Forest, Naive Bayes, XGBoost, and LightGBM.
Table 1. Machine Learning Model Evaluation
Model    Accuracy    F1-Score
Logistic Regression
Random Forest
Support Vector Machine (SVM)
XGBoost
LightGBM
Naive Bayes
Source: Research Result
The three best-performing models based on the test results were Logistic Regression, Random Forest, and SVM, each achieving a maximum accuracy of 97% with an F1-score of 0.
Meanwhile, XGBoost also performed very well with an accuracy of 96.5%, followed by LightGBM with a slightly lower accuracy.
Naive Bayes achieved the lowest result with an accuracy of 87.5%, although it still managed to maintain an F1-score of 0.
Source: Research Result .
Figure 6. Model Performance Comparison
Visualization of Results
This section uses a Word Cloud (to display the dominant words per sentiment category) and a bar chart to compare model performance.
A sentiment distribution analysis then shows a predominance of positive opinions.
The most frequently occurring words in each sentiment category are:
Positive: "gizi", "anak", "sehat", "gratis", "program", "pemerintah".
Negative: "tidak merata", "kurang", "masalah distribusi".
Neutral: "pelaksanaan", "kegiatan", "menu".
Source: Research Result
Figure 7. WordCloud
In addition to the WordCloud visualization above, a quantitative analysis was performed using word frequency and TF-IDF weighting.
The results showed that the words "bergizi", "gratis", "makan", "program", and "pemerintah" obtained the highest TF-IDF scores across sentiment classes.
These values indicate that the highlighted words are the most dominant and informative features distinguishing sentiment categories, thereby providing quantitative support for the visual WordCloud findings.
Source: Research Result
Figure 7. Average Engagement by Sentiment
Visualization using a bar chart shows that 65% of the public supports the program, highlighting the benefits of nutrition and education.
25% are neutral, mainly factual news about the program.
Finally, 10% are negative, with concerns about implementation, delays, and quality.
This indicates that the majority of the public has responded positively to the implementation of the MBG program, particularly in terms of improving the nutrition of schoolchildren.
Source: Research Result
Figure 8. Logistic Regression
The best model obtained from the experiments for sentiment classification was Logistic Regression.
The model performed very well in recognizing neutral sentiment, but was unable to effectively recognize negative and positive sentiment due to the imbalance in the amount of data between classes.
Further studies could combine text augmentation with balancing techniques (oversampling/undersampling), apply class weighting or focal loss, and evaluate with imbalance-sensitive metrics (macro-F1, PR-AUC).
Experiments should be repeated with multiple random seeds and stratified cross-validation to ensure the robustness and reproducibility of the results.
The per-class behavior can also be seen in Figure 9, which evaluates the model using a confusion matrix.
Source: Research Result
Figure 9. Confusion Matrix
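Of the balancing options mentioned for further study, class weighting is the simplest to sketch. The example assumes the common "balanced" heuristic, weight = n_samples / (n_classes * class_count), which is one possible scheme and not something the experiments reported here applied; the function name is illustrative.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency so rare classes count more."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }
```

For 100 illustrative labels split 65/25/10 across positive, neutral, and negative, the negative class receives the largest weight and the positive class the smallest, so errors on the rare class contribute more to a weighted training loss.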
CONCLUSION
The experimental results demonstrate that the application of Natural Language Processing (NLP) with the TF-IDF technique can transform public opinion text into effective numerical features for sentiment classification.
Logistic Regression and LightGBM models demonstrated optimal performance.
Logistic Regression excels in its ability to linearly separate classes in text data processed using TF-IDF, resulting in accurate classification.
LightGBM, on the other hand, stands out for its efficient training time and better memory utilization, especially when processing data with a large number of features.
Other algorithms, such as Random Forest and Naive Bayes, still produce quite good results, but their performance is less than optimal when faced with text data with a high level of word variation.
Overall, the analysis results indicate that public sentiment toward the Free Nutritious Meals (MBG) program tends to be positive, with the majority of opinions supporting the program as an appropriate step to improve child health and nutritional equity in schools.
This research successfully demonstrates that machine learning can be used effectively to analyze public perceptions of national-scale social policies.
REFERENCES