Jurnal SISFOKOM (Sistem Informasi dan Komputer), Volume 14, Nomor 03, PP 298-305

Unveiling Public Sentiment on Quarter Life Crisis: A Comparative Performance Evaluation of Support Vector Machine and Naïve Bayes Algorithms on Social Media X Data

Talitha Dwi Septyorini, Khothibul Umam, Maya Rini Handayani
Information Technology, UIN Walisongo, Semarang, Indonesia
Islamic Broadcasting Communication, UIN Walisongo, Semarang, Indonesia
2208096036@student., khothibul_umam@walisongo., maya@walisongo.
Abstract: Quarter Life Crisis (QLC) is one of the psychological issues experienced by many young adults and is characterized by uncertainty, anxiety, and emotional distress.
In the digital era, public opinion about QLC is increasingly expressed through social media, particularly platform X.
This study seeks to classify public opinion related to QLC into positive and negative sentiments by employing two classification models, namely Support Vector Machine (SVM) and Naïve Bayes (NB).
Despite the growing discourse, there has been no study specifically comparing classification algorithms to analyze public sentiment on QLC.
Data collection was conducted through crawling techniques on platform X from November 2024 to January 2025, resulting in a total of 1120 tweets.
The data underwent preprocessing, lexicon-based sentiment labeling, and TF-IDF word weighting.
After preprocessing, classification using SVM and NB was evaluated by accuracy, precision, recall, and F1-score.
Results indicate that SVM achieved superior performance with an accuracy of 83%, outperforming NB, which recorded 74%. These outcomes show that SVM is the better-performing algorithm for analyzing public sentiment regarding QLC.
This research contributes by providing empirical evidence regarding algorithm performance for sentiment analysis in mental health topics, offering recommendations for effective early detection strategies utilizing social media data.
Keywords: Quarter Life Crisis, Sentiment Analysis, Support Vector Machine, Naïve Bayes, Social Media

I. INTRODUCTION
Individuals who are unable to respond well to environmental problems tend to experience psychological disorders, such as confusion in the face of uncertainty, which can even result in psychological turmoil, commonly referred to as a Quarter Life Crisis (QLC).
The QLC phenomenon is one of the psychological issues that is now widely discussed among young adults.
QLC describes a period of uncertainty commonly experienced by individuals aged 20 to 30 years, characterized by confusion, anxiety, and concern about the future, career, social relationships, and self-achievement.
This condition can trigger several negative effects on mental health, such as decreased self-confidence, a tendency to isolate oneself, loss of motivation to live, and symptoms of depression due to the habit of comparing oneself excessively with others.
Given its significant impact, it is important for the public, government, and mental health service institutions to understand public perceptions and opinions related to this phenomenon. The phenomenon of Quarter Life Crisis is not [...] A 2017 LinkedIn survey found that 75% of individuals aged 25 to 30 faced such crises, driven primarily by job mismatches and social comparisons.
A study conducted in Indonesia reported similar patterns, indicating that 98% of 125 respondents experienced Quarter Life Crisis, with financial instability, feelings of unworthiness, and pressures from adult life demands identified as the dominant contributing factors.
Along with the development of digital technology, public opinion is mostly expressed through social media, so digital data-based opinion analysis is one of the strategic alternatives that can be utilized.
Many young people today prefer to express their emotions, opinions, and complaints through posts on platform X.
This platform has become one of the social media platforms most actively used by young people to express their views on various social issues, including QLC.
Features such as posting, retweeting, and commenting make platform X an effective tool for disseminating information as well as identifying public sentiment, both positive and negative.
However, the large volume of data generated and the diversity of language styles used by users make these opinions challenging to analyze. Therefore, public opinion on QLC spread across platform X can be used as an object of research through sentiment analysis, an automated process that extracts, understands, and processes text data to determine whether the sentiment towards a topic is positive, negative, or neutral.
Previous studies on QLC have predominantly employed qualitative or manual methods. To date, no research has specifically conducted a comparative performance evaluation of two widely used algorithms, Support Vector Machine (SVM) and Naïve Bayes (NB), within the context of public sentiment related to QLC on social media platforms.

p-ISSN 2301-7988, e-ISSN 2581-0588. DOI: 10.32736/sisfokom. Copyright ©2025. Submitted: May 28, 2025. Revised: June 25, 2025. Accepted: July 1, 2025. Published: July 28, 2025.

This study addresses that gap by offering the first empirical comparison in this context.
This research aims to identify public opinion related to the QLC phenomenon and determine the sentiment pattern that appears in the community, whether it is positive or negative.
In addition, this study also evaluates how well the SVM and NB algorithms perform in analyzing public opinion sentiment.
The research data was obtained from platform X through a data crawling process.
The gathered data subsequently undergoes a preprocessing phase, which involves cleaning, converting text to lowercase (case folding), tokenization, stopword removal, and stemming.
After preprocessing is complete, the data will be labeled with positive or negative sentiment using a lexicon-based dictionary.
Next, weighting is done using the Term Frequency-Inverse Document Frequency (TF-IDF) method. Following the weighting process, the data are classified using the SVM and NB algorithms.
The final stage of the research is model performance evaluation by employing a confusion matrix to calculate the accuracy, precision, and recall metrics for both algorithms.
Accordingly, this study offers a contribution to the analysis of public opinion on social media, with a specific focus on mental health issues associated with Quarter Life Crisis (QLC).
In addition to presenting an overview of the public's perception of QLC, this study also offers recommendations for selecting the most suitable algorithm for sentiment analysis on social media platforms.
The importance of this study lies in its presentation of empirical findings that can serve as a reference for academics, mental health practitioners, government agencies, and social media platform managers in developing early intervention frameworks and supporting public mental health. Additionally, this study contributes to enhancing public awareness regarding the Quarter Life Crisis.

II. LITERATURE REVIEW
The use of sentiment analysis to understand public discourse on social media has become increasingly common. In this area, machine learning techniques like Support Vector Machine (SVM) and Naïve Bayes (NB) are among the most commonly applied methods for sentiment classification.
These two algorithms have consistently demonstrated reliable performance in text classification tasks across different topics.
A previous study compared NB and SVM on a dataset containing 2956 entries related to the impact of working hours on Generation Z's mental health.
The study reported 84% accuracy for NB and 91% for SVM, indicating that SVM consistently achieved superior performance.
However, the study focused solely on work-related stress without addressing other psychological phenomena.
In addition, another study evaluated NB's performance on 1200 data entries concerning adolescent mental health, achieving 80% accuracy.
This research, however, was limited by the absence of multi-model comparisons, thus providing no insight into how other algorithms might perform under similar conditions.
Meanwhile, other research analyzed the potential for depression and anxiety through tweets using SVM, obtaining an accuracy of roughly 82%.
Although this study demonstrated the capability of SVM in handling mental health sentiment analysis on social media, it did not offer a comparative performance analysis with other algorithms. While these studies successfully demonstrated the effectiveness of both SVM and NB in sentiment analysis tasks related to mental health issues, they primarily addressed broader psychological topics such as work-related stress, adolescent mental health, and symptoms of depression and anxiety. None of these studies specifically focused on the Quarter Life Crisis phenomenon, which has become an increasingly relevant mental health issue among young adults on social media platforms.
Moreover, previous works rarely incorporated a direct comparative analysis of the performance of SVM and NB within the same sentiment classification task, especially for public sentiment related to Quarter Life Crisis.
Considering the informal and varied linguistic styles found in social media content, it is essential to evaluate how these two algorithms handle such data in a specific domain.
Therefore, this study addresses the identified research gap by empirically comparing the performance of SVM and NB algorithms in classifying public opinion on Quarter Life Crisis using data collected from social media platform X.
In addition, this study utilizes the InSet lexicon for sentiment labeling, which has not been widely integrated in previous Quarter Life Crisis sentiment studies.
III. METHODOLOGY
This research employs a comparative method to assess and differentiate the effectiveness of SVM and NB models in analyzing community opinions related to the Quarter Life Crisis phenomenon on platform X.
The model implementation in this research consists of several stages carried out in sequence, from the data collection process to the evaluation of model performance.
The research process begins with collecting data through crawling techniques on the platform X.
Then, the crawled data will go through a preprocessing process, which consists of several steps, namely cleaning, case folding, tokenization, stopwords removal and stemming.
After the data is processed, a sentiment labeling stage is performed using the lexicon method to categorize the data based on whether the sentiment expressed is positive or negative.
The next stage is feature extraction by applying the TF-IDF method. The weighted data is then classified using SVM and NB techniques.
Furthermore, classification outcomes are assessed through a confusion matrix to derive the metrics of accuracy, precision, recall, and F1-score, which reflect how well each model performs.
As a complement to the quantitative evaluation process, this study also visualizes sentiment data by presenting it through a word cloud representation.
This visualization illustrates a group of frequently occurring terms found within both positive and negative tweets, providing an overview of dominant word patterns in discussions surrounding the Quarter Life Crisis as expressed by users on platform X.
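A word cloud is, at its core, a rendering of term frequencies. The counting step behind such a visualization can be sketched as follows; this is a minimal illustration, and the token lists and the helper `top_terms` are hypothetical rather than taken from the study's dataset or code.

```python
from collections import Counter

# Hypothetical, already-preprocessed tweets grouped by sentiment label.
tokens_by_label = {
    "positive": [["semangat", "hidup", "semangat"], ["hidup", "tenang"]],
    "negative": [["bingung", "capek"], ["bingung", "gagal", "bingung"]],
}

def top_terms(token_lists, k=3):
    """Count term occurrences across all tweets and return the k most frequent."""
    counts = Counter(token for tokens in token_lists for token in tokens)
    return counts.most_common(k)

for label, token_lists in tokens_by_label.items():
    print(label, top_terms(token_lists))
```

A plotting library would then scale each word's font size by its count; that rendering step is omitted here.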
The overall research procedure is illustrated in Figure 1.
Fig 1. Research Procedure

Data Collection
In the data collection process, two kinds of methods can be used, namely manual and automated methods. Manual methods copy data directly from particular sources by hand, while automated methods use code, applications, or browser extensions to collect data more quickly and efficiently. This study adopts an automated data collection method, using a Python library and crawling techniques through Google Colab. The dataset comes from public comments and opinions on platform X, collected from November 2024 to January 2025.

Preprocessing Data
At this phase, data preprocessing is performed as a preliminary step prior to sentiment analysis. Preprocessing transforms raw, unorganized text into an organized structure that enables effective analysis. This stage aims to prepare the text data by cleaning and normalizing the dataset so that its quality is suitable for the classification process. Preprocessing starts with a cleaning step that removes non-essential components from the dataset, including symbols, punctuation marks, emojis, URLs, mentions, and numbers. Subsequently, case folding converts every character in the dataset into lowercase, ensuring a consistent text format. Following this, the cleaned data undergoes tokenization, a process that segments the text into smaller units, typically individual words. The procedure proceeds with the removal of stopwords, which are frequently occurring words that carry minimal significance in sentiment analysis. The last stage is stemming, which converts affixed words into their base form using Indonesian language algorithms. In this study, the Sastrawi library was employed for stemming, as it is a widely used Indonesian stemmer capable of removing affixes in Bahasa Indonesia. Together, these preprocessing stages produce a cleaner, more structured dataset, ready for the subsequent sentiment analysis stage.

Data Labeling
Data labeling is one of the important stages in preparing raw data before it is used to train machine learning models. In the sentiment analysis model training process, each tweet needs to be labeled with a positive or negative sentiment as reference data from which the model learns classification patterns. This stage annotates the data with meaningful labels so that the model can recognize, understand, and classify new data based on the patterns it has learned. The sentiment labeling process in this study is conducted with a lexicon-based approach using the InSet Indonesian Sentiment Lexicon.
InSet is an open-source sentiment lexicon specifically compiled for Bahasa Indonesia, consisting of categorized positive and negative word lists.
Each word in the dataset was matched against the InSet lexicon, and a sentiment value was assigned by analyzing how often positive and negative terms appeared within each tweet.
The final sentiment category was determined by comparing the total positive and negative scores for each text.
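The lexicon-based labeling described above can be sketched as follows. This is a minimal illustration under stated assumptions: the two small dictionaries and their weights are hypothetical stand-ins for the InSet word lists, and the rule that a score above zero is positive and anything else is negative follows the threshold reported with the labeling results in this paper.

```python
# Hypothetical mini-lexicon; the real InSet word lists are far larger
# and these weights are illustrative only.
POSITIVE_WORDS = {"semangat": 3, "indah": 4, "cinta": 4}
NEGATIVE_WORDS = {"bingung": -3, "parah": -4, "capek": -3}

def lexicon_score(tokens):
    """Sum the weights of all positive and negative lexicon hits in a tweet."""
    positive_total = sum(POSITIVE_WORDS.get(token, 0) for token in tokens)
    negative_total = sum(NEGATIVE_WORDS.get(token, 0) for token in tokens)
    return positive_total + negative_total

def label(tokens):
    """Positive if the score is above zero, negative otherwise (including zero)."""
    return "positive" if lexicon_score(tokens) > 0 else "negative"

print(label(["serba", "bingung"]))    # one negative hit
print(label(["semangat", "hidup"]))   # one positive hit
```

Note that a tweet with no lexicon hits scores zero and is therefore labeled negative under this rule, which is one reason a neutral class may be worth considering.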
Word Weighting
Term weighting is the process of assigning a weight to each word in the text data to improve the performance of sentiment analysis in the text mining process.
The weighting method applied in this study is TF-IDF.
TF-IDF is a common method for feature extraction and word weighting that determines a word's significance within a dataset by evaluating its frequency both in specific documents and across the whole corpus.
This study employed TF-IDF because it emphasizes significant terms by accounting for their occurrence in individual documents and their infrequency throughout the full corpus. This characteristic makes TF-IDF particularly suitable for processing the short and informal texts commonly found on social media platforms.
The TF-IDF method works by combining two key elements, namely Term Frequency (TF) and Inverse Document Frequency (IDF).
TF is computed by counting how many times a given term occurs within a single text sample, whereas IDF reflects how uncommon that word is across the whole set of documents.
Multiplying the two components produces the final weight for each word. The TF-IDF weighting calculation can be formulated as Equation 1, where $w_i$ represents the i-th word, $d$ denotes the document, $TF(w_i, d)$ indicates the frequency of word $w_i$ within document $d$, and $IDF(w_i)$ represents the inverse document frequency of the term $w_i$.

$$TF\text{-}IDF(w_i, d) = TF(w_i, d) \times IDF(w_i) \tag{1}$$

The IDF value is calculated using Equation 2, where $N$ represents the total number of documents and $DF(w_i)$ indicates how many documents include the term $w_i$.

$$IDF(w_i) = \log\frac{N}{DF(w_i)} \tag{2}$$
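To make the weighting concrete, the sketch below computes TF-IDF by hand on a toy corpus, taking TF as the raw term count and IDF as the logarithm of N divided by DF. The corpus, the function names, and the choice of the natural logarithm are assumptions for illustration; the paper does not state the log base, and library implementations often apply smoothed variants that yield different values.

```python
import math
from collections import Counter

# Hypothetical toy corpus of tokenized tweets.
docs = [
    ["quarter", "life", "crisis", "bingung"],
    ["quarter", "life", "crisis", "capek", "capek"],
    ["semangat", "hidup"],
]

def tf(term, doc):
    """Raw count of the term inside one document."""
    return Counter(doc)[term]

def idf(term, docs):
    """log(N / DF); returns 0.0 for a term that appears in no document."""
    df = sum(1 for doc in docs if term in doc)
    if df == 0:
        return 0.0
    return math.log(len(docs) / df)

def tf_idf(term, doc, docs):
    """Weight of a term in a document: TF multiplied by IDF."""
    return tf(term, doc) * idf(term, docs)

# "capek" occurs twice in one document only, so it outweighs "quarter",
# which occurs once in each of two documents.
print(tf_idf("capek", docs[1], docs))
print(tf_idf("quarter", docs[1], docs))
```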
Model Analysis
The analysis of the algorithmic models involves a detailed comparison between SVM and NB, each evaluated independently, with the objective of determining which model classifies the sentiment data most accurately.

Support Vector Machine (SVM)
SVM is among the most frequently used algorithms in machine learning, especially for classification and regression problems. In classification, SVM can perform well even on limited datasets. The algorithm functions as a binary classifier that divides data into two classes using a dividing line or hyperplane. The hyperplane is positioned to maximize the distance between itself and the nearest data points of the opposing classes, a distance commonly known as the margin. Points located precisely on the margin boundary are known as support vectors. The hyperplane in SVM is expressed in Equation 3.

$$w \cdot x + b = 0 \tag{3}$$

In this formulation, $w$ signifies the weight vector, $x$ indicates the feature vector, and $b$ serves as the bias value.
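Once a hyperplane is known, classification reduces to checking on which side of it a feature vector falls. The sketch below illustrates only that decision rule and the point-to-hyperplane distance; the weights, bias, and feature vectors are hypothetical, and the training step that finds the margin-maximizing w and b is not shown.

```python
import math

def decision_value(w, x, b):
    """Evaluate w . x + b for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, x, b):
    """Assign a class by the sign of the decision value."""
    return "positive" if decision_value(w, x, b) >= 0 else "negative"

def distance_to_hyperplane(w, x, b):
    """Geometric distance |w . x + b| / ||w|| from a point to the hyperplane."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, x, b)) / norm

# Hypothetical trained parameters for a two-feature problem.
w = [3.0, -4.0]
b = 1.0

for x in ([1.0, 0.0], [0.0, 1.0]):
    print(x, predict(w, x, b), distance_to_hyperplane(w, x, b))
```

In a trained SVM, the support vectors are the training points with the smallest such distance.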
Naïve Bayes (NB)
NB is a commonly applied classification technique in text mining, especially for sentiment analysis. This method is based on Bayes' Theorem.
The method has good potential in terms of accuracy and computational efficiency.
One notable advantage of Naïve Bayes is its capability to perform classification effectively even with limited training data, while still maintaining the accuracy of the parameter estimates needed for the classification task.
The basic equation of Bayes' Theorem can be written as Equation 4.

$$P(H \mid X) = \frac{P(X \mid H)\,P(H)}{P(X)} \tag{4}$$

In this equation, $P(H \mid X)$ indicates the updated (posterior) probability of hypothesis $H$ given evidence $X$.
The component $P(X \mid H)$ reflects the chance of encountering evidence $X$ assuming hypothesis $H$ is valid.
Additionally, $P(H)$ expresses the initial belief or prior estimate of hypothesis $H$, while $P(X)$ describes the marginal likelihood of observing $X$ independently.
These probabilities are computed to identify the class that most likely corresponds to the input data, based on the distribution patterns learned from the training dataset.
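The sketch below shows one way Bayes' Theorem can be turned into a text classifier. It is an illustration under stated assumptions, not the study's implementation: it uses the multinomial variant with add-one (Laplace) smoothing, neither of which the paper specifies, and the training samples are hypothetical. P(X) is omitted because it is the same for every class and does not change which class scores highest.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """Collect class counts, per-class word counts, and the vocabulary.

    samples: list of (tokens, label) pairs.
    """
    class_counts = Counter(label for _, label in samples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(tokens, model):
    """Return the class with the highest log P(H) + sum of log P(word | H)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)            # log prior P(H)
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocab:
                continue                                  # ignore unseen words
            # Add-one smoothing avoids zero probabilities (an assumption here).
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data.
samples = [
    (["bingung", "capek"], "negative"),
    (["gagal", "bingung"], "negative"),
    (["semangat", "hidup"], "positive"),
    (["hidup", "tenang"], "positive"),
]
model = train_nb(samples)
print(predict_nb(["bingung"], model))
print(predict_nb(["hidup", "semangat"], model))
```

The per-word product is where the independence assumption enters: each word contributes to the score without regard to its neighbors.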
Model Evaluation
To determine the effectiveness of the classification approach, performance is assessed through a set of measurement indicators, including accuracy, precision, recall, and a confusion matrix, which collectively reflect the model's ability to achieve its classification objectives.
This evaluation produces a confusion matrix that describes the distribution between correct and incorrect predictions, as well as a classification report that contains precision, recall, and F1-score values.
Precision quantifies the proportion of positive outputs generated by the model that genuinely fall into the positive class, reflecting the reliability of positive classification outcomes.
Meanwhile, recall refers to the model's capacity to correctly detect all occurrences of the actual positive class within the dataset, highlighting its sensitivity.
The F1-score is a unified measure that merges precision and recall through their harmonic mean, offering a better overall assessment of model behavior by reflecting both its precision in minimizing false positives and its effectiveness in capturing true positives.
Evaluation through the confusion matrix involves comparing the model's predicted outcomes with the true labels by categorizing results into four groups: true positive (TP), true negative (TN), false positive (FP), and false negative (FN).
Based on the values of these components, the accuracy, precision, and recall values are calculated using Equations 5, 6, and 7.

$$\text{Accuracy} = \frac{TP + TN}{\text{Total}} \tag{5}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{6}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{7}$$

In Equation 5, accuracy is determined by the proportion of correct predictions, comprising true positives (TP) and true negatives (TN), relative to the overall count of evaluated data. Equation 6 defines precision as the ratio between true positive outcomes (TP) and the sum of all instances labeled as positive by the model, comprising both true positives and false positives (TP + FP).
In contrast, Equation 7 quantifies recall by calculating the proportion of correctly identified positive cases (TP) in relation to the total count of actual positives present in the dataset (TP + FN).
These evaluation indicators serve to examine the reliability and effectiveness of each method in sentiment classification.
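The metrics above follow directly from the four confusion-matrix counts, as the following sketch shows. The counts are hypothetical and are not the study's results; the total is taken as TP + TN + FP + FN, and the F1-score is computed as the harmonic mean of precision and recall.

```python
def evaluate(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for a 100-sample test set.
metrics = evaluate(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The function assumes at least one predicted positive and one actual positive; otherwise precision or recall would divide by zero.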
IV. RESULTS AND DISCUSSION

Data Collection
At this stage, data is collected using the crawling method to retrieve public opinions related to the Quarter Life Crisis phenomenon on platform X.
Through the crawling process, this study successfully obtained 1120 tweets containing user comments and opinions.
All collected data is then stored as a dataset for preprocessing and sentiment analysis.
Table I presents the results of the data collection process obtained through crawling techniques on platform X.
TABLE I. SAMPLE COMMENT DATA
No. | Comment
1 | @daesxie memasuki life quarter crisis serba bingung mau ngapain
2 | besoknya tanggal merah tapi malamnya lembur pulang2 ketiduran di mess sampai sore part jalam2nya manaa?indahnya fase quarter life crisis
3 | ternyata quarter life crisis itu bener bener ada ya ges ya
4 | quarter life crisis ternyata parah bgt yah
5 | Jatuh cinta di usia quarter life crisis semenyeramkan dan secapek ini

Preprocessing Data
As outlined in the research methods section, the preprocessing phase in this study includes several key steps, specifically data cleaning, case folding, tokenization, stopword removal, and stemming.
This stage focuses on refining and organizing textual content to ensure it is well structured and suitable for subsequent sentiment analysis.
After going through the preprocessing stage, the amount of data was slightly reduced.
Of the 1120 entries initially collected, 1 entry was deleted because it did not meet the eligibility criteria, such as being blank or containing only symbols, URLs, or irrelevant characters.
Therefore, a total of 1119 data entries were retained for the subsequent analytical stage. Table II displays the result of the preprocessing process that has been carried out.
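The preprocessing steps and the removal of entries left empty after cleaning can be sketched as follows. This is a minimal illustration under stated assumptions: the regular expressions, the tiny stopword set, and the second raw entry are hypothetical, and stemming is left as a pluggable `stem` function (identity by default) where a stemmer such as Sastrawi's would be supplied.

```python
import re

# Hypothetical stopword set; a real Indonesian list is much longer.
STOPWORDS = {"mau", "di", "dan", "itu", "ya"}

def clean(text):
    """Remove URLs, mentions, and every non-letter character."""
    text = re.sub(r"https?://\S+", " ", text)    # URLs
    text = re.sub(r"@\w+", " ", text)            # mentions
    text = re.sub(r"[^A-Za-z\s]", " ", text)     # digits, punctuation, symbols, emojis
    return re.sub(r"\s+", " ", text).strip()

def preprocess(text, stem=lambda word: word):
    """Cleaning, case folding, tokenization, stopword removal, then stemming."""
    tokens = clean(text).lower().split()
    tokens = [token for token in tokens if token not in STOPWORDS]
    return [stem(token) for token in tokens]

raw = [
    "@daesxie memasuki life quarter crisis serba bingung mau ngapain",
    "https://t.co/xyz 123 !!!",   # hypothetical entry that becomes empty
]
processed = [preprocess(text) for text in raw]
kept = [tokens for tokens in processed if tokens]   # drop entries left empty

print(processed[0])
print(len(raw), "->", len(kept))
```

An entry consisting only of a URL, digits, and symbols yields an empty token list and is dropped, which mirrors how a dataset can shrink slightly after preprocessing.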
TABLE II. DATA PREPROCESSING RESULTS
Preprocessing | Result
Dataset | @daesxie memasuki life quarter crisis serba bingung mau ngapain
Cleaning | memasuki life quarter crisis serba bingung mau …
Case Folding | memasuki life quarter crisis serba bingung mau …
Tokenization | memasuki, life, quarter, crisis, serba, bingung, mau, …
Stopwords | memasuki, life, quarter, crisis, serba, bingung, …
Stemming | masuk, life, quarter, crisis, serba, bingung, ngapai…

Data Labeling
The collected data then goes through a sentiment labeling stage to classify each comment into positive or negative sentiment. Labeling in this study is performed by applying a lexicon-based method, where each comment is given a score based on the count of sentiment-bearing words classified as positive or negative.

TABLE III. LABELING RESULTS
No. | Comment | Score | Sentiment
1 | @daesxie memasuki life quarter crisis serba bingung mau ngapain | | Negative
2 | besoknya tanggal merah tapi malamnya lembur pulang2 ketiduran di mess sampai sore part jalam2nya manaa?indahnya fase quarter life crisis | | Negative
3 | ternyata quarter life crisis itu bener bener ada ya ges ya | | Positive
4 | quarter life crisis ternyata parah bgt yah | | Negative
5 | Jatuh cinta di usia quarter life crisis semenyeramkan dan secapek ini ternyata | | Positive

Table III presents the sentiment labeling results obtained from the score calculation using the lexicon formula.
Comments are categorized as positive sentiment if the score is > 0, and negative if the score is ≤ 0.
Out of a total of 1119 processed data points, 651 comments were classified as negative, while 468 comments were categorized as positive.

Support Vector Machine (SVM) Classification
The subsequent phase of this study classifies the data using the SVM technique.
The SVM model is implemented with an 80:20 data division scheme, employing 80% of the dataset to train the model and reserving 20% for testing.
Figure 2 displays the classification results obtained from applying the SVM algorithm to the test data.
Based on these results, 79 comments were classified into the positive sentiment category, while 145 comments were classified into the negative sentiment category.

Fig 2. SVM Classification Results

Naïve Bayes (NB) Classification
Besides the SVM algorithm, the data is also classified using the Naïve Bayes (NB) method.
As in the previous stage, the data is divided 80:20, with 80% of the data designated for training the model and the other 20% used to test its performance.
Figure 3 shows the classification results obtained from applying the NB algorithm to the test data.
Based on the classification results, 61 comments are categorized as positive sentiments, while 163 comments fall into the negative sentiment category.

Fig 3. NB Classification Results

Model Evaluation
After the classification process is completed, the next step involves assessing the model's performance to evaluate the accuracy and effectiveness of the implemented algorithm.
This assessment uses a confusion matrix to derive the accuracy, precision, recall, and F1-score metrics for every classification model. The performance of the SVM model is illustrated in Figures 4 and 5, whereas the performance of the NB model is presented in Figures 6 and 7.

Fig 4. SVM Confusion Matrix Results
Fig 5. SVM Evaluation Results
Fig 6. NB Confusion Matrix Results
Fig 7. NB Evaluation Results

Referring to the performance assessment, SVM produced an accuracy score of 83%, with corresponding precision, recall, and F1-score values of 83%, 82%, and 83%, respectively.
Conversely, the NB method recorded an accuracy of 74%, with corresponding precision, recall, and F1-score values of 76%, 71%, and 71%.

Word Cloud Visualization
As explained in the Methodology section, this study produced two word clouds that illustrate the frequency of word occurrences in comments with positive and negative sentiments.
Figure 8 displays a word cloud of frequently appearing terms from tweets identified as having positive sentiment.

Fig 8. Positive Word Cloud

Figure 8 shows the dominant words that appear in tweets with positive sentiments related to Quarter Life Crisis (QLC).
The words reflect various expressions that contain positive and constructive meanings.
To strengthen the analysis, Table IV displays several examples of positive sentences extracted from the dataset.

TABLE IV. SAMPLE POSITIVE SENTENCES EXTRACTED FROM THE DATASET
Sentences in the dataset
ternyata quarter life crisis itu beneran ada dan nyata. kayak rawan hilang semangat hopeless no energy feeling lonely tiba-tiba semangat tapi tiba tiba hilang semangat lagi.
Apa sih Quarter-Life Crisis itu? QLC adalah fase di mana kita merasa stuck antara harapan vs kenyataan hidup. Banyak yang mempertanyakan: Aku ini siapa? Mau jadi apa di masa depan? Kenapa pencapaianku nggak segemilang mereka? Kalian nggak sendirian kok. #QuarterLifeCrisis

In addition, Figure 9 illustrates a word cloud highlighting the most frequently occurring words in data categorized as having negative sentiment.

Fig 9. Negative Word Cloud

Figure 9 represents the dominant words in tweets that show negative sentiments toward the QLC phenomenon.
The words that appear reflect expressions with unfavorable or critical content.
Table V provides several examples of negative sentences from the dataset.
TABLE V. SAMPLE NEGATIVE SENTENCES EXTRACTED FROM THE DATASET
Sentences in the dataset
Part tersedih di quarter life of crisis adalah minta maaf ke ibu karena merasa gagal huhu maaf ya bu
Udah tamat 2 bln nih tp blm sampe fase quarter life crisis ngadep kanan ada yg lolos bumn ngadep kiri cpns ngadep depan ada yg udh nikah tp blm ada yg lolos pcpm/pcs/bps sih

Discussion
The classification performance evaluation demonstrates that the Support Vector Machine (SVM) algorithm outperforms Naïve Bayes (NB) in classifying public sentiment related to Quarter Life Crisis (QLC).
Compared to NB, the SVM model produced superior results across all performance metrics, including accuracy, precision, recall, and F1-score.
This superiority is primarily attributed to SVM's strength in handling sparse and high-dimensional data efficiently, a common trait of short, informal text content on social media platforms like X.
The hyperplane-based classification mechanism in SVM separates classes by maximizing the margin between the positive and negative sentiment groups, leading to better generalization and reduced misclassification.
In contrast, the lower performance of NB can be linked to its assumption of word independence within a document, which is less applicable in the context of sentiment analysis on social media. Tweets often contain contextually dependent phrases, colloquial expressions, and abbreviations, reducing the effectiveness of NB's probabilistic classification approach.
These findings are consistent with previous studies, which reported similar trends in which SVM achieved superior accuracy in sentiment classification tasks compared to NB.
The implications of these results highlight that SVM holds significant potential for supporting sentiment-based monitoring systems, particularly in identifying public expressions of mental health issues through social media.
This model could be integrated into digital early detection frameworks for mental health services or governmental social media monitoring initiatives to track negative public sentiment trends associated with Quarter Life Crisis.
Additionally, the insights gained from this study can assist practitioners and policymakers in formulating targeted interventions and mental health awareness campaigns that are responsive to the sentiments expressed by the public.
Future studies are encouraged to investigate deep learning approaches, including models like LSTM and BERT, which have demonstrated superior performance in handling contextual and sequential text data.
These models are expected to capture the contextual nuances and dependencies between words more effectively than conventional machine learning algorithms.
Furthermore, future studies should consider expanding the sentiment categories beyond binary classification by including a neutral sentiment class, providing a more comprehensive analysis of public opinion dynamics related to Quarter Life Crisis.
V. CONCLUSION
The sentiment analysis conducted in this study centers on public opinion concerning the Quarter Life Crisis (QLC) phenomenon, utilizing the SVM and NB classification methods.
Both methods are tested to measure their effectiveness in categorizing opinion-based data.
The data was obtained through a crawling process from platform X, totaling 1120 tweets. After the preprocessing stage, 1119 entries were successfully cleaned and ready for analysis, comprising 651 negative comments and 468 positive comments.
In the test with an 80:20 data split, the SVM method classified 145 comments as negative and 79 comments as positive. Meanwhile, the NB method classified 163 comments as negative and 61 comments as positive.
The performance assessment of both classification methods indicates that the SVM algorithm achieved an accuracy of 83%, with a precision of 83%, recall of 82%, and F1-score of 83%.
Conversely, the NB algorithm recorded an accuracy of 74%, precision of 76%, recall of 71%, and an F1-score of 71%.
Findings from the evaluation indicate that the SVM approach performs more effectively than the NB technique in categorizing sentiment from public discourse related to the QLC phenomenon.
This study contributes by providing empirical evidence of the comparative performance between SVM and NB algorithms in sentiment analysis on mental health-related topics, specifically Quarter Life Crisis discussions on social media.
These research outcomes can be utilized as a reliable source for academic reference and development, mental health practitioners, and policy-makers in designing early detection strategies and digital mental health monitoring systems.
However, this study is limited by its reliance on a single social media platform and the use of a lexicon-based labeling approach, which may not fully capture the nuanced meanings in informal and context-dependent tweets.
Future studies are advised to broaden data sources by including various social media platforms and to investigate more sophisticated NLP techniques, such as word embeddings and deep learning models. Additionally, adding a neutral sentiment category may lead to a deeper and more complete analysis of how public opinion is shaped around the Quarter Life Crisis.
REFERENCES