OPEN ACCESS ISSN 2356-5462 http://socj. id/ijoict/
Intl. Journal on ICT Vol. No. Jun 2024. doi: doi. org/10. 21108/ijoict.

Application of SSA Decomposition in LSTM and GRU Models for Crude Oil Price Forecasting

Annisa Martina 1*, Irwan Girana 2, Rini Cahyandari 3
Mathematics Department, Faculty of Science and Technology, UIN Sunan Gunung Djati Bandung
Jalan A. Nasution 105, Bandung, Indonesia
*annisamartina@uinsgd.
Received on 22 Nov 2023. Revised on 22 Dec 2023. Accepted and Published on 30 Jun 2024.

Abstract
The world price of crude oil strongly influences the rate of inflation and the rate of economic growth; therefore, methods that forecast future prices with high accuracy are indispensable. To improve forecasting accuracy, various forecasting methods continue to be developed, for example the hybrid model, which can improve on the accuracy of a single model. The hybrid Singular Spectrum Analysis (SSA)-Long Short-Term Memory (LSTM) and hybrid Singular Spectrum Analysis (SSA)-Gated Recurrent Unit (GRU) models combine the concept of decomposition with forecasting. Hybrid forecasting works in two stages. First, SSA breaks the data down into trend, seasonal, noise, and residue components. Second, the decomposed data are predicted using LSTM and GRU models. This hybrid model has been shown to improve forecasting accuracy. Using weekly data on world crude oil opening prices, SSA-LSTM improves forecasting accuracy by 78% over the single LSTM forecasting model. This is seen from the RMSE and MAPE values of 4.36 and 5.2%, respectively, changing to 0.97 and 1. Likewise, SSA-GRU improves forecasting accuracy by 79% over the single GRU forecasting model. This is seen from the RMSE and MAPE values of 4.86 and 6.33%, respectively, changing to 1.01 and 1.

Keywords: Forecasting, Singular Spectrum Analysis, Long Short-Term Memory, Gated Recurrent Unit, SSA-LSTM, SSA-GRU

I. INTRODUCTION
Crude oil is a dark-colored liquid formed from the remains of dead organisms that were buried in layers of soil and rock millions of years ago. It can be processed into energy sources such as vehicle fuel and fuel for power plants, and is also referred to as petroleum. Crude oil prices often spike on international markets because oil plays a major role in driving the economy. Crude oil prices strongly influence the rate of inflation and the rate of economic growth. Therefore, methods that forecast future prices with high accuracy are indispensable.

Forecasting is an activity that predicts future events. Many classical methods are used in time series forecasting, such as Autoregressive (AR), Moving Average (MA), and Autoregressive Integrated Moving Average (ARIMA). Some classical forecasting methods rely on the assumption of stationarity, meaning that the statistical properties of the data remain constant over time. However, in many situations the data are non-stationary, for example when they show significant trends or random fluctuations. These methods also assume a linear relationship. If the data to be forecast have nonlinear properties and complex patterns, these methods may not provide accurate and adequate results. They are also susceptible to outliers, namely extreme data points that differ significantly from the rest. Some forecasting methods assume that the data follow a certain distribution, such as the normal distribution; if the data do not match this assumption, the resulting forecast is inaccurate. These methods therefore have limitations. Over time, other forecasting methods were introduced.
One method that is currently popular is the Artificial Neural Network (ANN) forecasting method. ANN can model linear or nonlinear data and can learn complex relationships between input and output that are difficult for classical forecasting methods to capture. It also works well on large datasets with many input variables. In some cases, the resulting forecasts are more accurate than those of classical forecasting methods. Examples of forecasting methods that use the ANN concept are LSTM and GRU. K. ArunKumar et al. stated that the ARIMA model is suitable and performs well in modeling linear data, whereas Recurrent Neural Network (RNN)-based models such as LSTM and GRU perform better on data that tend to be nonlinear. In the research of Oxaichiko Arissinta, I., et al., forecasting with the ARIMA method gave a larger error than forecasting with LSTM and GRU.

Apart from ANN, a newer forecasting approach was introduced, namely the hybrid model. A hybrid model combines two or more different forecasting methods. The aim is to exploit the advantages of each model so that the limitations of a single forecasting model can be overcome. Combining different approaches can improve on the results of a single forecast. Some data are complex, containing nonlinear patterns and unstable fluctuations, and hybrid models make it possible to deal with this complexity better. Hybrid models can also be tailored to specific needs and data characteristics, so the most suitable modeling approach can be chosen for each situation. An example of hybrid forecasting is the combination of decomposition with forecasting. The principle of decomposition is to separate a data series into several simpler parts.
This separation is useful for increasing forecasting accuracy and helps in understanding the behavior of the data series. SSA is a non-parametric data analysis method that separates time series data into several simpler components such as trend, cycle, seasonality, and random elements. It does not depend on the assumptions of classical time series analysis, such as stationarity, and does not require logarithmic transformations. If the decomposed data contain a strong trend component, that component is not suitable for modeling with ARIMA because it is non-stationary, so forecasting with an SSA-ARIMA model is not very suitable. Qi Tang et al. stated that forecasting using the hybrid SSA-LSTM model had better accuracy than other models. In the research of K. Wang et al., an SSA-LSTM model was built for each intrinsic mode function (IMF) component and one residual (RES) component, which effectively extracted the predicted values. According to Y. Yang et al., the SSA-GRU model has a stronger predictive effect, which increases the reliability of electric load forecasts. Chao Li et al. stated that the hybrid SSA-GRU forecasting model gave quite good forecasting results compared to the single model. Therefore, based on the problems explained above, this research proposes hybrid SSA-LSTM and SSA-GRU forecasting models that can overcome the limitations and improve the forecasting results of a single forecasting model, especially for nonlinear data. It is hoped that this research can serve as a reference for predicting time series data better.

II. LITERATURE REVIEW

Singular Spectrum Analysis (SSA)
SSA is a time series analysis technique used for forecasting that combines elements of classical time series analysis, multivariate geometry, multivariate statistics, dynamic systems, and signal processing. The aim of SSA is to decompose data into a small number of identifiable components such as trend, seasonality, and noise.

Embedding: Embedding converts the one-dimensional time series into a multidimensional trajectory (path) matrix $X$. Consider a time series $Y = (y_1, \ldots, y_N)$ of length $N$. Assume $N > 2$ and that $Y$ is a nonzero series, that is, there is at least one $i$ such that $y_i \neq 0$. The trajectory matrix has dimension $L \times K$, where $L$ is the window length, an integer with $2 < L < N$, and $K = N - L + 1$. The trajectory matrix $X$ is a Hankel matrix because all elements along each anti-diagonal are equal:

$$X = [x_{ij}]_{i,j=1}^{L,K} = \begin{bmatrix} y_1 & y_2 & \cdots & y_K \\ y_2 & y_3 & \cdots & y_{K+1} \\ \vdots & \vdots & \ddots & \vdots \\ y_L & y_{L+1} & \cdots & y_N \end{bmatrix}$$

Singular Value Decomposition (SVD): SVD separates the components of the decomposed time series. Given $S = XX^T$, let $\lambda_1, \ldots, \lambda_L$ be the eigenvalues of $S$ with $\lambda_1 \geq \cdots \geq \lambda_L \geq 0$, and let $U_1, \ldots, U_L$ be the eigenvectors of $S$ corresponding to these eigenvalues. Let $d = \max\{i : \lambda_i > 0\}$. If $V_i = X^T U_i / \sqrt{\lambda_i}$ for $i = 1, 2, \ldots, d$, then the SVD of the trajectory matrix $X$ is

$$X = X_1 + X_2 + \cdots + X_d$$

where $X_i = \sqrt{\lambda_i}\, U_i V_i^T$. The set $(\sqrt{\lambda_i}, U_i, V_i)$ is called the $i$-th eigentriple of the SVD.

Grouping: This step groups the matrices $X_i$ in order to separate the eigentriples into several subgroups, namely trend, seasonality, and noise. The index set $\{1, 2, \ldots, d\}$ is partitioned into $m$ mutually exclusive subsets $I_1, I_2, \ldots, I_m$ with $m \leq d$. For a group $I = \{i_1, i_2, \ldots, i_p\}$, the matrix $X_I$ corresponding to group $I$ is defined as $X_I = X_{i_1} + X_{i_2} + \cdots + X_{i_p}$. Computing these matrices for $I = I_1, I_2, \ldots, I_m$, the decomposition $X = X_1 + X_2 + \cdots + X_d$ can be expanded to become:

$$X = X_{I_1} + X_{I_2} + \cdots + X_{I_m}$$

Selecting the sets $I = \{I_1, I_2, \ldots, I_m\}$ is called eigentriple grouping. Group members can be determined from the scatter plot of $X_I$, where components with almost the same shape are placed in the same group.

Diagonal Averaging: The final step converts each matrix $X_{I_j}$ of the grouped decomposition into a new series of length $N$. Let $Y$ be an $L \times K$ matrix with elements $y_{ij}$, $1 \leq i \leq L$, $1 \leq j \leq K$. Let $L^* = \min(L, K)$, $K^* = \max(L, K)$, $N = L + K - 1$, and let $y^*_{ij} = y_{ij}$ if $L < K$ and $y^*_{ij} = y_{ji}$ if $L > K$.

$$Y = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1K} \\ y_{21} & y_{22} & \cdots & y_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ y_{L1} & y_{L2} & \cdots & y_{LK} \end{bmatrix}$$

Matrix $Y$ is converted into the series $g_0, g_1, \ldots, g_{N-1}$ in the following way:

$$g_k = \begin{cases} \dfrac{1}{k+1} \displaystyle\sum_{m=1}^{k+1} y^*_{m,\,k-m+2} & 0 \leq k < L^* - 1 \\[2ex] \dfrac{1}{L^*} \displaystyle\sum_{m=1}^{L^*} y^*_{m,\,k-m+2} & L^* - 1 \leq k < K^* \\[2ex] \dfrac{1}{N-k} \displaystyle\sum_{m=k-K^*+2}^{N-K^*+1} y^*_{m,\,k-m+2} & K^* \leq k < N \end{cases}$$

For $k = 0$, $g_0 = y_{1,1}$; for $k = 1$, $g_1 = (y_{1,2} + y_{2,1})/2$; and so on.

Long Short-Term Memory (LSTM)
LSTM is a variation of RNN. LSTM was developed specifically to overcome the difficulty of capturing long-term context in time series data. Fig. 1 illustrates the LSTM architecture. LSTM can capture complex nonlinear relationships in time series data, making more accurate forecasts possible. The long-term memory cells in LSTM can "remember" important information from the past that can influence future forecasts, so they can learn and retain long-term patterns in the data. LSTM uses gates to control the flow of information in the network, which helps with the vanishing gradient and exploding gradient problems. LSTM has four main structures:

Forget Gate: Determines what information will be removed from the cell state.
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$$

Input Gate: Determines which part will be updated.
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$$
$$\tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$$

Cell State: Stores important information, which is accessed when the network sees new data.
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{C}_t$$

Output Gate: Controls how the information in the cell state is used to produce the output.
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$$
$$h_t = o_t \odot \tanh(c_t)$$

Fig. 1. LSTM Architecture.

Gated Recurrent Unit (GRU)
GRU is a simpler variation of LSTM. GRU uses many of the same concepts but has a much simpler structure, so its hidden layers can be trained more quickly. The purpose of the GRU is to select correctly the information that will be used in making decisions, because not all information from the past is useful. Fig. 2 illustrates the GRU architecture. GRU can capture long-term patterns in time series data, so it can deal with forecasting involving complex patterns. The GRU gate mechanism organizes information in the network well, so the network can learn long-term dependencies and avoid vanishing and exploding gradients. GRU has two main structures:

Update Gate: Decides how much past information will be kept.
$$z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)$$

Reset Gate: Determines how new input information is combined with past information.
$$r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)$$
$$\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)$$
$$h_t = (1 - z_t) \odot \tilde{h}_t + h_{t-1} \odot z_t$$

Fig. 2. GRU Architecture.

Where:
$f_t$ : forget gate
$i_t$ : input gate
$\tilde{C}_t$ : candidate cell state (LSTM)
$\tilde{h}_t$ : candidate hidden state (GRU)
$c_t$ : cell state
$o_t$ : output gate
$h_t$ : hidden state
$z_t$ : update gate
$r_t$ : reset gate
$x_t$ : data input
$\sigma$ : sigmoid function
$\tanh$ : tanh function
$W$ : weight
$U$ : recurrent weight
$b$ : bias

Hybrid SSA-LSTM and Hybrid SSA-GRU
The hybrid SSA-LSTM and hybrid SSA-GRU models combine the SSA decomposition concept with LSTM and GRU forecasting. The SSA decomposition method breaks a data series down into several components such as trend, seasonality, and noise. The aim of the decomposition stage is to let the forecasting model learn patterns better, because the data have been separated according to their characteristics. SSA is used as a preprocessing stage before the forecasting stage with LSTM and GRU. Each SSA component is then forecast using LSTM and GRU. The forecasts of the components are recombined by adding them, in accordance with the decomposition principle. Mathematically, the final forecast can be written as follows.

$$\hat{Y}_t = \hat{T}_t + \hat{S}_t + \hat{N}_t + \hat{R}_t$$

Where:
$\hat{Y}_t$ : hybrid forecast in period t
$\hat{T}_t$ : forecast of the SSA trend component in period t
$\hat{S}_t$ : forecast of the SSA seasonal component in period t
$\hat{N}_t$ : forecast of the SSA noise component in period t
$\hat{R}_t$ : forecast of the SSA residual component in period t

III. RESEARCH METHOD
The methods used are the hybrid SSA-LSTM and hybrid SSA-GRU forecasting models. Fig. 3 shows the research flowchart, which consists of hybrid SSA-LSTM, hybrid SSA-GRU, and the single (non-hybrid) LSTM and GRU methods. The study is carried out in three stages: descriptive analysis of the data, decomposition, and forecasting.
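The LSTM and GRU gate equations given in the literature review can be traced numerically with a small sketch. The version below is only an illustration under simplifying assumptions: every quantity is a scalar (so the element-wise product becomes ordinary multiplication), the weights are arbitrary made-up numbers, and the helper names `lstm_step` and `gru_step` are hypothetical rather than taken from the study.

```python
import math


def sigmoid(a):
    """Logistic sigmoid used by all gates."""
    return 1.0 / (1.0 + math.exp(-a))


def lstm_step(x_t, h_prev, c_prev, params):
    """One scalar LSTM step; params maps 'f', 'i', 'c', 'o' to (W, U, b)."""
    def pre(name):
        w, u, b = params[name]
        return w * x_t + u * h_prev + b

    f_t = sigmoid(pre("f"))            # forget gate
    i_t = sigmoid(pre("i"))            # input gate
    c_tilde = math.tanh(pre("c"))      # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde # new cell state
    o_t = sigmoid(pre("o"))            # output gate
    h_t = o_t * math.tanh(c_t)         # new hidden state
    return h_t, c_t


def gru_step(x_t, h_prev, params):
    """One scalar GRU step; params maps 'z', 'r', 'h' to (W, U, b)."""
    wz, uz, bz = params["z"]
    wr, ur, br = params["r"]
    wh, uh, bh = params["h"]
    z_t = sigmoid(wz * x_t + uz * h_prev + bz)                 # update gate
    r_t = sigmoid(wr * x_t + ur * h_prev + br)                 # reset gate
    h_tilde = math.tanh(wh * x_t + uh * (r_t * h_prev) + bh)   # candidate state
    return (1.0 - z_t) * h_tilde + h_prev * z_t                # new hidden state


# Toy run with arbitrary (assumed) weights on a short normalized sequence.
lstm_params = {"f": (0.5, 0.1, 0.0), "i": (0.4, 0.2, 0.0),
               "c": (0.9, 0.3, 0.0), "o": (0.6, 0.1, 0.0)}
gru_params = {"z": (0.5, 0.1, 0.0), "r": (0.4, 0.2, 0.0), "h": (0.9, 0.3, 0.0)}

h_lstm, c_lstm, h_gru = 0.0, 0.0, 0.0
for x in [0.2, 0.4, 0.6, 0.8]:
    h_lstm, c_lstm = lstm_step(x, h_lstm, c_lstm, lstm_params)
    h_gru = gru_step(x, h_gru, gru_params)
print(h_lstm, c_lstm, h_gru)
```

In a trained network the weights are matrices learned from data; the sketch only shows how the gates combine the previous state with the new input, and that the GRU needs one state variable where the LSTM needs two.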
Fig. 3. Research Flowchart

Descriptive Analysis
This stage examines in detail the data to be processed. The descriptive analysis flowchart can be seen in Fig. 4. The steps carried out at this stage are as follows:

1) Data Description. The data are described in the form of graphs, and simple measures such as the mean, standard deviation, maximum, and minimum are calculated.

2) Stationarity Test. This step checks whether the data are stationary or non-stationary using a graph test and the ADF test. The graph test is performed visually by plotting the observed values against time. The data are stationary when the graph tends to be constant and shows no increasing or decreasing trend. The unit root test is a formal approach to determining stationarity. The test was developed by D. Dickey and W. Fuller and became known as the Augmented Dickey-Fuller (ADF) test. The ADF test is the most widely used test of this type today because it accounts for the autocorrelation that often occurs in time series data. The ADF test is decided by comparing the ADF statistic with its critical value. Equivalently, the p-value can be compared with a significance level ($\alpha$) of 1%, 5%, or 10%. If the ADF statistic is lower than the critical value, the null hypothesis is rejected and it can be concluded that the data are stationary.
Hypothesis:
$H_0$ : non-stationary data
$H_1$ : stationary data
If p-value < $\alpha$, then $H_0$ is rejected, and vice versa.

3) Terasvirta Test. This step checks whether the data are linear or nonlinear using the Terasvirta test. Nonlinear refers to a relationship pattern that does not follow a straight-line relationship. Nonlinear data show irregular, complex patterns in which changes in the output are not proportional to changes in the input variables.
Nonlinearity testing aims to ascertain whether the data follow a linear or nonlinear pattern. The Terasvirta test is one of the tests often used for this purpose. It is better than the White test and the Ramsey test at detecting nonlinearity. The Terasvirta test belongs to the Lagrange Multiplier (LM) group of tests and was developed from neural network models. The steps in the Terasvirta test are as follows:
a) Regress $Y_t$ on $1, Y_{t-1}, \ldots, Y_{t-p}$, then compute the residuals $\hat{v}_t = Y_t - \hat{Y}_t$.
b) Regress $\hat{v}_t$ on $1, Y_{t-1}, \ldots, Y_{t-p}$ and $m$ additional predictors, which are the quadratic terms from the Taylor expansion approximation.
c) Determine the coefficient of determination ($R^2$) of the regression in the previous step.
d) Determine the test statistic $\chi^2 = nR^2$, where $n$ is the number of observations.
Hypothesis:
$H_0$ : linear model
$H_1$ : nonlinear model
If p-value < significance level ($\alpha$), then $H_0$ is rejected, and vice versa.

Fig. 4. Flowchart of Descriptive Analysis

Decomposition
This stage decomposes the data into several components based on their nature, using the SSA method. The decomposition flowchart can be seen in Fig. 5. In general, the stages of SSA are as follows:
1) Embedding. Converts the time series data into a trajectory matrix $X$. The window length parameter ($L$) is also specified at this stage.
2) Singular Value Decomposition (SVD). Separates the components of the decomposed time series. The result is the set of eigentriples.
3) Grouping. Groups the matrices $X_i$ in order to separate the eigentriples into several subgroups, namely trend, seasonality, and noise.
4) Diagonal Averaging. Transforms each matrix resulting from the grouping into a new series of length $N$.
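The four SSA stages listed above can be sketched in a few dozen lines. The code below is a minimal illustration under stated assumptions, not the implementation used in the study: it relies only on the Python standard library, it extracts only the leading eigentriple, and it substitutes power iteration on $S = XX^T$ for a full SVD. The function names (`embed`, `diagonal_average`, `leading_component`) and the toy series are hypothetical.

```python
import math


def embed(series, window):
    """Embedding: build the L x K trajectory (Hankel) matrix, K = N - L + 1."""
    k = len(series) - window + 1
    return [[series[i + j] for j in range(k)] for i in range(window)]


def diagonal_average(matrix):
    """Diagonal averaging: average each anti-diagonal to get a series of length L + K - 1."""
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


def leading_component(series, window, iterations=200):
    """Series reconstructed from the largest eigentriple only.

    Power iteration stands in for a full SVD here (an assumption of this
    sketch); the series must be nonzero, as required in the embedding step.
    """
    x = embed(series, window)
    k = len(x[0])
    # S = X X^T, an L x L symmetric matrix.
    s = [[sum(x[a][j] * x[b][j] for j in range(k)) for b in range(window)]
         for a in range(window)]
    u = [1.0 / math.sqrt(window)] * window
    for _ in range(iterations):
        v = [sum(s[a][b] * u[b] for b in range(window)) for a in range(window)]
        norm = math.sqrt(sum(val * val for val in v))
        u = [val / norm for val in v]
    # X_1 = U_1 U_1^T X, which equals sqrt(lambda_1) U_1 V_1^T.
    proj = [sum(u[a] * x[a][j] for a in range(window)) for j in range(k)]
    x1 = [[u[a] * proj[j] for j in range(k)] for a in range(window)]
    return diagonal_average(x1)


# Toy series (made up): linear trend plus an oscillation.
toy = [10.0 + 0.5 * t + math.sin(t) for t in range(30)]
smooth = leading_component(toy, window=10)
remainder = [a - b for a, b in zip(toy, smooth)]
print(len(smooth), len(remainder))
```

Applying `diagonal_average` directly to `embed(series, L)` returns the original series, which is a convenient sanity check. A complete decomposition would compute all $d$ eigentriples, group them (for example with the help of a W-correlation plot), and apply diagonal averaging to each group.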
The final result of the decomposition process is the data broken down into parts according to the properties of its components.

Fig. 5. Flowchart of Decomposition

Forecasting
After the decomposition stage, the data enter the forecasting stage, which uses the LSTM and GRU models. Forecasting goes through the following stages:

1) Normalization. All data to be predicted are rescaled to the range zero to one.

2) Segmentation. The data are separated and grouped into sequences. Suppose a sequence length of 50 is selected; then a sequence of 50 past data points is used to predict the 51st.

3) Data Division. The data are divided into two parts: training data, used to build the model, and test data, used to evaluate the model. In general, the data are divided in an 80:20 ratio, with 80% for training and 20% for testing.

4) Model Construction and Parameter Selection. Two models are used in this forecasting, namely LSTM and GRU. The required parameters are as follows:
a) Units: the number of memory cells in the model layer. The number of units affects the complexity of the model.
b) Batch size: the number of data samples processed simultaneously in one iteration when training or testing a model. It determines how many samples are processed before the weights are updated.
c) Dropout: reduces overfitting by randomly disabling a number of units in the network during training. The dropout rate is the percentage of units disabled.
d) Sequence length: the number of time steps in each sample that is fed into the model as input at each iteration.
e) Epoch: one full pass over the entire training data. In each epoch, the model receives all batches in sequence and performs the calculations that update the weights.
f) Learning rate: controls how large, and therefore how fast or slow, the change in the weights and biases is at each update step during training.
g) Feature: the variables that are used as inputs to train the model.

5) Calculation. The model is trained on the training data using the chosen forecasting model and the predefined parameters. After training is completed, the model is stored and applied to the test data to obtain the forecasting results.

6) Denormalisation. This process is the inverse of normalisation. After forecasting is completed, the values are still in the range zero to one, so this process returns the data to their original scale.

7) Evaluation. The accuracy of the forecasting results on the test data is calculated with RMSE and MAPE.

After all stages have been completed, the process returns to stages 4 to 6 with updated parameters. The final result is the one with the smallest RMSE and MAPE values. The LSTM and GRU forecasting flowchart can be seen in Fig. 6.

Fig. 6. LSTM and GRU Forecasting Flowchart.

The data used are world crude oil prices, taken from the West Texas Intermediate (WTI) price, which is the standard. The data consist of 1113 weekly opening prices from January 6, 2003 to April 30, 2023. The data are non-stationary and nonlinear, meaning they have a trend component and an unstable movement pattern. In the SSA decomposition, the window length parameter $L = 10$ is used and the grouping process divides the components into 3 groups, so that the final result is the data broken down into trend, seasonality, and noise groups, plus a residue formed from what remains of the decomposition.
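The preprocessing stages of the forecasting workflow (normalization, segmentation, data division, and denormalisation) can be sketched as follows. This is an illustrative outline only: the helper names are hypothetical, the price values are invented, and the chronological (unshuffled) 80:20 split is an assumption, since the text states only the ratio.

```python
def min_max_normalize(values):
    """Rescale values to [0, 1]; assumes the series is not constant."""
    lo, hi = min(values), max(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, lo, hi


def denormalize(scaled, lo, hi):
    """Inverse of min_max_normalize: map [0, 1] back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]


def make_windows(values, sequence_length):
    """Segmentation: each input holds sequence_length past values, the target is the next one."""
    inputs, targets = [], []
    for start in range(len(values) - sequence_length):
        inputs.append(values[start:start + sequence_length])
        targets.append(values[start + sequence_length])
    return inputs, targets


def chronological_split(inputs, targets, train_ratio=0.8):
    """Data division: the first train_ratio share is training data, the rest is test data."""
    cut = int(len(inputs) * train_ratio)
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])


# Invented toy prices; the study uses 1113 weekly WTI opening prices.
prices = [50.0 + 2.0 * t for t in range(20)]
scaled, lo, hi = min_max_normalize(prices)
inputs, targets = make_windows(scaled, sequence_length=5)
(train_x, train_y), (test_x, test_y) = chronological_split(inputs, targets)
print(len(train_x), len(test_x))
print(denormalize(test_y, lo, hi))
```

In the hybrid models, the same steps would be applied separately to each SSA component before it is passed to the LSTM or GRU model, and the denormalised component forecasts would then be summed.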
At the forecasting stage, four scenarios are used, namely the LSTM, GRU, hybrid SSA-LSTM, and hybrid SSA-GRU models. The accuracy of the final forecasts is measured using the Root Mean Square Error (RMSE) and the Mean Absolute Percentage Error (MAPE). The forecasting process uses grid-search hyperparameter tuning to determine the optimal initialization parameters. The parameters used are: units (..., 25, ...), batch sizes (..., 4, 16, ...), dropout (...), sequence length (...), epochs (..., 50, 100, ...), learning rate (...), and features (...).

IV. RESULTS AND DISCUSSION
In the SSA decomposition process, 10 components are obtained based on the previously determined window length $L$. In the SVD stage, the eigentriples are obtained. The eigenvectors are plotted to see the proportion of the data contributed by each of the 10 components (Fig. 7). These components then enter the grouping process, where they are combined according to their similar characteristics. To see more clearly which components are similar, a W-correlation graph is shown. W-correlation is useful for seeing the correlation between different components. A higher correlation value indicates a stronger relationship between two or more components. The stronger the relationship, the darker the color, and vice versa (Fig. 8). Based on the W-correlation values, the components are divided into three groups, namely trend, seasonality, and noise. The results are given in Table I.

Fig. 7. Eigenvector Graph of World Crude Oil Data
Fig. 8. W-correlation Graph for World Crude Oil Data

TABLE I
GROUPING BASED ON CHARACTERISTICS
Characteristics | Grouping $X_I$ | Group members | Influence on data
Trend | $X_{I_1} = X_1$ | 1 |
Seasonal | $X_{I_2} = X_2 + X_3 + X_4$ | 2, 3, and 4 |
Noise | $X_{I_3} = X_5 + \cdots + X_{10}$ | 5, 6, 7, 8, 9, and 10 |

The final result of the SSA decomposition process is four series with the characteristics of trend, seasonality, and noise, together with the residual data remaining from the decomposition, as seen in Fig. 9. Red is the actual data, green is the trend data, black is the seasonal data, blue is the noise data, and orange is the residual data.

Fig. 9. Graph of SSA Decomposition Results of World Crude Oil Data

In the forecasting process, the optimal parameters obtained from hyperparameter tuning are shown in Table II. These parameters are used to predict the data for the next five weeks based on the respective models and data characteristics. The prediction results are given in Table III.

TABLE II
OPTIMAL PARAMETERS FOR PREDICTING WORLD CRUDE OIL OPENING PRICES
Parameter Type | LSTM Model | GRU Model | Hybrid SSA-LSTM | Hybrid SSA-GRU
Unit | | | |
Batch size | | | |
Dropout | | | |
Sequence length | | | |
Epoch | | | |
Learning rate | | | |
Feature | | | |
*T: trend. S: seasonal. N: noise. R: residue

TABLE III
PREDICTION OF THE OPENING PRICE OF WORLD CRUDE OIL IN THE NEXT 5 WEEKS
Date | Actual Data | LSTM | GRU | Hybrid SSA-LSTM | Hybrid SSA-GRU
07/05/2023 | | | | |
14/05/2023 | | | | |
21/05/2023 | | | | |
28/05/2023 | | | | |
04/06/2023 | | | | |

Forecasting accuracy measures are calculated to assess the accuracy of the prediction results, as shown in Table IV. Table IV shows that, overall, forecasting with the LSTM model gives better results than forecasting with the GRU model. This can be seen from the respective RMSE and MAPE values in forecasting the opening price of world crude oil.
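The two accuracy measures can be computed as sketched below. The function names are hypothetical, and `relative_improvement` rests on an assumption: the text does not state how the percentage improvement is derived, but taking it as the relative reduction in RMSE gives values consistent with the figures reported in the abstract.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be nonzero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def relative_improvement(single_error, hybrid_error):
    """Percentage reduction of an error measure (assumed definition of 'improvement')."""
    return 100.0 * (single_error - hybrid_error) / single_error


# RMSE values quoted in the text: LSTM 4.36 -> SSA-LSTM 0.97, GRU 4.86 -> SSA-GRU 1.01.
print(round(relative_improvement(4.36, 0.97), 1))
print(round(relative_improvement(4.86, 1.01), 1))
```

Rounded to whole percentages, the two ratios agree with the reported 78% and 79% improvements.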
TABLE IV
PREDICTION OF THE OPENING PRICE OF WORLD CRUDE OIL IN THE NEXT 5 WEEKS
Measure of Accuracy | LSTM | GRU | Hybrid SSA-LSTM | Hybrid SSA-GRU
RMSE | | | |
MAPE (%) | | | |

The hybrid SSA-LSTM and hybrid SSA-GRU models improve forecasting accuracy, as can be seen by comparing them with the single LSTM and GRU models. The hybrid SSA-LSTM model improves forecasting accuracy by as much as 78% compared to the single LSTM forecasting model. This can be seen from the respective values of RMSE 4.36 and MAPE 5.2% changing to 0.97 and 1. Meanwhile, the hybrid SSA-GRU model increases forecasting accuracy by 79% compared to the single GRU forecasting model. This can be seen from the respective values of RMSE 4.86 and MAPE 6.33% changing to 1.01 and 1. In this case study using weekly data on world crude oil opening prices, SSA decomposition increases forecasting accuracy by 78-79% in LSTM and GRU forecasting.

Fig. 10 gives a graphic summary of the four forecasting models. Blue indicates the actual data, green the LSTM forecasting results, black the GRU forecasting results, red the hybrid SSA-LSTM forecasting results, and orange the hybrid SSA-GRU forecasting results. It is estimated that the weekly opening price of world crude oil will decline slightly over the next five weeks and then increase slightly, as shown in Table III. However, these predictions could be slightly off because the opening price of world crude oil fluctuates and is difficult to predict, and because other factors may cause the price to change.

Fig. 10. Graph of Summary of World Crude Oil Price Predictions

V. CONCLUSION
Hybrid SSA-LSTM and hybrid SSA-GRU forecasting works in two stages. First, SSA decomposition breaks the data down into trend, seasonal, noise, and residual components. Second, the decomposed data are predicted using LSTM and GRU models. The hybrid SSA-LSTM and hybrid SSA-GRU models are shown to increase forecasting accuracy, as can be seen by comparing them with the single LSTM and GRU models. In this study, the data used are world crude oil prices, taken from the West Texas Intermediate (WTI) price, which is the standard. The data consist of 1113 weekly opening prices from January 6, 2003 to April 30, 2023. From these data, predicted values of the world crude oil opening price for the next 5 weeks were generated. The hybrid SSA-LSTM model improves forecasting accuracy by as much as 78% compared to the single LSTM forecasting model. This can be seen from the respective values of RMSE 4.36 and MAPE 5.2% changing to 0.97 and 1. Meanwhile, the hybrid SSA-GRU model increases forecasting accuracy by 79% compared to the single GRU forecasting model. This can be seen from the respective values of RMSE 4.86 and MAPE 6.33% changing to 1.01 and 1. Thus, in this case study using weekly data on world crude oil opening prices, the application of SSA decomposition increases forecasting accuracy by 78-79% in LSTM and GRU forecasting.

Several factors can cause the forecasts of some decomposed components to be poor. First, the seasonal, noise, and residue components are stationary. Stationary data are less likely to have trends or long-term patterns, whereas LSTM and GRU models are designed to learn and remember long-term patterns in data. Second, excess complexity. LSTM and GRU models have complex structures with many parameters that need to be tuned, and unsuitable initialization of the parameters can also be a cause. Thus, when these models are used to forecast stationary data, this complexity can lead to overfitting, where the model performs well on training data but poorly on test data or new data.
It is recommended that further research consider which methods are more effective for predicting each component of the decomposition results, especially components with stationary properties. In addition, the hyperparameter tuning stage deserves attention: a tuning method should be sought that finds suitable initialization parameters in a shorter computational time.

REFERENCES