https://journals.ums.ac.id/index.php/fg/
ISSN: 0852-0682 | E-ISSN: 2460-3945

Research article

Assessing the Reliability of Predicted Decadal Surface Temperatures in Southeast Asia

Dara Kasihairani1, Rahmat Hidayat1*, Supari Supari2

1 Applied Climatology Program, Department of Geophysics and Meteorology, Faculty of Mathematics and Natural Sciences, IPB University, Bogor 16680, Indonesia
2 Directorate of Climate Change, Deputy for Climatology, BMKG, Indonesia

Correspondence: rahmath@apps.

Citation: Kasihairani, Hidayat, Supari.

Abstract

Climate predictions spanning 10-year periods, known as Decadal Climate Predictions (DCPs), have become an important aspect of the latest Coupled Model Intercomparison Project (CMIP6). These DCPs are able to capture the El Niño-Southern Oscillation (ENSO) phenomenon, which affects heatwave frequency in Southeast Asia over years to decades. This research assesses the ability of six General Circulation Model (GCM) DCPs to predict surface temperature over the Southeast Asian region, using the dcpp-A hindcast as the main product. The Anomaly Correlation Coefficient (ACC) and Mean Error (ME) are used to assess the model outputs, with 51 hindcast datasets spanning initial years from 1960 to 2010 and ERA5 reanalysis data serving as the reference. The evaluation reveals that DCP model skill varies across lead times and subregions, with no single model consistently outperforming the others. The highest correlation values are observed during the September-October-November (SON) season, and the ENSEMBLE model increases correlation values compared to the individual DCP models. However, the ENSEMBLE approach is unable to reduce ME values effectively because of the contrasting errors among individual models. The Percent Bias (PBIAS) metric aligns with the ME, consistently identifying similar areas of underestimation (mainland) and overestimation (maritime continent) across the models.
Despite these challenges, the evaluation results highlight the potential of DCPs in predicting surface temperature variability for the Southeast Asian region over decadal periods, particularly in capturing ENSO-related signals. Further improvements in model initialization, internal variability representation, and bias reduction are necessary to enhance the utility of CMIP6 decadal predictions for heatwave preparedness and mitigation strategies in this vulnerable region.

Assessing the Reliability of Predicted Decadal Surface Temperatures in Southeast Asia. Forum Geografi.

Article history: Received: 15 June 2024; Revised: 15 October 2024; Accepted: 21 December 2024; Published: 27 December 2024

Keywords: ENSO; dcpp-A hindcast; CMIP6.

Introduction

Southeast Asia has a tropical climate characterized by relatively small variations in surface temperature throughout the year. However, Southeast Asia's mainland is experiencing an increasing trend of extreme temperature events, causing human discomfort and health concerns (Thirumalai et al. , 2. Observational data show that all heatwave-related characteristics, including frequency, intensity, and duration, display rising trends in the majority of Southeast Asia (Li et al. , although the magnitude and statistical significance of these trends vary across different regions (Guo et al. , 2. There is substantial evidence linking high temperatures to significant health burdens (Arsad et al. , 2. , with heatwaves posing a greater mortality risk in moderately cold and moderately hot areas compared to extremely cold or hot regions (Guo et al. , 2. Furthermore, in mainland Southeast Asia, the El Niño-Southern Oscillation climate variability mode has been found to be tightly related to heatwave events (Lin et al. , 2. During El Niño years, heatwave frequency, duration, and amplitude increase, whereas during La Niña years these characteristics decrease.
The ability to predict ENSO events through decadal climate prediction (DCP), which covers multiannual to decades-long time scales, is crucial for supporting mitigation strategies and planning efforts to minimize the mortality and health consequences of heatwave occurrences in the region.

Copyright: © 2024 by the authors. Submitted for possible open access publication under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Kasihairani et al. Page 413 Forum Geografi, 38. , 2024. DOI: 10. 23917/forgeo.

DCP, a new type of forecast, is categorized as a short-term climate prediction. The previous Coupled Model Intercomparison Project (CMIP5) protocol for coordinated climate change experiments strongly emphasized decadal predictability and prediction. This emphasis on decadal forecasting has become a significant aspect of the current Coupled Model Intercomparison Project (CMIP6) (Boer et al. , 2. DCP is divided into three components. The primary component, the dcpp-A hindcast, comprises retrospective forecasts extending ten years ahead, with hindcast initializations from 1960 to the present. It provides enough datasets for an evaluation study. Forecast products are the focus of the dcpp-B component. The objective of this component is to satisfy societal needs by developing a nearly operational process for forecasting, aggregating, and combining data for decadal climate prediction. The component adopts CMIP5, other models, and hindcast data sets to produce its real-time decadal forecasts.
Component C is the final component and focuses on studies of the ability to forecast climate patterns and variability over a given decade, as well as specific cases and coordinated processes related to climate variability.

DCP is an emerging field of forecasting that attempts to fill the gap between short-term seasonal predictions and long-term climate projections. Given the relatively new nature of this approach, it is crucial to assess the performance of DCP models to determine whether they can meet specific forecasting objectives. While previous global verification studies have demonstrated the skill and potential of DCP models (Corti et al. , 2012; Döscher et al. , 2022; Kadow et al. , 2016; Nicolì et al. , 2. , their reliability in predicting surface temperatures specifically for the Southeast Asia region remains largely unexplored. Consequently, this study's primary purpose is to evaluate the reliability of DCP models in forecasting surface temperature variability on decadal periods across Southeast Asia. This assessment is essential for determining the suitability and limitations of these models in supporting climate adaptation strategies and decision-making processes within this climatically diverse and vulnerable region.

Research Methods

Study Area

Southeast Asia is the study region (Figure . The region consists of the maritime continent of Indonesia, Timor-Leste, the Philippines, and Papua New Guinea, and the mainland of Southeast Asia (Myanmar, Cambodia, Lao, and Thailand), including the Malay Peninsula. The area spans 89.26°E-146.96°E and 27.26°N-15.14°S (CORDEX, 2. Further analysis also divides the region into the mainland subregion (brown dashed line) and the maritime continent subregion (blue dashed line). The mainland subregion extends from 93°E to 107°E and from 22°N to 12°N, while the maritime continent subregion extends from 95°E to 140°E and from 8°N to 12°S.

Figure 1. Southeast Asia as the region of the evaluation study.
Data and Method

The 2-meter surface temperature from six CMIP6 DCPP-A hindcasts is evaluated using ERA5 reanalysis data (Hersbach et al. , 2. , as a benchmark reference. The analysis uses each model's first ensemble member to ensure a consistent and reliable comparison. These data were then re-gridded to a common spatial resolution. Each hindcast covers ten years, with 1960 as the first initialization year. Prediction start years run from 1961 to 2011, so that the last hindcast ends in 2020, the final year of the ERA5 reference data. Therefore, each model has 51 decadal climate prediction experiments. The requisite data were accessed and downloaded from the following website: https://esgf-node. gov/search/cmip6/.

The surface temperature predictions are assessed with the Anomaly Correlation Coefficient (ACC), a dimensionless metric ranging from -1 to 1 that measures the phase agreement between the anomalies of the retrospective forecast and those of the observation. Two bias verification metrics, Mean Error (ME) and Percent Bias (PBIAS), complement the ACC. The ACC is given by Equation 1:

$$\mathrm{ACC} = \frac{\sum_{i=1}^{N} \left(X^{f}_{i} - \bar{X}^{f}_{i}\right)\left(X^{o}_{i} - \bar{X}^{o}_{i}\right)}{\sqrt{\sum_{i=1}^{N} \left(X^{f}_{i} - \bar{X}^{f}_{i}\right)^{2} \, \sum_{i=1}^{N} \left(X^{o}_{i} - \bar{X}^{o}_{i}\right)^{2}}} \tag{1}$$

where $X^{f}_{i}$ is the forecast value, $X^{o}_{i}$ is the observation value, $\bar{X}^{f}_{i}$ denotes the forecast average value, and $\bar{X}^{o}_{i}$ denotes the observation average value. The terms $\left(X^{f}_{i} - \bar{X}^{f}_{i}\right)^{2}$ and $\left(X^{o}_{i} - \bar{X}^{o}_{i}\right)^{2}$ are the squared deviations of the forecast anomalies and of the analysis anomalies from the climate, respectively. The statistical significance of the ACC is assessed with a two-tailed t-statistic (Equation 2; $\alpha = 0.05$, $n = $ .), where $n$ is the sample size:

$$t = \frac{\mathrm{ACC}\,\sqrt{n - 2}}{\sqrt{1 - \mathrm{ACC}^{2}}} \tag{2}$$

ME and PBIAS measure the models' tendencies to underestimate or overestimate.
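The ACC and its significance test (Equations 1 and 2) can be illustrated with a minimal Python sketch for a single grid point or regional average. This is not the authors' code: the function names are hypothetical, the anomalies are taken relative to each series' own mean as a stand-in for the climatological averages, and the critical t value is passed in by the caller rather than computed.

```python
import math


def anomaly_correlation(forecast, observed):
    """Equation 1: correlation between forecast and observed anomalies.

    Assumption: anomalies are computed relative to each series' own mean,
    standing in for the climatological averages described in the text.
    """
    n = len(forecast)
    if n != len(observed) or n < 3:
        raise ValueError("need two series of equal length with at least 3 values")
    f_mean = sum(forecast) / n
    o_mean = sum(observed) / n
    f_anom = [f - f_mean for f in forecast]
    o_anom = [o - o_mean for o in observed]
    numerator = sum(a * b for a, b in zip(f_anom, o_anom))
    denominator = math.sqrt(sum(a * a for a in f_anom) * sum(b * b for b in o_anom))
    if denominator == 0.0:
        raise ValueError("ACC is undefined when a series has zero variance")
    return numerator / denominator


def t_statistic(acc, n):
    """Equation 2: t = ACC * sqrt(n - 2) / sqrt(1 - ACC^2)."""
    if abs(acc) >= 1.0:
        # A perfect correlation makes the denominator zero.
        return math.copysign(math.inf, acc)
    return acc * math.sqrt(n - 2) / math.sqrt(1.0 - acc * acc)


def is_significant(acc, n, t_critical):
    """Two-tailed test: significant when |t| exceeds the critical value."""
    return abs(t_statistic(acc, n)) > t_critical


# Toy series (hypothetical values, not taken from the study).
hindcast = [1.0, 2.0, 3.0, 4.0, 5.0]
reference = [2.0, 1.0, 4.0, 3.0, 5.0]
acc = anomaly_correlation(hindcast, reference)
t = t_statistic(acc, len(hindcast))
```

The critical value for the chosen significance level and n - 2 degrees of freedom has to come from a t-table or a statistics library; it is kept as an explicit argument here so that the sketch stays dependency-free.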
ME provides a deterministic quantification of the average magnitude of forecast errors, while PBIAS expresses the deviation of the model's hindcast as a percentage of the observed data (Mengistu et al. , 2021; Viana et al. , 2. These two bias metrics complement each other. Let $X^{f}_{i}$ be the model forecast (or hindcast) and $X^{o}_{i}$ be the observation data. In Equations 3 and 4, negative values indicate model underestimation, while positive values indicate overestimation.

$$\mathrm{ME} = \frac{1}{N} \sum_{i=1}^{N} \left(X^{f}_{i} - X^{o}_{i}\right) \tag{3}$$

$$\mathrm{PBIAS} = 100 \times \frac{\sum_{i=1}^{N} \left(X^{f}_{i} - X^{o}_{i}\right)}{\sum_{i=1}^{N} X^{o}_{i}} \tag{4}$$

The ten-year decadal climate prediction is then divided into three parts: the first half decadal (FHDecad), the average of prediction years 1 to 5; the second half decadal (SHDecad), the average of prediction years 6 to 10; and the decadal (Decad), the average over all ten prediction years (year 1 to 10). The investigation also examines how the models' performance varies by season. The seasons are December to February (DJF), March to May (MAM), June to August (JJA), and September to November (SON). Table 1 lists the models used in this investigation. In addition to the six models, the ENSEMBLE model is the average of all studied models.

Table 1. CMIP6 DCPP-A Hindcast GCM.

| Model | Institution | Spatial resolution (°) |
| CMCC-CM2-SR5 (Nicolì et al. , 2. | CMCC (Italy) | 0.9 x 1.25 |
| MPI-ESM1-2-HR (Müller et al. , 2. | DWD (Germany) | 0.9 x 0.9 |
| IPSL-CM6A-LR (Boucher et al. , 2. | IPSL (France) | 1.25 x 2.5 |
| MIROC6 (Tatebe et al. , 2. | MIROC (Japan) | 1.4 x 1.4 |
| NorCPM1 (Bethke et al. , 2. | NCC (Norway) | 1.9 x 2.5 |
| FGOALS-f3-L (Hu et al. , 2. | CAS (China) | 1.0 x 1.0 |

Results and Discussion

The annual time series of mean surface temperature hindcasts for Southeast Asia from each climate model, along with the corresponding ERA5 reanalysis data, are presented in Figure 2. The figure displays the temperature anomaly values for three distinct periods: the FHDecad, the SHDecad, and the Decad period.
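The bias metrics (Equations 3 and 4), the split into the FHDecad, SHDecad, and Decad periods, and the ENSEMBLE average described in the methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's processing chain: all function names are hypothetical, each hindcast is represented as a plain list of ten yearly values for lead years 1 to 10, and the ENSEMBLE is taken as an unweighted element-wise mean.

```python
from statistics import fmean


def mean_error(forecast, observed):
    """Equation 3: mean of (forecast - observed); negative = underestimation."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("need two non-empty series of equal length")
    return fmean([f - o for f, o in zip(forecast, observed)])


def percent_bias(forecast, observed):
    """Equation 4: 100 * sum(forecast - observed) / sum(observed)."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("need two non-empty series of equal length")
    return 100.0 * sum(f - o for f, o in zip(forecast, observed)) / sum(observed)


def lead_time_means(yearly):
    """Average a ten-year hindcast (lead years 1..10) over the three periods."""
    if len(yearly) != 10:
        raise ValueError("expected exactly ten yearly values")
    return {
        "FHDecad": fmean(yearly[:5]),  # lead years 1 to 5
        "SHDecad": fmean(yearly[5:]),  # lead years 6 to 10
        "Decad": fmean(yearly),        # lead years 1 to 10
    }


def ensemble_mean(model_series):
    """Unweighted element-wise mean across models (assumed ENSEMBLE definition)."""
    return [fmean(values) for values in zip(*model_series)]


# Toy values in kelvin (hypothetical, not taken from the study).
hindcast = [296.0, 297.0, 295.0]
reference = [300.0, 300.0, 300.0]
me = mean_error(hindcast, reference)
pbias = percent_bias(hindcast, reference)
periods = lead_time_means([float(year) for year in range(1, 11)])
ensemble = ensemble_mean([[1.0, 2.0], [3.0, 4.0]])
```

Keeping the two bias metrics as separate functions mirrors the text: ME reports the error in the units of the data, whereas PBIAS scales the same summed error by the summed observations.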
Consistent with previous studies examining global mean surface temperature trends (Corti et al. , 2012; Hu et al. , 2023; Nicolì et al. , 2. , the model hindcasts indicate a similar increasing pattern in surface temperatures for the Southeast Asia region over the analysed period. This similarity suggests that the region is experiencing temperature changes in line with the broader global warming trend. The average annual anomaly hindcast in all models ranges from -0.61 K to 1.25 K for FHDecad 55 K to 1.25 K for SHDecad. As for the Decad period, the range is from -0.58 K to 17 K. In addition to following the trend of global climate change, the retrospective predictions capture the effect of volcanic forcing. The temperature anomaly decreased between 1982 and 1991 owing to the cooling effect of the El Chichón and Pinatubo volcanic eruptions, as indicated by the NorCPM1 (Bethke et al. , 2. , IPSL-CM6A-LR (Boucher et al. , 2. , MIROC6 (Kataoka et al. , 2. , and CMCC-CM2-SR5 (Nicolì et al. , 2. models. The graph also reveals that the CMCC-CM2-SR5, MIROC6, and MPI-ESM1-2-HR models tend to overpredict surface temperatures, resulting in positive anomalies, while the FGOALS-f3-L, IPSL-CM6A-LR, and NorCPM1 models typically underpredict, leading to negative anomalies.

Figure 2. Time series of the regional average surface temperature anomaly of Southeast Asia. Separate panels show the FHDecad average, the SHDecad average, and the Decad average. The vertical dashed lines indicate 1982 and 1991.

Metrics of Verification

The direct evaluation of the models' proficiency used the ERA5 reanalysis data as a baseline. The relevant climatological period was considered when calculating anomalies for the hindcast and reference data.
The ACC metric was used to evaluate the ability of the decadal climate prediction models to capture the climatological pattern of the region. The ME metric indicates which parts of the region are inclined towards a warm or cold bias, while PBIAS measures the relative degree to which the models' predictions diverge from the reference climate condition.

The models generally show positive ACC values over Southeast Asia (Figure . However, negative values are observed in mainland areas, particularly during the winter (DJF) and spring (MAM) seasons, and in some oceanic regions. Among the models, IPSL-CM6A-LR demonstrates superior performance, with higher significant ACC values. In contrast, MIROC6 and NorCPM1 exhibit the lowest ACC values. Significant ACC values are rarely found in mainland areas across all models. Additionally, the models agree on positive ACC during the fall (SON) season, which also has the highest positive ACC values.

During SHDecad, the ACC exhibits a pattern similar to that of FHDecad. However, lower ACC values are observed in FGOALS-f3-L, CMCC-CM2-SR5, and MIROC6 (especially for the MAM and JJA seasons), as well as in IPSL-CM6A-LR (Figure . While NorCPM1 displays slightly higher values for this period, MPI-ESM1-2-HR shows a notable increase in ACC. In the Decad period category (figure not shown), the ACC values of the models lie between the FHDecad and SHDecad values. The spatial pattern of ACC values in this period is similar to those of FHDecad and SHDecad. Notably, the ENSEMBLE model, except for a few areas of the maritime continent, increases the correlation value for each season in the region across the three time period categories. Additionally, it is important to note that high ACC values are more prevalent over the oceanic part than over the region's terrestrial area.

Figure 3.
Anomaly Correlation Coefficient (ACC) for the FHDecad period from the DCP models and the ENSEMBLE of DCP models. The dots in the figure signify that the experimental results across all datasets satisfy the statistical significance criteria with 95% confidence. The ACC maximum and average values are shown below the subplots, along with the fraction of the significant grid area.

Figure 4. Anomaly Correlation Coefficient (ACC) for the SHDecad period from the DCP models and the ENSEMBLE of DCP models. The dots in the figure signify that the experimental results across all datasets satisfy the statistical significance criteria with 95% confidence. The ACC maximum and average values are shown below the subplots, along with the fraction of the significant grid area.

Figure 5. Mean Error value for the Decad period from the DCP models and the ENSEMBLE of DCP models. FHDecad and SHDecad are not shown, as their patterns resemble each other.

The pattern of the ME metric is similar across all time categories (FHDecad, SHDecad, and Decad), underscoring the importance of understanding how each model's hindcast responds to the different land and ocean surface features in the region. In Figure 5, the FGOALS-f3-L model's ME value varies over terrestrial areas but remains nearly uniform above the ocean, displaying a cold bias. Similarly, the NorCPM1 model exhibits a clear hot bias over the maritime continent area. Conversely, the MPI-ESM1-2-HR model's pattern is almost the opposite of those of FGOALS-f3-L and NorCPM1, showing a cold bias for the terrestrial area of the region (except for the central mainland area in the MAM season). At the same time, the temperature hindcast above the ocean uniformly exhibits a hot bias.

The CMCC-CM2-SR5,
MIROC6, and IPSL-CM6A-LR models share a similar pattern over the oceanic part of the region, where they exhibit a hot bias in the equatorial region and a cold bias in the upper equatorial region. Notably, the CMCC-CM2-SR5 model tends to display a cold bias for the eastern part of the mainland, while the western part and the maritime continent predominantly have a hot bias. The MIROC6 ME value varies across the region, and the IPSL-CM6A-LR model tends to have a cold bias over the terrestrial area of the region. Because the individual models differ widely in the size of their errors, it cannot be concluded that the ENSEMBLE model effectively reduces the ME value. On the other hand, the ENSEMBLE can detect the distinct pattern of mean error values between the region's upper and equatorial sections.

Figure 6. Anomaly Correlation Coefficient (ACC) heatmap for the two subregions: mainland and maritime continent.

The PBIAS metric values align closely with the patterns observed in the ME metric. Both metrics consistently identify similar areas where each hindcast model either underestimates or overestimates. Notably, the PBIAS values remain within a narrow range, not exceeding ±2% for any of the models.

The spatial analysis shows that the models differ in their ability to predict the two distinct areas, the mainland of Southeast Asia and the maritime continent. Therefore, the evaluation metrics are further examined for two subregions: the mainland area and the maritime continent area. The mainland subregion extends from 93°E to 107°E and from 22°N to 12°N, while the maritime continent subregion extends from 95°E to 140°E and from 8°N to 12°S. Each model's prediction performance for the two subregions can be observed from the heatmaps in Figures 6 and 7. ACC values in the mainland subregion are more modest compared to those of the maritime continent.
Autumn (SON) consistently shows the highest ACC values across models and periods for both subregions. The maritime continent exhibits higher and more consistent ACC values over time, with slight improvements in SHDecad. Among individual models, IPSL-CM6A-LR demonstrates the highest performance. Notably, the ENSEMBLE model successfully increases ACC values in both subregions.

Figure 7. ME heatmap for the two subregions: mainland and maritime continent. Unit is in Kelvin.

The ME metric reveals contrasting values for Southeast Asia's mainland and maritime continent subregions (Figure . On the mainland, the hindcast datasets tend to predict colder temperatures than the climatology. In the maritime continent, by contrast, the models persistently tend to predict warmer conditions than the climatology. Winter is the season with the highest ME value, and the ME value for both subregions is marginally higher in the FHDecad. The model with the highest error for the mainland area is NorCPM1, especially during the DJF season, when its error reaches more than 4 K cooler than its climatology reference. In contrast, the MPI-ESM1-2-HR model exhibits a warm bias, particularly during the MAM season. MIROC6 tends to have a warm bias during spring and summer, and CMCC-CM2-SR5 extends it to autumn.

Figure 8. PBIAS heatmap for the two subregions: mainland and maritime continent. Unit in percent (%). Negative values represent underestimation, and positive values represent overestimation.

The CMCC-CM2-SR5 and MPI-ESM1-2-HR models display the most pronounced errors in the maritime continent subregion. However, the average ME in the maritime continent subregion ranges only from 0.31 K to 1.20 K. The FGOALS-f3-L and NorCPM1 models have a relatively minor cold bias from the climatology reference.
In this subregion, the models tend to have larger biases during the JJA and SON seasons.

PBIAS values align with the ME verification metric results. Spatially, PBIAS exhibits the same pattern as ME (maps not shown), indicating areas of underestimation or overestimation in past predictions. Each model consistently underestimates or overestimates across the analysed periods. PBIAS values in the two subregions range from -1.43% to 0.51% (Figure . , falling within the "very good" category. The mainland shows more prominent PBIAS values than the maritime continent, especially during the DJF season. Interestingly, the NorCPM1 model, with an ME of -4. has a PBIAS of only -1. Similar results are observed for IPSL-CM6A-LR (ME=-2. PBIAS=-0. and FGOALS-f3-L (ME=-2. PBIAS=-1. The fact that a 2-4 K error translates to only about 1% bias reflects the large absolute temperature values, expressed in kelvin, that form the denominator of PBIAS. Although the small PBIAS values indicate that the models capture temperature patterns well, a 2-4 K underestimation is significant for surface temperature prediction related to human health, particularly for the region of Southeast Asia (Sun et al. , 2.

Furthermore, the analysis examining the influence of initialization on prediction skill reveals that the ACC values of the FGOALS-f3-L, CMCC-CM2-SR5, and MIROC6 models (for MAM and JJA) are higher during FHDecad than SHDecad. In contrast, IPSL-CM6A-LR, MPI-ESM1-2-HR, and NorCPM1 demonstrate improved performance in SHDecad. The ME metric indicates higher error values during FHDecad in the mainland subregion and a slight decrease in the maritime continent during SHDecad. A consistent cool bias across models in the mainland and a consistent warm bias in the maritime continent throughout the period suggest systematic issues in how each model's physical parameterizations or processes respond to these distinct geographical subregions.
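The relation between the ME and PBIAS values reported above follows directly from Equations 3 and 4. Writing $X^{f}_{i}$ for the hindcast, $X^{o}_{i}$ for the observation, and $\bar{X}^{o}$ for the mean observed temperature over the $N$ samples, the two metrics differ only by that mean. As a worked illustration, assuming a regional mean of about 300 K (an illustrative value, not one taken from the study):

$$\mathrm{PBIAS} = 100 \times \frac{\sum_{i=1}^{N} \left(X^{f}_{i} - X^{o}_{i}\right)}{\sum_{i=1}^{N} X^{o}_{i}} = 100 \times \frac{N \cdot \mathrm{ME}}{N \cdot \bar{X}^{o}} = 100 \times \frac{\mathrm{ME}}{\bar{X}^{o}}$$

$$\mathrm{ME} = -4\ \mathrm{K},\quad \bar{X}^{o} \approx 300\ \mathrm{K} \;\Rightarrow\; \mathrm{PBIAS} \approx 100 \times \frac{-4}{300} \approx -1.3\%$$

Under the same assumption, an ME of -2 K corresponds to a PBIAS of roughly -0.7%, which is why errors of several kelvin still appear as percent biases of about 1%.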
The PBIAS remains within a very good range throughout the entire period for all models despite these regional biases.

Regional Predictability of Decadal Climate Prediction Models

There are two sources of forecast predictability in climate models: internal variability and external forcing. The external forcing components stem from factors such as fluctuations in solar irradiance, aerosols from anthropogenic activity and volcanoes, changes in greenhouse gas concentrations, and other external influences. On the other hand, the estimation of climate predictability depends on how climate models respond to these factors and on the regions where accurate forecasts emerge on relevant timescales (Meehl et al. , 2. Notably, the maritime continent subregion exhibits the highest average ACC value over the entire decadal period.

In Southeast Asia, ENSO stands out as the most significant source of internal climate variability and is crucial in driving the increasing trend of heatwave events (Lin et al. , 2. Accurately capturing the ENSO signal is essential for climate models aiming to provide reliable projections for this area. A model's predictive capability in Southeast Asia hinges heavily on its ability to simulate key oceanic features in the Pacific basin appropriately, including not only sea surface temperatures (SST) but also ocean heat content and the associated regional variability patterns linked to ENSO dynamics. The proficiency of climate models in representing ENSO-driven internal variability is a critical determinant of their forecast skill for the Southeast Asia region. Although the FGOALS-f3-L and CMCC-CM2-SR5 models exhibit significant predictability for ocean heat content in regions closely linked to ENSO dynamics, including the tropical Pacific and the subtropical North Atlantic (Hu et al. , 2023; Nicolì et al. , 2. , they display large errors in the equatorial region.
These errors could impair their ability to capture ENSO variability accurately, which is crucial for reliable projections in Southeast Asia. Thus, despite their strengths in specific ocean basins, the limitations in simulating equatorial ocean conditions may hinder the performance of these models in representing ENSO-driven climate variability and its impacts on the Southeast Asia region.

The ACC values for both subregions indicate that the models have a relatively similar range of values throughout the decadal period (Figure . While ACC values over the maritime continent exhibit greater stability, the FGOALS-f3-L, CMCC-CM2-SR5, and MIROC6 models show a diminished ability to replicate climatology patterns in the mainland subregion during the latter half of the decadal period (SHDecad) compared to the earlier half (FHDecad). The decrease in performance can be attributed to the models' diminishing ability to represent the ENSO signal accurately beyond the FHDecad. For instance, the CMCC-CM2-SR5 model's ability to forecast ENSO is limited beyond one year (Nicolì et al. , 2. , while the MIROC6 SST hindcast shows a delayed response to ENSO evolution until the second forecast year and overestimates the Niño 3.4 index at longer leads (Kataoka et al. , 2.

In contrast to the models showing declining performance, the MPI-ESM1-2-HR and NorCPM1 models demonstrate more consistent predictive skill throughout the decadal period. The MPI-ESM1-2-HR model, however, tends to overestimate ENSO behaviour (Boucher et al. , 2. and NorCPM1 skill degrades for the eastern Pacific area at longer lead times (Bethke et al. , 2. Despite their individual limitations, both the MPI-ESM1-2-HR and NorCPM1 models maintain their predictive skill across the entire decadal period better than the other models discussed.
Although ENSO is a crucial internal variability factor for Southeast Asia, climate models also capture longer-timescale climate patterns, such as Pacific Decadal Variability (PDV) and Atlantic Multidecadal Variability (AMV), which exhibit distinct teleconnections to the region (Boucher et al. , 2. However, it is worth noting that extreme temperatures in Southeast Asia are not significantly related to these multidecadal variabilities (Fan et al. Nonetheless, the ability of models to replicate variability patterns in the Pacific region, including ENSO and other decadal and multidecadal climate variability, is crucial for their overall skill in simulating the Southeast Asian climate accurately.

The maritime continent verification metrics display better values than those of the mainland of Southeast Asia (Figures 6 and . Prior research in the region also suggests that models simulate the surface temperature in the maritime continent subregion more accurately than in the mainland subregion (Kamworapan & Surussavadee, 2. This improved performance for the maritime continent can be attributed to its proximity to the Indian Ocean, where models exhibit relatively higher decadal predictive skill than in other ocean basins (Guemas et al. , 2. This conclusion is further supported by the narrow seasonal variability of surface temperature in the maritime continent region.

Conclusion

The assessment of decadal surface temperature predictions provides critical insights for mitigating heatwave events in Southeast Asia. This study evaluated six hindcast models initialized with observational data to examine initialization effects across different prediction timeframes: FHDecad, SHDecad, and Decad. Through analysis using the ACC, ME, and PBIAS metrics, our findings revealed distinct subregional patterns between the mainland and the maritime continent.
The results demonstrated superior predictive skill during the first half of the decade (FHDecad), particularly during the SON season, with stronger surface temperature correlations over oceanic areas. This enhanced performance in FHDecad underscores the positive impact of initialization on predictive capabilities. However, we observed a general deterioration in predictive skill over longer lead times, likely due to challenges in representing ocean features and internal variability. Notably, some models maintained or showed improved performance in the latter half of the decade, although this phenomenon warrants careful interpretation regarding its physical basis.

CMIP6 dcpp-A hindcast models demonstrated enhanced predictability of sea surface temperature and ocean heat content in key regions. While the models successfully captured decadal Pacific variabilities and showed promise in replicating ENSO variability patterns, their reliability remains limited in the FHDecad period. Despite achieving favorable ACC values for Southeast Asia, several challenges persist. Future work should focus on refining initialization techniques, improving the representation of internal variability, and reducing systematic biases to enhance the reliability of heatwave projections in this climatically vulnerable region.

Acknowledgements

The authors would like to thank the Indonesia Endowment Fund for Education Agency (LPDP) for funding this research and the anonymous reviewer for valuable comments.

Author Contributions

Conceptualization: Kasihairani, Hidayat, Supari; methodology: Kasihairani, Hidayat, Supari; Kasihairani, Hidayat, Supari; writing (original draft preparation): Kasihairani; writing (review and editing): Kasihairani, Hidayat, Supari; visualization: Kasihairani, Hidayat, Supari S. All authors have read and agreed to the published version of the manuscript.

Conflict of interest

All authors declare that they have no conflicts of interest.

Data availability

Dataset is provided in this URL: https://esgfnode. gov/search/cmip6/.

Funding

This research was funded by the Indonesia Endowment Fund for Education Agency (LPDP) as part of the given Master's Degree Scholarship Program.

References