Journal of Sustainable Tourism and Entrepreneurship (JoSTE)
ISSN: 2714-6480, Vol 7, No 1, 2025, 107-123
https://doi.org/10.35912/joste.

The role of seasonal trends in shaping tourist preferences for luxury resort: Big data approach
Luh Made Gunapria Hindu Rajeswari Pamungkas1*, Putu Diah Sastri Pitanatri2, Clearesta Adinda3
Bali Tourism Polytechnic, Bali, Indonesia1,2,3
gunapriaa25@gmail.com1, diahsastri@ppb. id2, clearestaadinda@ppb.

Article History
Received on 20 May 2025
1st Revision on 9 June 2025
2nd Revision on 1 July 2025
Accepted on 24 July 2025

Abstract
Purpose: This study aims to examine seasonal patterns in tourist preferences for luxury resort stays in Bali, with a focus on how cultural backgrounds influence accommodation choices. The goal is to help resorts better understand guest behavior and optimize occupancy strategies.
Methodology/approach: The research analyzes monthly online review data from Tripadvisor for Bvlgari Resort Bali, a prominent luxury hotel. A time-series analysis using the ARIMA (Autoregressive Integrated Moving Average) model is applied to forecast occupancy trends. Prior to modeling, the data is tested for stationarity. In addition to forecasting, the study explores guest preferences by analyzing cultural characteristics inferred from reviews, categorizing them into collectivist and individualist orientations.
Results/findings: Findings reveal that occupancy trends do not strictly align with the hotel's predefined seasonal categories. Instead, they are shaped by global travel trends and cultural factors. Guests from collectivist cultures tend to prefer facilities that support group interaction and shared experiences, while those from individualist cultures prioritize privacy, exclusivity, and personalized services. The ARIMA model delivers accurate forecasting results, helping to predict future occupancy rates effectively.
Conclusion:
Tourist behavior is not solely dictated by conventional seasons but also by cultural expectations and travel motivations. Leveraging these insights allows hotels to better align operations, marketing, and pricing strategies with actual guest preferences.
Limitations: The study is limited to a single resort and uses data from one online review platform, which may not fully capture the diversity of all guests.
Contribution: This study contributes to tourism analytics, cross-cultural marketing, and hotel management by offering data-driven strategies to enhance occupancy performance.
Keywords: ARIMA, Guest Preference, Occupancy, Seasonality, Time-Series
How to Cite: Pamungkas, Pitanatri, Adinda, . The role of seasonal trends in shaping tourist preferences for luxury resort: Big data approach. Journal of Sustainable Tourism and Entrepreneurship, 7. , 107-123.

Introduction
Seasonality is a primary challenge in the tourism sector, significantly affecting stability and This phenomenon leads to a concentration of tourist arrivals during specific periods, thereby impacting the spatial and temporal distribution of tourists throughout the year (Doran and Schofield, 2. In the tourism industry, there are three types of seasonal patterns: off-peak, single-peak, and two-peak (Butler and Mao in Zvaigzne et al. , and Dembovska . Consequently, the industry must adapt its facility and infrastructure management strategies to make them more efficient (Corluka, 2.

Accommodation, particularly hotels, plays a crucial role in supporting tourism infrastructure and services, especially in Bali, by catering to tourists' needs (Pujiningrum & Sulistijanti, 2. However, the hospitality sector also faces seasonal challenges, as room demand fluctuates due to trends, cycles, and random business movements. The primary patterns observed are peak seasons (high demand) and low seasons (off-season) (Maulana & Koesfardani, 2.
The Bvlgari Resort Bali, a luxury hotel in Uluwatu, implements a strategy based on single-peak seasonality, dividing the year into low, shoulder, high, and peak seasons, as shown in Table 1. Variations in seasonal patterns significantly affect guests' length of stay, with occupancy rates often reflecting fluctuations in demand that influence revenue, transportation requirements, and employment in the tourism industry (Mitra, 2.

Table 1. Categorization of seasonal patterns in Bvlgari Resort Bali
Seasonal Pattern    Months
Low Season          January, February, May, June
Shoulder Season     April, October, September, November
High Season         July, August
Peak Season         December
Source: Author's research .

Nevertheless, Figure 1 reveals a significant discrepancy between the established seasonal categorization and actual average monthly occupancy trends. For instance, June is classified as a low season, yet occupancy levels show an increasing trend. Occupancy in July and August peaks during the high season, aligning with Central Statistics Agency data . , which reported a sharp increase in tourist visits in July to 625,665, representing a 20.11% rise. Conversely, October and November, categorized as the shoulder season, experience a steep decline in occupancy, approaching levels typical of the low season.

Figure 1. Monthly average occupancy at Bvlgari Resort Bali period 2022-2024 according to seasonal category
Source: Processed data .

Therefore, understanding and managing seasonal patterns is essential for enhancing hotel efficiency and promoting tourism sustainability, particularly when serving a diverse range of guests. Accordingly, scientific data analysis methods and mathematical models can be employed to identify trend patterns, seasonality, and factors influencing accommodation demand (Gudiato et al. (Pujiningrum & Sulistijanti, 2. These fluctuations stem from a complex interplay of seasonal patterns, cultural dynamics, and the preferences of tourists from various countries (Singgalen, 2.
2025 | Journal of Sustainable Tourism and Entrepreneurship / Vol 7 No 1, 107-123

There is a significant correlation between cultural background and variations in tourist destinations, types of activities, and places of origin (Velea et al. , 2. , which extends to accommodation expectations, as evidenced by an in-depth analysis of guest feedback from diverse geographical regions (Apaza-Panca, Ramos, Ramos, Saico, & Apaza-Apaza, 2. Cultural preferences manifest in distinct rating patterns: guests from collectivist societies tend to emphasize the importance of community spaces and group-oriented amenities, whereas guests from individualist cultures prioritize privacy and personalized experiences (Shen in Singgalen . Therefore, a hotel's market-sensing capabilities, specifically its ability to collect and utilize guest information, play a critical role in anticipating shifts in guest preferences.

TripAdvisor, the leading online accommodation review platform, offers benefits to tourists and presents opportunities for hotels (Ladhari & Michaud in Yilmaz, 2. Baleiro . confirmed the importance of online reviews in evaluating guest satisfaction; thus, text mining of these comments provides valuable insights into guest experiences. Although most previous studies have relied on static data analyses, such as topic modeling or sentiment analysis (Jain, Pamula, & Srivastava, 2. , longitudinal data can reveal the dynamics of accommodation feature importance across different seasons, long-term trends, or disruptive events (Teichert, González-Martel, Hernández, & Schweiggart. Time-series analysis is a valid tool for measuring secular trends, seasonal variations, and irregular changes (Burger in (Teichert et al. , 2. The ARIMA (Autoregressive Integrated Moving Average) model has been used to project occupancy in various hotel classes in Bali (Indradewi & Parwita, 2022; Pramudita, 2.
Several studies employing time-series analytics and the ARIMA method indicate that such data can effectively anticipate and predict occupancy trends Putri, Prabudhi, Putranto, Astasia, and Tjahjo . Furthermore, the ARIMA model has been successfully applied to forecast occupancy using big data sources, such as Google Trends (Ayuningtyas & Wirawati, 2020; Rizalde, Mulyani, & Bachtiar. However, few studies have focused on analyzing TripAdvisor guest review data to identify seasonal patterns in guest preferences. This research gap primarily concerns the application of time-series analysis with the ARIMA method, combined with web scraping and Application Programming Interface (API) engineering, which offers a powerful opportunity to deepen the understanding of guest preferences through a detailed analysis of spatially and temporally authentic guest review data, specifically at the Bvlgari Resort Bali.

Literature Review
Seasonal
Yabanc . , defines seasonality as fluctuations in tourism demand occurring within specific periods each year. Similarly, Ekananda in Kristiyanti and Sumarno . explained that seasonal patterns may be quarterly, semi-annual, or annual. The primary causes of seasonality are generally classified into two categories: natural seasonality, which relates to environmental and climatic variations, and institutionalized seasonality, which stems from social and policy-driven events (Butler in Yabanc . Furthermore, Butler and Mao, as cited in Zvaigzne et al. , identified three main seasonal patterns in tourism: off-peak, single-peak, and two-peak.

Seasonal patterns profoundly influence tourism businesses by affecting staffing levels, inventory management, and promotional activities. Therefore, understanding these cycles is essential for optimizing revenue during peak periods and minimizing losses during the low season. Moreover, it facilitates better resource allocation and workforce planning and ensures consistent service quality throughout the year.
Guest Preferences
Venkatraman, cited in Kuncoro and Kusumawati . , posited that guest preferences for products or brands arise from a combination of product attributes (such as price and durability) and consumer characteristics (including goals, attitudes, and disposable income). Singgalen . emphasized that analyzing the relationship between culture and guest preferences through global review data can reveal distinctive patterns in accommodation choices, service expectations, and satisfaction levels among guests from diverse cultural backgrounds.

Guest preferences are dynamic and continuously evolve in response to market trends, lifestyle changes, and global influences. Contemporary travelers increasingly prioritize personalization, sustainability, and convenience. Consequently, hotels that understand and adapt to these preferences can tailor their services to enhance guest satisfaction, foster loyalty, and achieve competitive differentiation through

TripAdvisor
Launched in 2000 by Stephen Kaufer, Tripadvisor serves as a digital guidebook that simplifies destination exploration for tourists (Bassett, 2. Additionally, the analysis of scores and reviews enables hotels to better understand guest preferences, improve operations, and maximize occupancy. This research utilizes TripAdvisor's "Time of Year" feature to help hotels adjust pricing and service strategies according to visiting seasons, while the "Traveler Type" filter groups reviews by traveller categories, allowing hotels to customize services based on market segmentation.

As a trusted travel review platform, TripAdvisor functions both as a marketing channel and a feedback mechanism for hospitality providers. High ratings can increase bookings by building guest trust, while reviews promote transparency, enabling prospective guests to assess service quality and encouraging providers to continuously improve.
Time-Series
Hanke, as cited in Sorlury, Mongi, and Nainggolan . , defined time-series data as observations recorded sequentially over time. According to Athiyarath, Paul, and Krishnaswamy . , time-series data comprise systematic components (level, trend, and seasonal) and non-systematic components, or noise. Time-series patterns are generally categorized into four types: horizontal (random fluctuations), trend (long-term movement), seasonal (recurring variations within a year), and cyclical (long-term fluctuations influenced by economic and external factors).

Time-series analysis supports forecasting and strategic planning in hospitality by identifying historical trends that predict future demand. This approach aids decision-making related to pricing, staffing, and inventory management. Furthermore, visualizing time-series data helps stakeholders to monitor performance and prepare for potential disruptions, thereby enhancing operational readiness and

ARIMA (Autoregressive Integrated Moving Average)
Murbarani . explains that ARIMA is a forecasting method that combines autoregressive (AR) and moving average (MA) components. It uses the past and present values of the dependent variable to generate short-term predictions without incorporating independent variables. ARIMA models include several variants, such as AR, MA, ARMA, and seasonal ARIMA (SARIMA).

ARIMA is particularly effective for analyzing time-series data that exhibit trends and seasonality. Its integration component addresses non-stationary data, making it well-suited for real-world applications, such as tourism In the hospitality sector, ARIMA facilitates predictions of occupancy rates, revenue, and guest flow, thereby supporting strategic adjustments and efficient resource utilization in the hospitality industry.

Occupancy
Ayuningtyas and Wirawati . defined occupancy as the ratio of occupied to available rooms, expressed as a percentage over a given period.
Moreover, the occupancy rate serves as a key performance indicator for hotels (Sugiarto in Jatmiko and Sandy, 2. , reflecting operational efficiency, marketing effectiveness, and pricing strategies.

Analyzing occupancy rates enables hotels to benchmark their performance against competitors and market standards. High occupancy indicates strong demand and effective operations, whereas low occupancy may indicate service issues or misalignment with market needs. Continuous monitoring of occupancy also supports dynamic pricing strategies aimed at maximizing profits while maintaining guest satisfaction.

Research Methodology
Design & Strategy
The research design was systematically developed to identify the key elements necessary for analyzing seasonal patterns in tourist preferences (Saunders, Lewis, & Thornhill, 2. This study adopts a quantitative, data-driven approach that emphasizes objectivity and leverages numerical data to identify and predict behavioral trends. A quantitative design is particularly suitable when the aim is to measure variables and uncover patterns using statistical or computational methods (Creswell and Creswell. Unlike qualitative research, quantitative research prioritizes outcomes and the generalizability of results by utilizing standardized instruments to ensure replicability.

Importantly, this study addresses a research gap in the application of big data to analyze seasonal tourist preferences by utilizing the ARIMA time-series model on guest reviews from TripAdvisor. While previous studies have employed ARIMA and tools such as Google Trends to predict hotel occupancy (Ayuningtyas & Wirawati, 2020; Putri et al. , 2024; Rizalde et al. , 2. , few have focused on user-generated reviews to uncover these temporal patterns.
To fill this gap, this study uses web scraping via the TripAdvisor API to collect monthly review data for the Bvlgari Resort Bali and applies ARIMA to identify seasonal trends. Ultimately, the goal is to provide data-driven insights that support more effective pricing, marketing, and resource planning in the luxury hospitality industry.

Data Collection & Preparation
The primary dataset comprises online guest reviews sourced from TripAdvisor covering the period from January 2022 to December 2024. Data were collected using a web scraping process that leveraged TripAdvisor's publicly accessible Application Programming Interface (API). This method enables the automated extraction of large-scale textual content. Following data extraction, a custom scripting process was employed to transform and structure the raw data, which were then stored in a database and exported into Excel format for analysis (Sirisuriya, 2. This approach ensured a consistent and analyzable dataset that was suitable for time-series modeling.

To ensure the reliability of the analysis, the raw dataset underwent rigorous data cleaning. This included addressing missing values, standardizing date formats, and filtering out irrelevant content from the data. Python, a high-level, open-source programming language, was used for these tasks because of its robust data manipulation libraries and flexibility in handling unstructured data. The versatility of Python makes it an ideal tool for preparing large datasets for time-series analysis (Nongthombam & Sharma, 2.

Data Analysis & Theoretical Framework
The cleaned dataset was subjected to time-series analysis to identify patterns, such as trends, seasonal fluctuations, and irregular events. The ARIMA model was applied because of its ability to manage non-stationary time-series data by removing trend and seasonal components through differencing. The model then isolates the signal from the noise, producing accurate forecasts of future occupancy rates.
These insights provide practical value in guiding pricing strategies, marketing efforts, and resource allocation in luxury hospitality operations (Kontopoulou, Panagopoulos, Kakkos, & Matsopoulos.

Building upon the time-series forecasting process, the conceptual framework in this study integrates empirical data and theoretical insights to explain the underlying factors influencing occupancy patterns at the Bvlgari Resort Bali. By grounding the research in a data-driven model while incorporating established theories of consumer behavior and tourism demand, the framework supports a systematic analysis of observed patterns. In this quantitative approach, the framework not only guides hypothesis testing but also informs the interpretation of ARIMA-generated forecasts, linking statistical outputs to actionable insights for decision-making in luxury resort management (Syahputri et al. , 2.

Figure 2. Theoretical framework
Source: Adapted from (Bawana, Mansor, & Noordin, 2.

1 Model of the study
1 ARIMA
The ARIMA model is defined by three parameters as follows:
- p (autoregressive order)
- d (degree of differencing)
- q (moving average order)
Once these components are identified, the model is estimated using the historical data. The general form of the ARIMA(p, d, q) model used in this study, written for the series after differencing d times, is expressed as follows:

$$Y_t = c + \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t$$

This formulation consists of the following:
- Autoregressive (AR) terms $\phi_i Y_{t-i}$: capturing the influence of past values of the series.
- Moving Average (MA) terms $\theta_j \varepsilon_{t-j}$: capturing the impact of past error terms.
- A constant term $c$.
- The current error term $\varepsilon_t$.
The model summary provides the estimated coefficients for these components along with their statistical significance, which is typically evaluated using p-values and confidence intervals. A well-fitting model is expected to show statistically significant parameters and low residual errors.
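As a rough illustration of how such a model is estimated from historical data, the sketch below fits the simplest special case, ARIMA(1, 0, 0), by ordinary least squares. The simulated series, the function name `fit_ar1`, and the parameter values are assumptions made for this example only; the study does not report its estimation code, and a full ARIMA(p, d, q) fit would normally rely on a dedicated statistics library.

```python
import random


def fit_ar1(series):
    """Estimate c and phi in Y_t = c + phi * Y_{t-1} + e_t by least squares.

    Returns the constant, the AR(1) coefficient, and the in-sample residuals.
    """
    lagged = series[:-1]   # Y_{t-1}
    current = series[1:]   # Y_t
    n = len(lagged)
    mean_lagged = sum(lagged) / n
    mean_current = sum(current) / n
    sxx = sum((x - mean_lagged) ** 2 for x in lagged)
    sxy = sum((x - mean_lagged) * (y - mean_current)
              for x, y in zip(lagged, current))
    phi = sxy / sxx
    c = mean_current - phi * mean_lagged
    residuals = [y - (c + phi * x) for x, y in zip(lagged, current)]
    return c, phi, residuals


# Hypothetical data: simulate an AR(1) process with known parameters.
rng = random.Random(42)
true_c, true_phi = 2.0, 0.6
simulated = [5.0]
for _ in range(499):
    simulated.append(true_c + true_phi * simulated[-1] + rng.gauss(0.0, 1.0))

c_hat, phi_hat, residuals = fit_ar1(simulated)
print(f"estimated constant: {c_hat:.3f}")
print(f"estimated AR(1) coefficient: {phi_hat:.3f}")
```

With a long enough series the estimates should land close to the simulated values, and the residuals average to zero by construction because the regression includes a constant.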
Residual diagnostics were performed to validate the adequacy of the model. The residuals, defined as the difference between the actual and predicted values, were computed as follows:

$$e_t = Y_t - \hat{Y}_t$$

2 Augmented Dickey-Fuller (ADF)
Before applying time-series forecasting models, such as ARIMA, it is essential to determine whether the series is stationary. Stationarity implies that the statistical properties of a series, such as the mean and variance, remain constant over time. This assumption is crucial for the validity of several forecasting techniques. The Augmented Dickey-Fuller (ADF) test was employed to assess stationarity. The ADF test builds upon the basic Dickey-Fuller test by incorporating lagged differences of the dependent variable to account for serial correlation in the error terms, thereby improving its robustness. The regression model used in the ADF test is expressed as

$$\Delta Y_t = \alpha + \beta t + \gamma Y_{t-1} + \sum_{i=1}^{p} \delta_i \Delta Y_{t-i} + \varepsilon_t$$

Where: $\Delta Y_t$ is the first difference of the series, $\alpha$ is a constant, $t$ is the time trend with coefficient $\beta$, $\gamma$ represents the coefficient for the lagged level of the series, $\delta_i$ are the coefficients for the lagged differences, and $\varepsilon_t$ is the error term.

2 Autocorrelation Function (ACF)
The Autocorrelation Function (ACF) is a statistical tool in time-series analysis that measures the correlation between a time series and its past values at different time lags. It helps determine the extent to which current observations in the series are influenced by historical observations, providing insight into underlying patterns such as seasonality, trends, and cyclic behavior. Mathematically, the ACF at lag $k$, denoted by $\rho_k$, is defined as follows:

$$\rho_k = \frac{E\left[(Y_t - \mu)(Y_{t-k} - \mu)\right]}{\sigma^2}$$

Where: $Y_t$ is the value of the time series at time $t$, $\mu$ is the mean of the series, $\sigma^2$ is the variance, and $E$ denotes the expected value or mean of a product.
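The ACF definition can be turned into a sample estimate in a few lines. The sketch below is a minimal illustration using only the standard library; the function name `sample_acf` and the monthly counts are hypothetical and are not the study's data.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag.

    Uses the usual estimator: the lag-k sum of products of deviations from
    the overall mean, divided by the lag-0 sum of squared deviations.
    """
    n = len(series)
    mean = sum(series) / n
    denominator = sum((value - mean) ** 2 for value in series)
    acf = []
    for lag in range(max_lag + 1):
        numerator = sum((series[t] - mean) * (series[t - lag] - mean)
                        for t in range(lag, n))
        acf.append(numerator / denominator)
    return acf


# Hypothetical monthly review counts for one year.
monthly_counts = [3, 5, 4, 6, 8, 11, 14, 13, 7, 4, 3, 6]
acf_values = sample_acf(monthly_counts, max_lag=3)

# Approximate 95% band often drawn on ACF plots for a white-noise series.
significance_bound = 1.96 / math.sqrt(len(monthly_counts))

for lag, value in enumerate(acf_values):
    flag = "significant" if lag > 0 and abs(value) > significance_bound else ""
    print(f"lag {lag}: {value:+.3f} {flag}")
```

The lag-0 value is always 1, and lags whose absolute value exceeds the band are the ones usually read as evidence of dependence.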
3 Partial Autocorrelation Function (PACF)
Whereas the Autocorrelation Function (ACF) measures the total correlation between a time series and its lagged values, the Partial Autocorrelation Function (PACF) provides a more refined perspective. The PACF isolates the direct correlation between an observation and its lagged counterpart after accounting for the effects of all shorter lags. This function is especially important for identifying the presence and order of autoregressive (AR) components in a time-series model. Mathematically, the PACF at lag $k$, denoted $\phi_{kk}$, is defined as:

$$\phi_{kk} = \operatorname{Corr}\left(Y_t,\, Y_{t-k} \mid Y_{t-1}, \ldots, Y_{t-k+1}\right)$$

Where: $Y_t$ is the value of the time series at time $t$, and the correlation is taken conditional on the intermediate values $Y_{t-1}, \ldots, Y_{t-k+1}$.

4 ARIMA model residual plot and residual density plot
The density distribution of the residuals was plotted to test whether the residuals followed a normal distribution, which is a key assumption of the ARIMA models. The normal distribution of the residuals was evaluated using the probability density function:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

Where: $\mu$ is the mean of the residuals, $\sigma^2$ is the variance of the residuals, and $f(x)$ is the probability density at a specific residual value $x$.

5 Linear regression
The analysis of the relationship between the monthly average occupancy and the number of online reviews for the Bvlgari Resort Bali, as published on TripAdvisor, revealed a strong positive linear relationship. This relationship was modeled using a simple linear regression equation of the following form:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

Where: $Y$ represents the monthly occupancy rate (%), $X$ denotes the number of online reviews, $\beta_0 = 23.17$ is the intercept of the regression line, $\beta_1 = 1.08$ is the regression coefficient, and $\varepsilon$ is the error term that captures the unexplained variation.

Results and Discussion
Results
Web scraping API (Application Programming Interface)
Web scraping refers to the automated retrieval of data using web scraper software that systematically loads and extracts information from websites based on user-defined requirements (Sirisuriya, 2. Furthermore, APIs enable software systems to interact and build applications that leverage the data or functionalities of other systems (Raatikainen et al. , 2. The initial approach in this study involved accessing the TripAdvisor website to extract online reviews from guests who stayed at the Bvlgari Resort Bali between 2022 and 2024. The page URL was copied and entered into the API software, which then processed the data through a serverless program configuration in the cloud, requiring no additional manual input. The automated extraction process rendered the review pages and parsed the HTML (HyperText Markup Language), providing an interface for software services available over the internet. This process facilitated the publication of relevant open datasets in Excel format, which are easy to analyze.

Figure 3. Dataset online reviews of Bvlgari Resort Bali on Tripadvisor
Source: Processed data by API .

Data Cleaning
According to Nongthombam and Sharma . , data cleaning involves scanning and removing inconsistent or duplicate entries. Upon examining the extracted Excel data using Python, several columns related to TripAdvisor reviews were identified, including user information, ratings, and review text. The cleaned data frame retained the columns 'text', 'title', and 'user', which are essential for the analysis. However, the 'user' column was recorded as 'user/name', necessitating an adjustment to the list of key columns to reflect the actual structure of the cleaned data frame. The next step involved deleting any rows with missing values in the 'user/name', 'text', or 'title' columns before finalizing the data-cleaning process. After these adjustments, data cleansing was successfully completed, resulting in a final dataset comprising 71 rows and 44 columns.
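The cleaning step described above can be sketched as follows. This is a minimal illustration using plain Python dictionaries rather than whatever data-frame tooling the authors used; the sample rows and the helper name `clean_reviews` are hypothetical, and the duplicate-removal rule is an assumption based on the general definition of data cleaning quoted above, not a step the study reports in detail.

```python
# Key columns named in the text; 'user/name' is the column as actually exported.
REQUIRED_COLUMNS = ("user/name", "text", "title")


def clean_reviews(rows):
    """Drop rows with a missing key column, then drop exact duplicates.

    A value counts as missing when it is None or an empty/blank string.
    Duplicates are rows whose key columns match an earlier row (assumption).
    """
    cleaned = []
    seen = set()
    for row in rows:
        values = [row.get(column) for column in REQUIRED_COLUMNS]
        if any(value is None or not str(value).strip() for value in values):
            continue  # missing user name, text, or title
        key = tuple(str(value).strip() for value in values)
        if key in seen:
            continue  # duplicate entry
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical scraped rows (not real reviews).
raw_rows = [
    {"user/name": "guest_a", "title": "Wonderful stay",
     "text": "Private villa and attentive staff.", "publishedDate": "2023-07-14"},
    {"user/name": "guest_a", "title": "Wonderful stay",
     "text": "Private villa and attentive staff.", "publishedDate": "2023-07-14"},
    {"user/name": None, "title": "No name",
     "text": "User name is missing.", "publishedDate": "2023-08-02"},
    {"user/name": "guest_b", "title": "",
     "text": "Title is empty.", "publishedDate": "2023-09-20"},
    {"user/name": "guest_c", "title": "Great for families",
     "text": "Plenty of shared spaces.", "publishedDate": "2024-01-05"},
]

cleaned_rows = clean_reviews(raw_rows)
print(f"{len(raw_rows)} raw rows -> {len(cleaned_rows)} cleaned rows")
```

Only the key columns are checked, so other columns (such as the publication date) are carried through unchanged for the later time-series steps.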
Time Series Analysis
A fundamental concept in time-series analysis is that current observations $Y_t$ are influenced by one or more lagged observations, such as $Y_{t-1}$, $Y_{t-2}$, and so forth. This temporal dependence underpins many forecasting models, including autoregressive models that leverage historical patterns to predict future values.

For this study, the dataset included a column labeled 'publishedDate', which recorded the date each online review was posted. This column is critical for time-series analysis because it serves as a temporal index anchoring each observation to a specific point in time. Before the analysis, the 'publishedDate' column was converted to a datetime format to ensure compatibility with Python's time-series functions, such as resampling and indexing. Once converted, the dataset was transformed into a datetime-indexed data frame to enable more sophisticated time-based operations.

Subsequently, the data were aggregated on a monthly basis, producing a new time series that represents the number of online reviews per month. Monthly aggregation is particularly effective for identifying broad patterns such as seasonality, cyclical trends, or long-term growth or decline. This level of temporal granularity strikes a balance between detail and the reduction of noise from the daily data. The resulting monthly review counts were visualized to provide a clearer understanding of the trends over the study period.

Figure 4. Fluctuations in the number of online reviews from 2022-2025
Source: Processed data by Python .

In this study, the Augmented Dickey-Fuller (ADF) test produced an ADF statistic of approximately 4.05, with an associated p-value of 0. Since the p-value is less than the commonly used significance level of 0.05, the null hypothesis that the series contains a unit root is rejected. This result confirms that the time series data are stationary.
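To make the logic of the unit-root test concrete, the sketch below computes the basic (non-augmented) Dickey-Fuller statistic with a constant and no trend or lagged-difference terms, which is a simplification of the ADF regression used in the study. The simulated series, the function name `dickey_fuller_t`, and the use of the large-sample 5% critical value of about -2.86 are assumptions for illustration; in practice a statistics library would also supply the p-value.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of gamma in dY_t = alpha + gamma * Y_{t-1} + e_t."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, diffs))
    gamma = sxy / sxx
    alpha = mean_y - gamma * mean_x
    sse = sum((y - alpha - gamma * x) ** 2 for x, y in zip(lagged, diffs))
    se_gamma = math.sqrt(sse / (n - 2) / sxx)
    return gamma / se_gamma


CRITICAL_5_PERCENT = -2.86  # approximate large-sample value, constant only

rng = random.Random(7)

# Hypothetical stationary series: AR(1) around a mean of 10.
stationary = [10.0]
for _ in range(199):
    stationary.append(10.0 + 0.3 * (stationary[-1] - 10.0) + rng.gauss(0.0, 1.0))

# Hypothetical non-stationary series: a random walk.
random_walk = [10.0]
for _ in range(199):
    random_walk.append(random_walk[-1] + rng.gauss(0.0, 1.0))

stationary_t = dickey_fuller_t(stationary)
random_walk_t = dickey_fuller_t(random_walk)

for name, stat in (("stationary AR(1)", stationary_t), ("random walk", random_walk_t)):
    decision = "reject unit root" if stat < CRITICAL_5_PERCENT else "cannot reject unit root"
    print(f"{name}: t = {stat:.2f} -> {decision}")
```

The null hypothesis of a unit root is rejected only when the statistic falls below the critical value, that is, when it is sufficiently negative.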
Given the stationarity of the data (i.e., the order of differencing d = 0), the time series is suitable for modeling using the Autoregressive Integrated Moving Average (ARIMA) approach. The absence of a unit root simplifies the modeling process, as no further differencing is required, and ARIMA can proceed directly using the original (or first-differenced, if necessary) series.

ARIMA
1 Autocorrelation Function (ACF)
The Autocorrelation Function (ACF) equation captures the normalized covariance between values that are $k$ periods apart, providing a standardized measure of the strength and direction of the correlation for each lag $k$. The ACF values range from -1 to 1, where values closer to 1 or -1 indicate strong positive or negative correlation, respectively, and values near zero suggest little to no correlation.

In the context of a Moving Average process of order $q$, or MA($q$), the ACF is vital for model identification. Theoretical properties of an MA($q$) process suggest that the ACF will exhibit non-zero autocorrelations only up to lag $q$, after which it will drop to zero. This characteristic allows researchers to estimate the order $q$ by observing the point at which autocorrelation becomes insignificant.

However, as shown in Figure 5, the empirical ACF plot of the time-series data reveals a gradual decline in the autocorrelation values across successive lags. Rather than dropping off sharply to zero after a specific lag (as expected in a simple MA process), the ACF values tail off slowly. This behavior indicates that the time series exhibits memory or influence extending across multiple lags, which is characteristic of a model with an MA component where $q > 0$.

Figure 5. ACF plot
Source: Processed data by Python .

The slow decay of the ACF values indicates persistent stochastic dependencies, suggesting that a simple low-order MA model may be inadequate and that multiple MA terms may be necessary.
This pattern also hints at a potential mixed ARMA or ARIMA process, requiring both autoregressive and moving average components to capture the dynamics of the data. To confirm and refine the model structure, further diagnostics, such as the Partial Autocorrelation Function (PACF) and model fitting criteria, are needed.

2 Partial Autocorrelation Function (PACF)
This expression measures the correlation between $Y_t$ and $Y_{t-k}$, conditional on all intermediate lags from $Y_{t-1}$ through $Y_{t-k+1}$. By removing indirect influences, the PACF provides a clearer view of the direct relationship between current and specific lagged values, revealing the structure of the autoregressive (AR) process.

For a pure autoregressive model of order $p$, AR($p$), the PACF will display significant partial correlations up to lag $p$ and will cut off to zero afterward. This behavior is a defining characteristic of AR processes and is essential for determining the appropriate value of $p$ in time-series modeling.

Figure 6 shows that the PACF plot for the dataset reveals a sharp cutoff after the first lag, indicating that only the first lag exhibits a statistically significant partial autocorrelation, whereas subsequent lags show little to no significant influence. This pattern strongly suggests the presence of an AR(1) process, in which the current value of the time series is primarily influenced by its immediately preceding value.

Figure 6. PACF plot
Source: Processed data by Python .

The cutoff observed in the PACF supports the inclusion of an autoregressive component with order $p = 1$ in the time-series model. When used alongside ACF analysis, which indicated the presence of a moving average component ($q > 0$), the PACF findings suggest that the appropriate model structure could be an ARIMA($p$, $d$, $q$) model with $p = 1$, $d = 0$ (as established by the ADF test confirming stationarity)
, and a suitable value for $q$ based on the ACF.

3 Residual Plot and Density Plot of Residuals
A residual plot was used to assess whether the residuals behaved like white noise, that is, fluctuating randomly around zero, with no apparent patterns or autocorrelation. In this study, the residuals were mostly centered around zero, indicating that the ARIMA model effectively captured the general trend in the data. However, the presence of several sharp spikes suggests potential outliers or latent patterns that may not have been fully addressed by this model.

As shown in Figure 7, the residual distribution was not perfectly symmetrical, indicating a degree of skewness, whereas it should ideally approximate a normal distribution (a symmetrical bell shape). Although this deviation was not severe, it suggests that the model could benefit from further refinement or transformation of the input data.

Figure 7. ARIMA model residual plot and residual density plot
Source: Processed data by Python .

Once the ARIMA components are determined and validated, the model is used to generate the forecast. Figure 8 presents a visualization of the forecast results, showing the predicted values alongside the historical data. The forecast is accompanied by confidence intervals, which provide a range of values within which future observations are expected to fall. This visualization is critical for understanding the expected trends in online review counts and aids decision-makers in resource planning and strategic forecasting.

Figure 8. Plot forecast number of monthly online reviews of Bvlgari Resort Bali
Source: Processed data by Python .
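To show where forecast intervals of this kind come from, the sketch below produces multi-step forecasts and approximate 95% intervals for an AR(1) model, the autoregressive structure suggested by the PACF. The parameter values, the last observed count, and the function name `ar1_forecast` are hypothetical, and the intervals ignore parameter-estimation uncertainty and any MA terms, so this is a simplified illustration rather than the study's fitted model.

```python
import math


def ar1_forecast(last_value, c, phi, sigma, steps, z=1.96):
    """Forecast Y_{T+h} for h = 1..steps under Y_t = c + phi * Y_{t-1} + e_t.

    The h-step forecast error variance is sigma^2 * sum(phi^(2j), j=0..h-1),
    so the interval widens with the horizon and levels off when |phi| < 1.
    Returns a list of (point forecast, lower bound, upper bound).
    """
    forecasts = []
    point = last_value
    variance = 0.0
    for horizon in range(1, steps + 1):
        point = c + phi * point
        variance += sigma ** 2 * phi ** (2 * (horizon - 1))
        half_width = z * math.sqrt(variance)
        forecasts.append((point, point - half_width, point + half_width))
    return forecasts


# Hypothetical fitted values: long-run mean is c / (1 - phi) = 5 reviews/month.
forecast_path = ar1_forecast(last_value=9.0, c=2.0, phi=0.6, sigma=1.5, steps=12)

for horizon, (point, lower, upper) in enumerate(forecast_path, start=1):
    print(f"month +{horizon:2d}: {point:5.2f}  [{lower:5.2f}, {upper:5.2f}]")
```

The point forecasts decay toward the long-run mean while the band widens, which is the typical shape of an ARIMA forecast plot for a stationary series.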
Discussion
Based on Figure 9, each additional online review is associated with an estimated 1.08% increase in the monthly occupancy rate. This demonstrates a measurable and economically meaningful association between online engagement and hotel performance. The R-squared value of 94.5% indicates that a very high proportion of the variability in occupancy rates can be explained by fluctuations in the number of online reviews. In statistical terms, this high R-squared value suggests an excellent model fit and implies that the model captures the majority of the patterns in the data. Additionally, the p-value of the regression coefficient is below 0.05, confirming that the relationship between the number of online reviews and occupancy is statistically significant.

Figure 9. Scatter plot of monthly average occupancy and number of online reviews of Bvlgari Resort Bali on TripAdvisor
Sources: Processed data by Python.

Figure 10 complements the regression model by illustrating the monthly trends of both occupancy rates and online reviews throughout the year. A clear seasonal pattern was observed. Occupancy and review volume both rise steadily from January, peak in July and August, decline sharply in September and October, and recover slightly in December. This synchronized pattern between occupancy and review counts supports the interpretation that online reviews reflect guest satisfaction and signal market demand. During peak tourism seasons, more guests stay at the resort and are likely to leave reviews, suggesting that the number of online reviews may serve as a leading or coincident indicator of demand.

Figure 10. Time-series plot train occupancy monthly and number of online reviews of Bvlgari Resort Bali on TripAdvisor
Sources: Processed data by Python.
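The regression quantities quoted above (slope, R-squared, and the significance of the coefficient) can be computed as in the following sketch. It assumes a simple linear regression of monthly occupancy on monthly review counts, which is how the text describes the model. The twelve data points and the function name `ols_fit` are hypothetical illustrations, not the study's data, so the printed values will not match the reported 1.08 and 94.5%.

```python
import math


def ols_fit(x, y):
    """Ordinary least squares for y = a + b*x.

    Returns (intercept, slope, r_squared, t_statistic_of_slope).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Standard error of the slope; the t statistic is compared with a
    # t distribution on n - 2 degrees of freedom to obtain the p-value.
    se_b = math.sqrt(ss_res / (n - 2) / sxx)
    t_stat = b / se_b if se_b > 0 else math.inf
    return a, b, r2, t_stat


# Hypothetical monthly values (January to December), NOT the paper's data.
reviews = [4, 6, 7, 9, 11, 14, 18, 19, 10, 8, 7, 12]
occupancy = [38.0, 41.5, 42.0, 45.5, 47.0, 50.5, 55.0, 56.5, 46.0, 43.5, 42.5, 48.0]

intercept, slope, r2, t_stat = ols_fit(reviews, occupancy)
print(f"slope = {slope:.2f} percentage points of occupancy per review")
print(f"R-squared = {r2:.3f}, t statistic of slope = {t_stat:.1f}")
```

With 12 monthly observations there are 10 degrees of freedom, so a slope t statistic above roughly 2.23 in absolute value corresponds to a two-sided p-value below 0.05.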
The analysis of the Bvlgari Resort Bali's occupancy trends and guest feedback revealed a strong positive relationship between the number of online reviews and monthly occupancy rates, with a linear regression model showing that each additional review is associated with a 1.08% increase in occupancy. This model demonstrated high predictive accuracy, with an R-squared value of 94.5% and statistically significant results (p-value < 0.05). However, further insights from guest reviews and cultural preferences suggest that relying solely on numerical review counts may overlook important behavioral factors that affect hotel performance. A word cloud generated from TripAdvisor reviews highlights key themes such as "experience," "hospitality," "luxury," "service," and "privacy," reflecting emotional and cultural drivers.

Figure 11. Word-cloud Bvlgari Resort Bali guest reviews on TripAdvisor
Sources: Processed data by Python.

Guests from collectivist cultures tended to value social interaction and communal experiences, whereas those from individualist cultures prioritized privacy and exclusivity, as shown in Figure 12. If these cultural distinctions are not incorporated into the predictive model, their omission may lead to inaccurate forecasts and residual patterns that violate the regression assumptions. Therefore, incorporating cultural preferences into the model could enhance its robustness and relevance. Strategically, the resort can benefit from segmenting guest experiences by offering private, serene services for individualists and group-oriented, culturally immersive activities for collectivists. Moreover, culturally tailored marketing and review solicitation strategies can further improve guest satisfaction and online engagement, ultimately strengthening resort occupancy performance and market positioning.

Figure 12.
Bvlgari Resort Bali guest preference pie diagram on Tripadvisor
Sources: Processed data by Python.

Conclusion
Conclusion
The results of the analysis demonstrate that seasonal patterns significantly affect occupancy levels at the Bvlgari Resort Bali. Through time-series analysis using the ARIMA model, we found that occupancy variations do not always align with the predefined seasonal categories set by the hotel. This suggests that additional elements, such as evolving tourist travel trends and changing guest preferences, shape occupancy patterns. Moreover, an analysis of online reviews on TripAdvisor revealed that guest preferences are also influenced by cultural backgrounds. Guests from collectivist cultures tend to prioritize community-related amenities, whereas guests from individualist cultures favor private experiences. The ARIMA model proved effective in forecasting future occupancy trends, supported by graphs showing review trends that aligned with occupancy rates and reflected forecast uncertainty.

Limitation
This study offers valuable insights; however, several limitations should be acknowledged. Key variables that may influence occupancy levels, such as macroeconomic factors, political conditions, and global travel disruptions, have not been fully considered. Furthermore, the heavy reliance on online reviews may introduce bias, as not all guests share their feedback on digital platforms. Focusing on a single resort also limits the generalizability of the findings to broader hospitality contexts. Therefore, future research should incorporate a wider range of external factors and include multiple properties across different hotel categories. This approach will provide a more comprehensive perspective and enhance the relevance of the findings for the hospitality industry.
Suggestion
Based on the findings and limitations, it is recommended that the Bvlgari Resort Bali adopt a data-driven marketing strategy focused on peak occupancy months. Dynamic pricing models should be employed to match demand fluctuations, and promotional events should be planned for the predicted low-occupancy periods. Additionally, tailoring offerings to cultural preferences and using big data analytics in strategic decision-making will enhance the competitiveness and operational sustainability of the hospitality industry.

References