Performance Analysis of Prediction Methods on Tokyo Airbnb Data: A Comparative Study of Hyperparameter-Tuned XGBoost, ARIMA, and LSTM

Authors

  • Rizal Farhan Nabila Nurfalah, Institut Teknologi Dan Bisnis STIKOM BALI
  • Dandy Pramana Hostiadi, Institut Teknologi Dan Bisnis STIKOM BALI
  • Evi Triandini, Institut Teknologi Dan Bisnis STIKOM BALI

DOI:

https://doi.org/10.26555/jiteki.v11i2.30631

Keywords:

Hyperparameter tuning, Machine Learning, Time Series Analysis, Prediction Methods, Occupancy Rate, Optimization

Abstract

The rapid growth of the digital economy has made accurate prediction of Airbnb occupancy rates increasingly important, especially in dynamic and competitive markets such as Tokyo, Japan. Property owners find occupancy hard to forecast because the data exhibit seasonal patterns, non-linear trends, and complex temporal dependencies. To address these challenges, this study compares the performance of ARIMA, XGBoost, and LSTM models in predicting Airbnb occupancy rates in Tokyo. The dataset is collected from Airbnb listings and includes features such as location, price, customer reviews, and historical occupancy rates. Hyperparameters were tuned with Grid Search for ARIMA and Random Search for XGBoost and LSTM to identify the best configuration for each model. The models were evaluated with Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the Coefficient of Determination (R²), metrics suited to regression tasks. XGBoost achieves the highest R² (0.23), followed by LSTM (0.19) and ARIMA (0.03). These low R² values show that all three models struggle to capture the variation in occupancy rates, which points to the potential influence of unmodeled external factors such as seasonal effects and policy changes. The study highlights the importance of hyperparameter tuning for prediction accuracy and contributes an in-depth comparison of regression-based models for Airbnb occupancy forecasting.
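The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` and the occupancy values are hypothetical and chosen only to show how MAE, RMSE, and R² are computed from observed and predicted values.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R² for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute deviation between observation and prediction.
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the mean squared error; penalises large misses more.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R²: share of the variance in y_true explained by the predictions.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical occupancy rates (fractions of booked nights), for illustration only.
observed = [0.60, 0.75, 0.40, 0.90, 0.55]
predicted = [0.65, 0.70, 0.50, 0.80, 0.60]

metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under this definition, an R² of 0.23 means the predictions account for roughly 23% of the variance in the observed occupancy rates, and a model that always predicts the mean of the observations scores 0.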

Published

2025-04-22

How to Cite

R. F. N. Nurfalah, D. P. Hostiadi, and E. Triandini, “Performance Analysis of Prediction Methods on Tokyo Airbnb Data: A Comparative Study of Hyperparameter-Tuned XGBoost, ARIMA, and LSTM,” J. Ilm. Tek. Elektro Komput. Dan Inform, vol. 11, no. 2, pp. 184–193, Apr. 2025.

Section

Articles
