Depression Detection on Social Media X Using Hybrid Deep Learning CNN-BiGRU with Attention Mechanism and FastText Feature Expansion
DOI:
https://doi.org/10.26555/jiteki.v11i2.30687
Keywords:
Mental Health, CNN, BiGRU, FastText, Attention Mechanism
Abstract
Depression is a mental health disorder that affects over 280 million people worldwide, and societal stigma makes sufferers difficult to identify. In Indonesia, the 2022 National Adolescent Mental Health Survey found that 17.95 million adolescents experience mental health disorders, a portion of whom suffer from depression. The social media platform X gives individuals a way to share their mental health status anonymously, bypassing that stigma. This study proposes a hybrid deep learning model that combines CNN and BiGRU with an attention mechanism, using TF-IDF for feature extraction and FastText for feature expansion, to detect depression in Indonesian tweets. The dataset comprises 50,523 Indonesian tweets, supplemented by a similarity corpus of 151,117 entries. To optimize model performance, five experimental scenarios were run, covering split ratios, n-gram configurations, maximum features, feature expansion, and attention mechanisms. The main contribution of this research is the integration of FastText feature expansion and the attention mechanism within a CNN-BiGRU hybrid model for depression detection. The results demonstrate the effectiveness of this combination, with the BiGRU-ATT-CNN-ATT model achieving an accuracy of 84.40%. Challenges remain, however, in handling noisy, ambiguous social media data and out-of-vocabulary words. Future research should explore additional feature expansion techniques, optimization algorithms, and approaches to noisy data in order to improve model robustness for real-world mental health detection.
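The abstract does not spell out how feature expansion operates on the TF-IDF vectors. The following is a minimal sketch of one common reading, not the authors' implementation: a vocabulary term that is absent from a tweet is switched on when one of its most similar words appears in that tweet. The function name `expand_features`, the toy `SIMILARITY` table, the `top_n` parameter, and the choice of 1.0 as the fill value are all assumptions made for illustration; in the study, a table of this kind would presumably be derived from FastText embeddings over the similarity corpus.

```python
from typing import Dict, List

# Hypothetical similarity table: each vocabulary term maps to its most
# similar words, ranked. The entries are made up for illustration.
SIMILARITY: Dict[str, List[str]] = {
    "sedih": ["murung", "galau", "kecewa"],
    "lelah": ["capek", "letih", "penat"],
}


def expand_features(
    vector: List[float],
    vocabulary: List[str],
    tokens: List[str],
    similarity: Dict[str, List[str]],
    top_n: int = 3,
) -> List[float]:
    """Return a copy of `vector` in which zero-valued features are set to 1.0
    when one of the term's top-n similar words occurs in `tokens`."""
    present = set(tokens)
    expanded = list(vector)
    for i, term in enumerate(vocabulary):
        if expanded[i] != 0.0:
            continue  # the term already occurs in the tweet; leave its weight
        neighbours = similarity.get(term, [])[:top_n]
        if any(word in present for word in neighbours):
            expanded[i] = 1.0  # assumed fill value, not taken from the paper
    return expanded


vocabulary = ["sedih", "lelah", "senang"]
tokens = ["aku", "capek", "banget"]   # tweet tokens after preprocessing
vector = [0.0, 0.0, 0.0]              # no vocabulary term occurs literally
expanded = expand_features(vector, vocabulary, tokens, SIMILARITY)
print(expanded)  # [0.0, 1.0, 0.0]
```

In this toy case the tweet contains "capek", which the table lists as similar to "lelah", so the "lelah" feature is activated even though the word itself never occurs. This is the sense in which expansion can soften the out-of-vocabulary problem the abstract mentions.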
License
Copyright (c) 2025 I Wayan Abi Widiarta, Erwin Budi Setiawan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.