DETECTION OF DEPRESSION RISK IN INDONESIAN SOCIAL MEDIA USERS USING INDOBERTWEET

Authors

  • Suprianto
  • Dini Nur Wahyu Ningsih, Universitas Muhammadiyah Sidoarjo

Keywords:

IndoBERTweet, Depression, Social Media, Natural Language Processing (NLP), Digital Mental Health

Abstract

This study developed a depression risk detection model for Indonesian-speaking social media users using IndoBERTweet, a pre-trained transformer-based language model optimized specifically for Indonesian Twitter text. Data were collected from 5,000 public Twitter posts between January and June 2024 using keywords related to negative emotional expressions and psychological terms. The posts were labeled using a combination of a translated PHQ-9 (Patient Health Questionnaire-9) lexicon and manual annotation by three clinical psychologists, yielding high inter-rater agreement (Cohen's Kappa of 0.82). The IndoBERTweet model was fine-tuned and compared with several baseline models, including SVM with TF-IDF, LSTM with Word2Vec, and IndoBERT-base. IndoBERTweet achieved the highest accuracy of 91.4% and an F1 score of 0.89, outperforming the baseline models by a margin of 7–10% in F1 score. These findings demonstrate that a local-language transformer model is highly effective at capturing the contextual and informal nuances of Indonesian used to express depression on social media. This research contributes to the development of a Natural Language Processing (NLP)-based early mental health screening system within the Indonesian linguistic and cultural context.
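
For readers who want a concrete picture of the fine-tuning step summarized above, the sketch below shows one way such a setup could be implemented with the Hugging Face transformers library. It is a minimal illustration under stated assumptions, not the authors' code: the checkpoint id indolem/indobertweet-base-uncased, the hyperparameters, and the two toy example tweets are assumptions, and the actual study would train on the 5,000 labeled posts described in the abstract.

```python
# Minimal fine-tuning sketch with the Hugging Face Trainer API.
# Assumptions (not taken from the paper): the hub checkpoint id
# "indolem/indobertweet-base-uncased", all hyperparameters, and the
# two toy tweets used here in place of the real labeled corpus.
import numpy as np
from datasets import Dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "indolem/indobertweet-base-uncased"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

# Hypothetical labeled tweets: 1 = at risk of depression, 0 = not at risk.
data = Dataset.from_dict({
    "text": ["aku capek banget sama hidup, rasanya mau nyerah",
             "hari ini seru banget jalan-jalan sama teman"],
    "label": [1, 0],
})

def tokenize(batch):
    # Truncate/pad tweets to a fixed length before feeding the encoder.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    # Report the same metrics the paper uses: accuracy and (binary) F1.
    preds = np.argmax(eval_pred.predictions, axis=-1)
    return {"accuracy": accuracy_score(eval_pred.label_ids, preds),
            "f1": f1_score(eval_pred.label_ids, preds)}

args = TrainingArguments(
    output_dir="indobertweet-depression",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=data, eval_dataset=data,
                  compute_metrics=compute_metrics)
trainer.train()
print(trainer.evaluate())
```

In the study itself, the comparison against the SVM + TF-IDF, LSTM + Word2Vec, and IndoBERT-base baselines would use the same labeled data split and the same accuracy/F1 metrics so that the reported 7–10% F1 margin is directly comparable.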


Published

2025-12-18