Comparative Analysis of Machine Learning Models for BUMN Bank Stock Sentiment Classification During Danantara Formation Period

Authors

  • Hafizha Nurul Qolby, Universitas Pendidikan Indonesia
  • Rangga Gelar Guntara, Universitas Pendidikan Indonesia
  • Syti Sarah Maesaroh, Universitas Pendidikan Indonesia

DOI:

https://doi.org/10.35314/91z79392

Keywords:

sentiment analysis, cosine similarity, machine learning, stocks, SOE

Abstract

Discussions about state-owned bank stocks (BBRI, BBNI, and BMRI) on platform X intensified during the formation of Danantara. However, the correlation between social media sentiment and stock movements remains weak due to high noise levels and potential buzzer activity. This study combines sentiment analysis with text similarity analysis (cosine similarity) to identify repeated communication patterns in discussions related to state-owned bank stocks. A total of 1,086 tweets were manually labeled and verified by two independent validators. Text features were represented using TF–IDF and evaluated with four classical machine learning algorithms: Naïve Bayes, Logistic Regression, Support Vector Machine, and XGBoost. The models were validated using a hold-out scheme (80:20) and assessed with a confusion matrix. The sentiment distribution of the dataset is 53% negative and 47% positive tweets. Logistic Regression achieved the highest accuracy, at 66%. The cosine similarity analysis identified 1.8% of tweets with similarity ≥0.90, indicating that recurring communication patterns were limited. These findings suggest that integrating sentiment and text similarity analyses can serve as an initial approach to detecting indications of coordinated activity and to understanding the dynamics of public opinion on state-owned bank stocks.
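
The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF variant, the function names, and the reading of "1.8% of tweets" as the share of tweets that appear in at least one pair at or above the 0.90 threshold are all assumptions made here for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document.

    Assumption: lowercase whitespace tokenization and a smoothed IDF,
    ln((1 + N) / (1 + df)) + 1. The abstract does not specify either.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def near_duplicate_pairs(docs, threshold=0.90):
    """Index pairs (i, j), i < j, whose cosine similarity meets the threshold."""
    vectors = tfidf_vectors(docs)
    return [
        (i, j)
        for i, j in combinations(range(len(docs)), 2)
        if cosine(vectors[i], vectors[j]) >= threshold
    ]


def flagged_share(docs, threshold=0.90):
    """Share of documents that take part in at least one near-duplicate pair."""
    flagged = {i for pair in near_duplicate_pairs(docs, threshold) for i in pair}
    return len(flagged) / len(docs) if docs else 0.0
```

Calling `flagged_share(tweets)` on a list of tweet strings returns the fraction flagged under these assumptions. The pairwise loop is quadratic in the number of tweets, which remains manageable for a corpus of about a thousand items such as the one described here.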

Published

15-11-2025

Section

Articles

How to Cite

Comparative Analysis of Machine Learning Models for BUMN Bank Stock Sentiment Classification During Danantara Formation Period. (2025). INOVTEK Polbeng - Seri Informatika, 10(3), 1718-1729. https://doi.org/10.35314/91z79392