Performance Comparison of BERT Metrics and Classical Machine Learning Models (SVM, Naive Bayes) for Sentiment Analysis
DOI:
https://doi.org/10.35314/wmh3rg23

Keywords:
BERT, SVM, Naïve Bayes, Sentiment Analysis, Performance Metrics

Abstract
Sentiment analysis is an important method for understanding public opinion from large amounts of text, such as product reviews or user comments. Many studies have shown that the BERT (Bidirectional Encoder Representations from Transformers) model has advantages over classical machine learning models such as Support Vector Machine (SVM) and Naïve Bayes. However, few studies systematically compare the performance of the two on datasets spanning different topics and languages, especially those with imbalanced label distributions. This study compares four BERT variants (bert-base-uncased, distilbert-base-uncased, indobert-base-uncased, and distilbert-base-indonesian) with two classical models on three datasets: IMDb 50K (English), Amazon Food Reviews (English), and Gojek App Review (Indonesian). The classical models use TF-IDF vectorisation, while the BERT models are optimised through further training (fine-tuning) with a layer-freezing technique. Evaluation is carried out using accuracy, precision, recall, and F1-score. The results show that the BERT models excel on the English data, while on the imbalanced Indonesian data, SVM and Naïve Bayes produce higher F1-scores. These findings indicate that model selection must be adjusted to the characteristics of the data used.
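The classical baseline described in the abstract can be sketched as follows. This is a minimal illustration, not the study's actual code: the toy texts and labels are hypothetical stand-ins for the IMDb, Amazon, and Gojek datasets, and default scikit-learn hyperparameters are assumed.

```python
# Sketch of the classical baseline: TF-IDF vectorisation feeding an SVM
# and a Naive Bayes classifier, scored with accuracy and F1 as in the study.
# The tiny corpus below is illustrative only, not the paper's data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score, f1_score

train_texts = ["great product, loved it", "terrible app, keeps crashing",
               "excellent service, highly recommended", "worst purchase ever"]
train_labels = [1, 0, 1, 0]          # 1 = positive, 0 = negative
test_texts = ["loved the excellent service", "terrible, crashing constantly"]
test_labels = [1, 0]

results = {}
for name, clf in [("SVM", LinearSVC()), ("NaiveBayes", MultinomialNB())]:
    model = make_pipeline(TfidfVectorizer(), clf)  # TF-IDF features -> classifier
    model.fit(train_texts, train_labels)
    preds = model.predict(test_texts)
    results[name] = {"accuracy": accuracy_score(test_labels, preds),
                     "f1": f1_score(test_labels, preds)}
    print(name, results[name])
```

For the BERT variants, the abstract's layer-freezing fine-tuning would instead disable gradients on the lower encoder layers (e.g. setting `requires_grad = False` on their parameters) before training the remaining layers and classification head.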
Downloads
License
Copyright (c) 2025 INOVTEK Polbeng - Seri Informatika

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.