Comparative Analysis 0f Random Forest and Xgboost Performance for Network Flow Based Malware Classification
DOI:
https://doi.org/10.35314/8f891c76Keywords:
Malware Detection, Network Flow, Random Forest, XGBoost, Computational EfficiencyAbstract
The evolving complexity of cyber threats, particularly malware propagation through network infrastructure, necessitates intrusion detection mechanisms that are both precise and computationally efficient. This study presents an in-depth comparative analysis of two ensemble learning algorithms, Random Forest (RF) and Extreme Gradient Boosting (XGBoost), in classifying network traffic anomalies based on network flow features. Empirical validation was conducted using the CSE-CIC-IDS2018 dataset, which comprehensively represents a spectrum of modern attacks. The research methodology systematically includes data preprocessing, handling class imbalance via weighting techniques, and performance evaluation based on accuracy, F1-score, and inference time metrics. Experimental results indicate that both models achieved high performance convergence with perfect Area Under Curve (AUC) scores. However, XGBoost demonstrated technical superiority with an accuracy of 99.8%, slightly surpassing Random Forest at 99.4%. The most significant finding of this study lies in computational efficiency, where XGBoost proved to be 14% faster (6.36 seconds) in prediction compared to Random Forest (7.42 seconds) on a large-scale test set. This fact confirms that the boosting architecture in XGBoost offers an optimal balance between detection sensitivity and system latency. Based on this evidence, XGBoost is recommended as the best classification model for real-time intrusion detection system implementations that prioritize rapid threat response.
Downloads
Published
Issue
Section
License
Copyright (c) 2026 INOVTEK Polbeng - Seri Informatika

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

