Comparison of Effectiveness of Machine Learning Methods in Predicting Chemical Compound Toxicity Enhance Pharmaceutical Product Safety

Dufan Yuwana; Pulung Andono; Hendy Kurniawan

doi:10.35314/emkzcz13

Authors

Dufan Yuwana, Universitas Dian Nuswantoro
Prof. Dr. Pulung Nurtantio Andono, S.T., M.Kom., Universitas Dian Nuswantoro
Hendy Kurniawan, Universitas Dian Nuswantoro

Keywords:

Machine Learning, Toxicity Prediction, Gradient Boosting, Model Validation, Pharmaceutical Safety

Abstract

This study compares machine learning methods for predicting the toxicity of chemical compounds, using a dataset of 5,000 samples with 14 key features. Preprocessing included normalization, missing-data handling, and oversampling to address class imbalance. Four tree-based models (Decision Tree, Random Forest, Extra Trees, and Gradient Boosting) were validated with k-fold cross-validation and evaluated on accuracy, precision, recall, and F1-score. Gradient Boosting performed best, with 92.3% accuracy, though it remains prone to overfitting and is difficult to interpret. Compared with in vitro and in vivo methods, machine learning is more efficient, but its predictions still require experimental validation. The study recommends further model optimization through ensemble learning and explainable AI to improve prediction reliability.
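The evaluation protocol named in the abstract (k-fold cross-validation, oversampling, and the four reported metrics) can be illustrated with a minimal plain-Python sketch. This is not the authors' implementation: the function names are hypothetical, the abstract does not state the value of k or the oversampling technique, and simple random oversampling of a binary toxic/non-toxic label is assumed here. Applying the oversampling only to each training fold, never to the held-out fold, is likewise an assumption about sound practice rather than something the abstract specifies.

```python
import random
from collections import Counter


def kfold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Indices are shuffled once, then dealt into k near-equal folds;
    each fold serves as the held-out test set exactly once.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes
    have as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In use, each `(train_idx, test_idx)` pair from `kfold_splits` would select the training rows, `random_oversample` would balance only those rows, a classifier would be fit on the balanced set, and `classification_metrics` would score its predictions on the untouched test fold. Averaging the per-fold metrics gives the cross-validated estimate for each model.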

Published

26-02-2025

Issue

Vol. 10 No. 1 (2025): March

Section

Articles

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

How to Cite

Comparison of Effectiveness of Machine Learning Methods in Predicting Chemical Compound Toxicity Enhance Pharmaceutical Product Safety. (2025). INOVTEK Polbeng - Seri Informatika, 10(1), 458-469. https://doi.org/10.35314/emkzcz13