Analysis of Differences Between AI and Human Texts Using the Natural Language Processing Method

Dinda Cahyana; VitoReyLukito Sijabat; Mohammad Irfan Fahmi

doi:10.35314/3wqgd409

Authors

Dinda Cahyana, Universitas Prima Indonesia
VitoReyLukito Sijabat, Universitas Prima Indonesia
Mohammad Irfan Fahmi, Universitas Prima Indonesia

Keywords:

Artificial Intelligence, NLP, human writing, linguistic study, generative text

Abstract

Artificial intelligence (AI) has become increasingly proficient at generating text that mimics human writing, yet existing detection tools remain limited in accuracy and adaptability. Previous studies indicate that systems such as Turnitin and GPTZero often achieve less than 80% accuracy and struggle with paraphrased or more sophisticated AI-generated content. This study addresses that gap by analyzing linguistic differences between AI-generated and human-written texts using natural language processing (NLP). A dataset of 487,235 texts (305,797 human-written and 181,438 AI-generated) was vectorized with TF-IDF and classified with the Multinomial Naive Bayes algorithm. The model achieved 99.35% accuracy and an F1-score of 0.9948, with balanced performance on both text types. The results show that AI-generated texts are structurally consistent but often lack the emotional depth and cultural nuance found in human writing. These findings suggest that NLP methods are highly effective at distinguishing the two, with practical implications for developing more reliable detection systems to ensure textual authenticity in education, journalism, and digital media monitoring.
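
To make the pipeline named in the abstract concrete, the following is a minimal sketch of TF-IDF weighting followed by a Multinomial Naive Bayes classifier, written with the Python standard library only. It is an illustration under stated assumptions, not the authors' implementation: the class name `TfidfNaiveBayes`, the regex tokenizer, the smoothed IDF formula, the Laplace smoothing value `alpha=1.0`, and the four toy sentences are all hypothetical, and no L2 normalization is applied. In practice a library such as scikit-learn (`TfidfVectorizer` with `MultinomialNB`) would likely be used instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


class TfidfNaiveBayes:
    """Toy TF-IDF + Multinomial Naive Bayes classifier (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def _tfidf(self, tokens):
        # Raw term frequency times IDF; tokens outside the vocabulary are ignored.
        counts = Counter(t for t in tokens if t in self.idf)
        return {t: n * self.idf[t] for t, n in counts.items()}

    def fit(self, texts, labels):
        docs = [tokenize(t) for t in texts]
        n_docs = len(docs)
        # Document frequency: in how many documents each term occurs.
        df = Counter(term for doc in docs for term in set(doc))
        # Smoothed inverse document frequency.
        self.idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
        self.classes = sorted(set(labels))
        self.log_prior = {c: math.log(labels.count(c) / n_docs) for c in self.classes}
        # Accumulate TF-IDF mass per class and term.
        mass = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            for term, weight in self._tfidf(doc).items():
                mass[label][term] += weight
        # Smoothed per-class log-likelihood of each vocabulary term.
        vocab_size = len(self.idf)
        self.log_likelihood = {}
        for c in self.classes:
            total = sum(mass[c].values()) + self.alpha * vocab_size
            self.log_likelihood[c] = {
                t: math.log((mass[c].get(t, 0.0) + self.alpha) / total)
                for t in self.idf
            }
        return self

    def predict(self, text):
        weights = self._tfidf(tokenize(text))
        scores = {
            c: self.log_prior[c]
            + sum(w * self.log_likelihood[c][t] for t, w in weights.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Hypothetical toy data; the study's real corpus has 487,235 texts.
train_texts = [
    "honestly i loved the trip, my grandma cried when we arrived",
    "ugh the bus was late again and i missed my class",
    "in conclusion, the aforementioned factors collectively contribute to the outcome",
    "overall, it is important to note that several factors contribute to this",
]
train_labels = ["human", "human", "ai", "ai"]

model = TfidfNaiveBayes().fit(train_texts, train_labels)
print(model.predict("it is important to note the factors"))
print(model.predict("my grandma missed the bus"))
```

Four sentences cannot say anything about real-world accuracy; the sketch only shows how term weights and class-conditional probabilities combine into a prediction. Reproducing the reported 99.35% accuracy would require the study's dataset and its exact preprocessing, neither of which is specified here.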