
Corresponding Author

Halime Nuran CANER

Document Type

Original Study

Keywords

AI detection tools, Generative AI, Academic integrity, Systematic review, PRISMA, Authorship verification

Abstract

The rapid advancement of generative artificial intelligence has significantly transformed academic writing practices, prompting institutions to implement tools designed to verify authorship and uphold academic integrity. Artificial intelligence detection systems have emerged as a prominent, albeit increasingly debated, response to these challenges. This systematic review synthesizes empirical evidence to assess the reliability, fairness, and pedagogical implications of artificial intelligence text-detection tools in educational settings. Adhering to PRISMA 2020 standards, the review identified 25 peer-reviewed empirical studies, 18 of which were conducted directly in educational settings. The synthesis covers studies published between 2022 and 2025, encompassing quantitative, qualitative, and mixed-methods designs across diverse disciplinary and linguistic contexts. The findings indicate that AI text-detection tools, in their current form, are unsuitable for high-stakes academic integrity decisions. Detection accuracy is highly variable and unstable across tools, genres, and linguistic backgrounds; the tools are markedly vulnerable to paraphrasing, translation, and other adversarial techniques; and systemic biases disproportionately affect non-native English writers. Human judgment was also found to be inconsistent, reinforcing the difficulty of reliably distinguishing AI-generated text from human-authored text. Collectively, these results raise significant ethical, pedagogical, and institutional concerns. This review underscores the need for integrity strategies that prioritize transparency, AI literacy, fairness-aware design, and process-based assessment rather than detection-centered approaches.
The findings also point to the need for hybrid approaches that combine watermarking and fairness-aware detection algorithms with process-oriented assessment, AI literacy initiatives, and cross-linguistic benchmarking, alongside interpretability-focused and longitudinal research on students' perceptions of AI detection.

Publication Date

12-31-2025
