Article Summary
Jul 21, 2025

ChatGPT demonstrates a moderately positive impact on students' academic achievement across various educational contexts, with effectiveness varying significantly based on discipline, intervention duration, sample size, and instructional approach.

Objective: This comprehensive meta-analysis aimed to systematically evaluate the overall effectiveness of ChatGPT in enhancing students' academic achievement and identify key moderating factors that influence its educational impact. The researchers sought to address the lack of academic consensus regarding ChatGPT's effectiveness in education by synthesizing findings from multiple empirical studies and examining how various contextual factors affect learning outcomes.

Methods: The study employed a rigorous meta-analysis methodology following PRISMA guidelines, analyzing 37 studies comprising 37 effect sizes published between 2022 and 2025. The researchers conducted systematic searches across multiple databases including Web of Science, ProQuest, ScienceDirect, EBSCOhost, PsycINFO, IEEE Xplore, and Scopus. They calculated overall effect sizes using Hedges' g and employed the random-effects model due to high heterogeneity among studies. The analysis included comprehensive moderator analyses examining nine variables: educational level (K-12 vs. college), discipline (information engineering, natural sciences, social sciences), intervention duration, sample size, knowledge type (declarative vs. procedural), instructional model (traditional vs. flipped classroom), role-setting, learning approach (collaborative vs. individual), and generated content type (text vs. code). Quality assessment was conducted using established criteria, and publication bias was evaluated through multiple statistical tests.
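For context on the pooling step described above, here is a minimal sketch (not the authors' code) of how a random-effects meta-analysis combines per-study Hedges' g values into an overall effect, using the DerSimonian-Laird estimator for between-study variance. The effect sizes, standard errors, and the helper function name below are purely illustrative, not values from the included studies.

```python
import numpy as np

def random_effects_pool(g, se):
    """Pool per-study Hedges' g values with a DerSimonian-Laird
    random-effects model (illustrative sketch only)."""
    g, se = np.asarray(g, float), np.asarray(se, float)
    v = se ** 2                          # within-study variances
    w_fixed = 1.0 / v                    # fixed-effect weights
    g_fixed = np.sum(w_fixed * g) / np.sum(w_fixed)
    Q = np.sum(w_fixed * (g - g_fixed) ** 2)          # Cochran's Q (heterogeneity)
    df = len(g) - 1
    c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
    tau2 = max(0.0, (Q - df) / c)        # between-study variance (DL estimator)
    w_re = 1.0 / (v + tau2)              # random-effects weights
    g_re = np.sum(w_re * g) / np.sum(w_re)
    se_re = np.sqrt(1.0 / np.sum(w_re))
    ci = (g_re - 1.96 * se_re, g_re + 1.96 * se_re)   # 95% confidence interval
    return g_re, se_re, ci, tau2, Q

# Hypothetical per-study effect sizes and standard errors, for illustration only
g_values  = [0.42, 0.81, 0.55, 0.30, 0.95]
se_values = [0.15, 0.20, 0.12, 0.18, 0.25]
print(random_effects_pool(g_values, se_values))
```

The random-effects model is preferred over a fixed-effect model when, as here, heterogeneity across studies is high, because it allows the true effect to vary from study to study rather than assuming a single common effect.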

Key Findings: The meta-analysis revealed several significant findings regarding ChatGPT's educational effectiveness. The overall effect size was g=0.577 (95% CI [0.395, 0.759], p<0.001), indicating a moderately positive impact on student academic achievement. Moderator analysis showed that ChatGPT was most effective in social sciences compared to natural sciences and information engineering, with significant intergroup differences. The optimal intervention duration was 5-10 weeks, showing larger effects than shorter or longer periods. Sample sizes of 21-40 participants demonstrated the greatest effectiveness, suggesting an optimal balance between personalized attention and resource utilization. ChatGPT proved more effective for learning declarative knowledge (facts and concepts) than procedural knowledge (skills and processes). Traditional classroom settings combined with ChatGPT showed superior results compared to flipped classroom approaches. Text generation tasks yielded better academic achievement outcomes than code generation tasks. Notably, educational level, role-setting, and learning approach did not show significant moderating effects.
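As a rough consistency check not stated in the paper itself: the 95% confidence interval [0.395, 0.759] is symmetric around g = 0.577 with a half-width of 0.182, implying a standard error of roughly 0.182 / 1.96 ≈ 0.093 and a z-statistic of about 6.2, consistent with the reported p < 0.001. By conventional benchmarks (about 0.2 small, 0.5 medium, 0.8 large), g = 0.577 falls in the medium range, matching the "moderately positive" characterization.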

Implications: These findings provide crucial evidence-based guidance for educators and policymakers regarding ChatGPT integration in educational settings. The research demonstrates that ChatGPT is not universally effective across all contexts but requires strategic implementation considering specific moderating factors. The superior effectiveness in social sciences suggests that ChatGPT's natural language processing capabilities align well with humanities-based learning, while its limitations in technical fields highlight the need for specialized AI tools or hybrid approaches. The optimal intervention duration of 5-10 weeks provides practical guidance for curriculum planning, suggesting that moderate exposure periods maximize benefits while avoiding over-reliance. The study supports the development of targeted teaching strategies that leverage ChatGPT's strengths while addressing its limitations across different educational contexts.

Limitations: The study acknowledges several important limitations that may affect the generalizability of findings. Some moderator categories had small sample sizes, restricting analysis of interaction effects between variables. The exclusion of non-English studies may limit cultural and linguistic diversity in the findings. The focus solely on ChatGPT, while excluding other generative AI tools, may not capture the full spectrum of AI educational applications. The analysis was limited to quantitative studies with available effect size data, potentially missing valuable qualitative insights. Additionally, the rapid evolution of AI technology means that findings based on earlier ChatGPT versions may not fully apply to newer iterations.

Several methodological constraints should also be considered when interpreting these results. The heterogeneity among included studies, while addressed through random-effects modeling, still complicates drawing unified conclusions. The relatively short timeframe since ChatGPT's release (November 2022) limited both the available research pool and the assessment of long-term outcomes. Publication bias, though statistically tested and found minimal, remains a potential concern in an emerging research area.

Future Directions: The researchers recommend several important avenues for future investigation. Expanding research to include larger sample sizes would enable more robust analysis of moderator interactions, particularly between sample size and learning approach variables. Incorporating studies in multiple languages would enhance global applicability and cultural understanding of ChatGPT's educational impact. Comparative studies examining different generative AI tools would provide broader insights into AI-assisted learning effectiveness. Longitudinal research investigating long-term retention and skill development would address current gaps in understanding sustained educational benefits. Development of specialized AI tools tailored for specific disciplines, particularly in STEM fields where ChatGPT showed limited effectiveness, represents a crucial research priority. Additionally, qualitative research exploring student and teacher experiences would complement quantitative findings and inform best practices for implementation.

Title and Authors: "The Impact of ChatGPT on Students' Academic Achievement: A Meta-Analysis" by Zhiwei Liu, Haode Zuo, and Yongjing Lu from the College of Mathematical Science, Yangzhou University, Yangzhou, China.

Published On: Accepted July 6, 2025; published in 2025

Published By: Journal of Computer Assisted Learning, published by John Wiley & Sons Ltd.
