Article Summary
Feb 03, 2025
An explainable machine learning model detects AI-generated pseudocode in K-12 programming education with 98.97% accuracy and highlights key differences between student-written and AI-generated pseudocode.

Objective: To develop an explainable machine learning approach for detecting ChatGPT-generated pseudocode in K-12 programming education and understand the key differences between student and AI-generated pseudocode.

Methods:

  • Analyzed 7,838 pseudocode submissions from 2,578 high school students (2020-2023)
  • Generated 6,300 pseudocode samples using three ChatGPT models
  • Developed an ensemble machine learning model for detection
  • Used SHAP values for model interpretability
  • Conducted human evaluation of misclassified submissions
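
To make the idea of an ensemble detector concrete, here is a minimal sketch in plain Python. The summary does not say which member models the authors used or how their outputs are combined, so soft voting (averaging each member's probability) is assumed purely for illustration. The three toy member functions, the feature layout, and the 0.5 threshold are all hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# A "member" is any function mapping a feature vector to P(AI-generated).
Member = Callable[[Sequence[float]], float]


def soft_vote(members: List[Member], features: Sequence[float],
              threshold: float = 0.5) -> Tuple[str, float]:
    """Average the members' probabilities, then threshold the average."""
    prob = mean(member(features) for member in members)
    label = "ai" if prob >= threshold else "student"
    return label, prob


# Hypothetical toy members. Assumed feature layout:
# features[0] = average words per line, features[1] = share of comment lines.
def short_line_member(features: Sequence[float]) -> float:
    return 0.9 if features[0] < 6 else 0.2


def comment_member(features: Sequence[float]) -> float:
    return 0.8 if features[1] > 0.2 else 0.3


def neutral_member(features: Sequence[float]) -> float:
    return 0.5  # an uninformative member, included only to show averaging


MEMBERS = [short_line_member, comment_member, neutral_member]

ai_label, ai_prob = soft_vote(MEMBERS, [4.0, 0.5])
student_label, student_prob = soft_vote(MEMBERS, [10.0, 0.0])
print(ai_label, round(ai_prob, 3))
print(student_label, round(student_prob, 3))
```

In a real pipeline the members would be trained classifiers rather than hand-written rules, and an attribution method such as SHAP would then be applied to the trained model to explain individual predictions.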

Key Findings:

  • Model achieved 98.97% accuracy in detecting AI-generated pseudocode
  • AI-generated pseudocode uses more complex verbs and shorter sentences
  • Student submissions use simpler verbs like "input," "put," "get"
  • Similarity between student and AI-generated pseudocode did not increase significantly across GPT versions
  • AI code includes more explicit sequencing and comments
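
The surface differences listed above can be turned into simple numeric features. The sketch below is an illustration only, not the authors' feature set: it treats each non-empty line as a "sentence", uses the three simple verbs named in the summary, and assumes that comments start with `#` or `//` and that explicit sequencing appears as numbered lines or "Step N" prefixes. The function name and both sample submissions are hypothetical.

```python
import re

# Simple verbs the summary attributes to student submissions.
SIMPLE_VERBS = {"input", "put", "get"}


def extract_features(pseudocode: str) -> dict:
    """Compute a few surface features from one pseudocode submission."""
    lines = [ln.strip() for ln in pseudocode.splitlines() if ln.strip()]
    words = re.findall(r"[a-z]+", pseudocode.lower())
    simple = sum(w in SIMPLE_VERBS for w in words)
    # Assumption: comments begin with "#" or "//".
    comment_lines = sum(ln.startswith(("#", "//")) for ln in lines)
    # Assumption: explicit sequencing looks like "1.", "2)" or "Step 3".
    numbered = sum(
        bool(re.match(r"(\d+[.)]|step\s+\d+)", ln.lower())) for ln in lines
    )
    return {
        "avg_words_per_line": len(words) / len(lines) if lines else 0.0,
        "simple_verb_ratio": simple / len(words) if words else 0.0,
        "comment_lines": comment_lines,
        "numbered_lines": numbered,
    }


# Hypothetical samples written for this sketch, not taken from the dataset.
student_sample = "get number\nput number in total\nprint total"
ai_sample = (
    "// Read the user's value\n"
    "1. Prompt the user for a number\n"
    "2. Store the number in total\n"
    "3. Display total"
)

student_features = extract_features(student_sample)
ai_features = extract_features(ai_sample)
print(student_features)
print(ai_features)
```

Features like these are easy to inspect, which suits an explainable model, but they are also the kind of keyword-level signal that may not carry over to other courses or prompts.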

Implications:

  • Provides a tool for detecting AI-generated pseudocode in K-12 education
  • Helps teachers identify potential academic integrity issues
  • Enables targeted interventions for students using AI tools
  • Supports development of AI-aware programming education

Limitations:

  • Dataset comes from a single online course platform
  • Assumes that no student submission was AI-generated
  • The model's reliance on specific keywords may limit how well it generalizes
  • Limited to basic pseudocode problems

Future Directions:

  • Examine wider variety of pseudocode problems
  • Study code modified by both AI and humans
  • Develop standardized guidelines for educators
  • Investigate integration with learning management systems

Title and Authors: "What are the differences between student and ChatGPT-generated pseudocode? Detecting AI-generated pseudocode in high school programming using explainable machine learning" by Zifeng Liu, Wanli Xing, Xinyue Jiao, Chenglu Li, and Wangda Zhu

Published On: January 14, 2025

Published By: Education and Information Technologies
