ChatGPT generates concept maps comparable in quality to teacher-created ones, significantly reducing creation time while maintaining educational effectiveness for secondary school students.
Objective: The main goal of this study was to investigate ChatGPT's potential for generating concept maps for educational purposes in secondary schools. Specifically, the researchers aimed to determine whether ChatGPT could create complete concept maps from text, whether these AI-generated maps improve student performance, and whether students could distinguish AI-generated from teacher-created concept maps. The study sought to address the time-consuming task of concept map creation that educators face, while exploring how large language models can support structured learning activities.
Methods: The researchers employed a three-stage experimental design involving 83 secondary school students (62 males, 20 females, ages 15-20) from a technical school in Italy. Six topics were selected from the computer science and history curricula, and both teachers and ChatGPT 3.5 created a concept map for each topic. To ensure visual consistency, all maps were standardized using the PlantUML format. The study used four questionnaires: demographics, concept map structural evaluation (measuring general appreciation, representation, usefulness, and defectiveness on 5-point Likert scales), topic-related questions, and a "Reveal the AI" questionnaire. In Stage 1, students evaluated all 12 concept maps without knowing their origins. Stage 2 divided students into two groups, each using a different combination of ChatGPT- and teacher-generated maps to answer topic-related questions. Stage 3 challenged students to identify which maps were AI-generated. Statistical analysis included Wilcoxon signed-rank tests for paired comparisons and t-tests for independent samples, with qualitative analysis of student feedback.
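To illustrate the paired comparison the study relied on, the sketch below runs an exact Wilcoxon signed-rank test on made-up 5-point Likert ratings (these scores are illustrative only, not the study's data). It follows Wilcoxon's convention of dropping zero differences and enumerates the exact null distribution, which is feasible for small samples.

```python
from itertools import product

def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for small paired samples.
    A minimal sketch of the kind of paired test used in the study;
    zero differences are dropped (Wilcoxon's original convention)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    # Rank the absolute differences, averaging ranks across ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    total_rank = sum(ranks)
    w_pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w = min(w_pos, total_rank - w_pos)
    # Exact null distribution: every sign pattern is equally likely.
    count = 0
    patterns = 0
    for signs in product([0, 1], repeat=n):
        ws = sum(r for s, r in zip(signs, ranks) if s)
        patterns += 1
        if min(ws, total_rank - ws) <= w:
            count += 1
    return w, count / patterns

# Hypothetical paired ratings: one score per student for a ChatGPT map
# and a teacher map on the same topic (illustrative numbers).
chatgpt_scores = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]
teacher_scores = [3, 4, 4, 3, 4, 4, 3, 5, 3, 3]
w, p = wilcoxon_signed_rank(chatgpt_scores, teacher_scores)
print(f"W={w}, p={p:.4f}")  # prints W=0.0, p=0.0625
```

The exact enumeration over sign patterns works only for small numbers of non-zero differences; for class-sized samples, a library implementation with a normal approximation would be used instead.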
Key Findings: The study revealed several significant findings regarding ChatGPT's effectiveness in concept map generation. In terms of map quality, ChatGPT outperformed teachers across multiple dimensions: general appreciation (3.99 vs 3.78, p=0.0001), representation (4.02 vs 3.78, p<0.0001), and usefulness (3.86 vs 3.71, p=0.0047). Students perceived ChatGPT-generated maps as more complete, with only 8.23% reporting missing concepts compared to 27.5% for teacher-created maps. Similarly, concepts were better positioned in ChatGPT maps (5.6% wrong positioning vs 8.83% for teacher maps). Regarding student performance, both ChatGPT- and teacher-generated maps proved effective learning tools, with mixed results across topics. ChatGPT maps led to superior performance on the web applications and project lifecycle topics, while teacher maps excelled on Napoleon's reforms, post-war Italy, and PHP classes. The performance differences were statistically significant but modest (average difference of 0.1509 across significant results). In the recognition task, only 51.80% of students correctly identified at least four of the six AI-generated maps, and merely 7.22% identified all six. Qualitative analysis revealed five key themes in student perceptions: categorization and organization (85% of participants), GAI-assisted ideation (80%), visualization and clarity (60%), efficiency (50%), and critical thinking and independence (45%). Students noted that ChatGPT maps were more detailed and verbose, while teacher maps were more concise and focused on core concepts.
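For context on the recognition figures, a simplified chance baseline can be computed. The calculation below, which is our illustration and not from the paper, assumes each of the six maps is judged by an independent 50/50 guess (a simplification of the actual task):

```python
from math import comb

# Chance baseline for the "Reveal the AI" task, assuming a student
# guesses each of the 6 maps independently with probability 0.5.
# This baseline is illustrative; it is not computed in the paper.
n, p = 6, 0.5
p_at_least_4 = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(4, n + 1))
p_all_6 = p**n
print(f"P(>=4 correct by chance) = {p_at_least_4:.4f}")  # 0.3438
print(f"P(all 6 correct by chance) = {p_all_6:.4f}")     # 0.0156
```

Under this simplified model, pure guessing yields at least four correct about 34% of the time, so the observed 51.80% is only modestly above chance, consistent with the paper's conclusion that students struggled to tell the map sources apart.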
Implications: These findings have significant implications for AI integration in educational technology and concept map creation. The study demonstrates that large language models like ChatGPT can serve as effective educational tools, producing concept maps that are comparable to or sometimes superior to those created by domain experts. This has practical implications for educators, as ChatGPT can significantly reduce the time and effort required for concept map creation while maintaining educational quality. The research suggests that AI-generated concept maps can effectively support structured learning activities, particularly in extracting concepts from texts, establishing relationships between concepts, and organizing information hierarchically. The findings also indicate that AI tools can enhance educational inclusivity by providing readily available, customizable learning materials that can be adapted to different learning styles and levels. Furthermore, the study contributes to the growing body of research on human-AI collaboration in education, showing that AI can serve as a valuable supplement to human expertise rather than a replacement. The comparable performance outcomes suggest that AI-generated educational materials can be trusted for classroom use, potentially democratizing access to high-quality educational resources.
Limitations: The study acknowledges several important limitations that affect the generalizability and scope of findings. First, the research was conducted with a relatively small sample size (83 students) from a single technical secondary school in Italy, limiting demographic and geographical diversity. The study focused on only two academic subjects (computer science and history) with six concept maps total, which may not reflect ChatGPT's capabilities across broader subject areas such as mathematics, sciences, or literature. The research utilized ChatGPT version 3.5, which may have limitations compared to newer models like GPT-4 or other emerging large language models. The experimental duration was limited to a single session, preventing assessment of long-term learning outcomes or knowledge retention. The quality assessment focused primarily on four dimensions (general appreciation, representation, usefulness, and defectiveness) but did not thoroughly investigate other important factors such as readability, visual clarity, or cognitive load. Additionally, the study's reliance on self-reported data through questionnaires may introduce response bias, and the quasi-experimental design lacks the randomization control of true experimental studies.
Future Directions: The researchers suggest several promising avenues for future investigation to expand understanding of AI applications in educational concept mapping. Future studies should incorporate larger, more diverse sample populations across multiple educational levels (elementary, middle school, higher education) and geographical regions to improve generalizability. Longitudinal research designs extending beyond single sessions would better capture long-term learning effects and knowledge retention associated with AI-generated concept maps. Research should expand to cover a broader range of academic subjects, including STEM fields, humanities, and arts, to assess ChatGPT's versatility across disciplines. Investigation of newer AI models (GPT-4, Claude, Google Gemini, Llama 3.2) could reveal improvements in concept map quality and educational effectiveness. Future work should explore collaborative concept map creation using AI tools, investigating how human-AI collaboration can enhance educational outcomes. Research into customization capabilities for different learning styles, cognitive levels, and special educational needs would advance inclusive education applications. Studies should also examine the integration of multimedia elements (audio, video, interactive features) in AI-generated concept maps and investigate the development of compensatory tools for students with various learning deficits. Additionally, research into optimal prompting strategies and few-shot learning approaches could enhance AI performance in educational contexts.
Title and Authors: "A closer look at ChatGPT's role in concept map generation for education" by Daniele Schicchi, Carla Limongelli, Vito Monteleone, and Davide Taibi.
Published On: April 30, 2025
Published By: Interactive Learning Environments (Taylor & Francis Group)