Article Summary
Jun 29, 2025

Multi-modal data analysis reveals that joint visual attention consistently predicts positive collaboration while boredom and high stress persistently hinder student performance across all stages of Design Thinking workshops focused on AI and Machine Learning.

Objective: The main goal of this study was to examine the "hidden" dynamics of collaborative learning during a Design Thinking (DT) workshop focused on AI and Machine Learning concepts. Specifically, the researchers aimed to: (1) analyze how performance in different DT stages correlates with each other, and (2) investigate the relationship between students' joint affective (emotional) and behavioral responses and their performance outcomes across the five DT stages (Empathize, Define, Ideate, Prototype, and Test).

Methods: The study employed a comprehensive multi-modal data collection approach with 63 students aged 11-15 from three Norwegian middle schools, organized into 29 teams (24 dyads and 5 triads). The intervention consisted of a six-hour DT workshop spread over two days, where students collaboratively created playable games to raise awareness about AI and ML concepts. Data collection included multiple sources: pre- and post-knowledge tests, team artifacts (paper-based worksheets and digital databases), and continuous multi-modal data through physiological sensors (Empatica E4 and EmbracePlus wristbands measuring heart rate variability, electrodermal activity, blood volume pulse, and skin temperature), audio and video recordings from wide-angle cameras, and facial expression analysis. The researchers used advanced processing techniques including deep learning-based face-tracking algorithms, OpenFace for gaze direction calculation, and cross-recurrence analysis to measure joint states among team members. Performance was evaluated using Relative Learning Gain (RLG) scores and custom rubrics designed by educational experts.
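The paper does not spell out its Relative Learning Gain formula, but a widely used definition normalizes the pre-to-post improvement by the improvement still possible. A minimal sketch under that assumption (the function name and exact formula are illustrative, not taken from the paper):

```python
def relative_learning_gain(pre, post, max_score):
    """Relative Learning Gain: the fraction of the possible
    improvement a student actually achieved between the pre-
    and post-knowledge tests (one common definition; the
    study's exact formula may differ)."""
    if max_score == pre:  # already at ceiling: no gain possible
        return 0.0
    return (post - pre) / (max_score - pre)

# Example: a student scores 4/10 before the workshop and 7/10 after,
# closing half of the remaining gap to the maximum score.
print(relative_learning_gain(4, 7, 10))  # → 0.5
```

Normalizing by the remaining headroom rather than the raw difference keeps gains comparable between students who start from very different baselines.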

Key Findings: The study revealed several significant patterns in collaborative learning dynamics. First, regarding correlations between DT stages, strong positive relationships were found between Empathize→Ideate (r=0.51), Empathize→Prototype (r=0.71), and Define→Ideate (r=0.62), suggesting that early-stage performance is associated with later success. Second, concerning multi-modal measurements, Joint Visual Attention (JVA) consistently showed positive correlations with performance across all stages, indicating its importance for collaborative success. Conversely, Joint Emotional State-Boredom and Joint Stress-High consistently predicted negative performance outcomes across all stages. The study also identified stage-specific patterns: confusion positively correlated with performance during the Ideate stage, suggesting productive cognitive disequilibrium; high engagement and low stress became important predictors in later stages (Ideate and Prototype); and frustration played a dual role, correlating positively with performance in the Define stage but negatively in later stages.

Implications: This research significantly advances the field of AI education by providing objective, data-driven insights into collaborative learning processes during constructionist activities. The findings demonstrate the feasibility of using multi-modal learning analytics to assess complex, open-ended educational experiences that traditionally rely on subjective evaluations. The study offers practical guidance for curriculum design, suggesting that maintaining consistency in support materials across DT stages enhances learning outcomes, while the introduction of new digital tools should be carefully scaffolded to manage potential stress and confusion. The identification of persistent positive (joint visual attention) and negative (boredom, high stress) predictors provides educators with specific targets for intervention and support strategies.

Limitations: The study acknowledges several limitations, including its reliance solely on quantitative analysis despite rich qualitative data sources, potential cultural and neurodiversity biases in emotion recognition algorithms, and the specific context of Norwegian middle schools which may limit generalizability. The relatively small sample size (63 students) and focus on a particular DT implementation may not represent all educational contexts. Additionally, the discrete nature of performance measurements versus continuous multi-modal data collection prevented time-series analysis that could reveal delayed effects or temporal dependencies.

Future Directions: The researchers suggest several avenues for future investigation, including incorporating qualitative analysis methods such as thematic analysis to provide richer contextual understanding, expanding the range of measured affective and behavioral responses (e.g., joint motivation, mental effort, achievement-related emotions), applying time-series analysis techniques like cross-correlation to account for temporal influences, and scaling studies to larger populations with diverse educational contexts. Future work should also explore fully digitalized DT stages and investigate longer-term impacts of such interventions on students' AI literacy and collaborative skills.

Title and Authors: "Behind the Scenes: Unpacking Students' Experience during a Collaborative AI Workshop using Multi-Modal Data" by Isabella Possaghi, Feiran Zhang, Kshitij Sharma, and Sofia Papavlasopoulou.

Published on: June 23-26, 2025

Published by: Interaction Design and Children (IDC '25), Reykjavik, Iceland, ACM
