Hello everyone.
In this article, I would like to introduce the paper I read in our most recent English Literature Seminar and share my thoughts on it.
-
Paper Title: From Complexity to Parsimony: Integrating Latent Class Analysis to Uncover Multimodal Learning Patterns in Collaborative Learning
-
Conference: LAK 2025: The 15th International Learning Analytics and Knowledge Conference
-
Year of Publication: 2025
-
Pages: 70-81
-
Authors: Lixiang Yan, Dragan Gasevic, Vanessa Echeverria, Yueqiao Jin, Linxuan Zhao, Roberto Martinez-Maldonado
This study applies Multimodal Learning Analytics (MMLA) to collaborative learning by collecting learners’ location, audio, and physiological data. Understanding collaborative learning requires a multifaceted perspective that covers verbal and non-verbal interactions as well as the use of physical and digital resources. However, traditional MMLA research has often focused on a single modality, and comprehensive analyses integrating multiple data modalities remain limited. To address this challenge, the study uses Latent Class Analysis (LCA) to integrate the different data modalities and identify simpler, more explanatory learning patterns.
As for the research methodology, team activity data were collected from 56 students in a medical education program working in a medical simulation environment. The data collection is particularly interesting because three different data streams were captured simultaneously. Location data were collected with the Pozyx Creator toolkit, which tracked students’ spatial coordinates in real time and continuously recorded their precise positions within the classroom. Audio data were collected with compact wireless headset microphones; the voices of students in the same group were synchronized through a multi-channel audio interface and then transcribed and analyzed in detail. As physiological data, heart rate was measured and used to evaluate arousal levels during the simulation activities and synchrony with other participants.
The collected data were converted into distinct behavioral indicators. For task behavior, the study measured collaborative work, individual work, and transitions between tasks across different areas of the simulation environment (e.g., the main task space and auxiliary task spaces). For example, “collaborative work in the main task space” was defined as two students staying within 1 meter of each other for more than 10 seconds in the area where the key medical procedures are performed. For team communication indicators, patterns important to medical teams, such as “task assignment,” “information sharing,” and “confirmation/acknowledgment,” were extracted from the audio data. Physiological indicators captured heart-rate “synchrony” (the correlation of heart-rate patterns between team members) and “arousal” (a heart-rate state exceeding the baseline).
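The proximity rule above (within 1 meter for more than 10 seconds) could be implemented along the following lines. This is a minimal sketch under my own assumptions: the function name, the data layout, and the 1 Hz sampling rate are illustrative, not the authors’ actual implementation.

```python
import math

DIST_THRESHOLD_M = 1.0   # maximum pairwise distance to count as "together"
MIN_DURATION_S = 10      # minimum consecutive seconds to count as collaboration

def is_collaborative(track_a, track_b, sample_rate_hz=1.0):
    """track_a, track_b: lists of (x, y) positions sampled at the same instants.

    Returns True if the two students stay within DIST_THRESHOLD_M of each
    other for more than MIN_DURATION_S consecutive seconds.
    """
    needed = int(MIN_DURATION_S * sample_rate_hz)
    run = 0  # current streak of in-range samples
    for (ax, ay), (bx, by) in zip(track_a, track_b):
        if math.hypot(ax - bx, ay - by) <= DIST_THRESHOLD_M:
            run += 1
            if run > needed:
                return True
        else:
            run = 0
    return False
```

With indoor-positioning data such as Pozyx coordinates, a rule like this would be evaluated per student pair and per labeled area of the room.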
To integrate data from different modalities, the authors chose a 60-second window as the unit of analysis and synchronized each modality to it. Each behavior was then encoded as binary data (presence/absence) using modality-specific criteria: a duration of at least 10 seconds for task behavior, at least one occurrence within the window for communication indicators, and a duration of at least 30 seconds for physiological indicators. This approach made it possible to analyze data of different temporal granularities in a unified format.
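The windowing scheme above can be sketched as a small encoding function. The data structures (dicts of per-window durations and counts) are my own assumption for illustration; only the thresholds come from the paper as summarized here.

```python
WINDOW_S = 60  # unit of analysis: one 60-second window

def encode_window(task_seconds, comm_events, physio_seconds):
    """Return one binary feature row for a single 60 s window.

    task_seconds:   dict indicator -> seconds active within the window
    comm_events:    dict indicator -> occurrence count within the window
    physio_seconds: dict indicator -> seconds active within the window
    """
    row = {}
    for name, sec in task_seconds.items():
        row[name] = int(sec >= 10)    # task behavior: >= 10 s
    for name, count in comm_events.items():
        row[name] = int(count >= 1)   # communication: >= 1 occurrence
    for name, sec in physio_seconds.items():
        row[name] = int(sec >= 30)    # physiological: >= 30 s
    return row
```

Stacking one such row per window yields the binary matrix that a latent class analysis then operates on.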
The analysis identified four latent classes, each exhibiting a distinct behavioral pattern. “Collaborative Communication” was characterized by the presence of behaviors in all categories; “Physical Coordination” by collaborative work with little verbal communication; “Remote Interaction” by verbal communication maintained while physically distant; and “Personal Engagement” by working alone on auxiliary tasks. Furthermore, an analysis using Epistemic Network Analysis (ENA) showed that the four-indicator multimodal model explained differences in satisfaction with the task and the collaborative activity better than the monomodal models did.
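For binary indicators, latent class analysis is mathematically a mixture of independent Bernoulli distributions, which can be fit with EM. The toy loop below is my own reconstruction of the idea, not the authors’ code; in practice one would use dedicated LCA software and choose the number of classes with fit indices such as BIC rather than fixing it in advance.

```python
import numpy as np

def fit_lca(X, n_classes=4, n_iter=100, seed=0):
    """Fit a latent class model (independent-Bernoulli mixture) via EM.

    X: (n_windows, n_indicators) binary matrix.
    Returns (weights, probs): class sizes and P(indicator = 1 | class).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(n_classes, 1.0 / n_classes)
    probs = rng.uniform(0.25, 0.75, size=(n_classes, d))
    for _ in range(n_iter):
        # E-step: responsibility of each class for each window (log space)
        log_lik = X @ np.log(probs).T + (1 - X) @ np.log(1 - probs).T
        log_post = np.log(weights) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update class weights and per-class indicator probabilities
        weights = resp.mean(axis=0)
        probs = (resp.T @ X + 1.0) / (resp.sum(axis=0)[:, None] + 2.0)  # smoothed
    return weights, probs
```

Each fitted class is then interpreted from its indicator-probability profile, which is how labels like “Collaborative Communication” or “Personal Engagement” would be assigned.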
The following are my personal thoughts. This study offers detailed insight into collecting and processing diverse data sources with MMLA; in particular, the methods for synchronizing and integrating the different modalities (location, audio, and physiological data) are a helpful reference for practical research design.

The approach of distilling four major patterns from multiple modalities with LCA also offers a fresh perspective. Converting monomodal indicators into multimodal ones via LCA strikes me as an innovative way to interpret complex datasets in an understandable form. I am also interested in ENA as an analytical method. Although I have not used it myself yet, it appears to be a powerful tool for identifying relationships between multiple data modalities, and I would like to try such methods in my own research.

One question I had is why non-verbal communication data, such as gestures, were not included in this study. In collaborative learning, and especially in physically active environments like medical simulations, I believe gestures and facial expressions also play an important role. I would like to know whether this omission was due to technical constraints or a deliberate research-design choice. Overall, as a method for integrating complex multimodal data, I feel this study carries important implications for future research on collaborative learning.
By: Geng Xuewang