Hello everyone. I am Xuewang Geng.
In this article, I would like to introduce a paper I recently read in the English Literature Seminar and my thoughts on it.
- Paper Title: Unveiling joint attention dynamics: Examining multimodal engagement in an immersive collaborative astronomy simulation
- Journal: Computers & Education
- Volume: 213
- Pages: 1-15
- Year of Publication: 2024
- Authors: Jina Kang, Yiqiu Zhou, Robin Jephthah Rajarathinam, Yuanru Tan, David Williamson Shaffer
The following is an overview of the content of the paper.
In recent years, developments in technologies such as Augmented Reality (AR) and Virtual Reality (VR) have significantly changed Computer-Supported Collaborative Learning (CSCL) environments. In particular, immersive learning environments provide learners with 3D virtual spaces and are attracting attention as new venues for collaborative learning. Joint attention, defined as the coordination of attention toward an object of mutual interest, plays an important role in collaborative learning. From a social constructivist perspective, joint attention promotes the construction of shared meaning among participants, establishes a shared problem-solving space, and improves joint problem-solving by facilitating interaction.
However, while many previous studies have explored the impact of joint attention using eye-tracking data, joint attention in immersive collaborative learning environments where participants can freely navigate a 3D world has not yet been sufficiently investigated. In response, multimodal learning data collected from multiple sensors offers the possibility of better understanding collaborative learning activities. Multimodal data analysis integrates multiple data sources, including linguistic, behavioral, spatial, and physiological aspects of communication and meaning-making, making it possible to analyze not only verbal behaviors but also non-verbal learning behaviors such as gestures. Therefore, this study aims to multi-dimensionally analyze joint attention and conceptual discussion in a collaborative learning environment using an immersive astronomy simulation. Specifically, the following three Research Questions (RQs) were set:
- RQ1: What is the relationship between joint attention and learning outcomes in an immersive astronomy simulation?
- RQ2: How do joint attention, gestures, and turn-taking in collaborative activities influence the group's conceptual discussion?
- RQ3: How do joint attention and conceptual discussion change during activities between groups with different learning outcomes?
To investigate these RQs, the study conducted an experiment with 77 undergraduate students in 16 groups enrolled in an introductory astronomy course at a university in the United States. Participants worked on a multi-stage problem-solving task called "Lost at Sea," in which they identified constellations and calculated latitude and longitude to determine the location of a space capsule that had fallen into an unknown ocean. Data were collected through audio/video recordings, log files, device screen captures, and pre- and post-assessments. To measure joint attention, the authors evaluated participants' head movements, viewpoints within the simulation, and the overlap of fields of view between devices, identifying five levels of joint attention from Level 1 (inactive) to Level 5 (screens overlapping and staying on the final task scene). Additionally, to measure conceptual interaction, a coding scheme was developed focusing on verbal and non-verbal interactions, including elements such as the introduction of new knowledge, revision of knowledge, confirmation of knowledge, and confusion. To trace changes in joint attention and conceptual discussion, Ordered Network Analysis (ONA) was used, a method for modeling learning processes that takes the temporal order of events into account.
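As a rough illustration of how such measurements might be mapped to discrete joint-attention levels, the sketch below classifies a single moment using rule-based thresholds. This is my own sketch, not the paper's actual algorithm: the rule structure, the 0.5 overlap threshold, and the handling of the "different scenes" case are all illustrative assumptions.

```python
def classify_joint_attention(both_active, same_scene, fov_overlap,
                             overlap_threshold=0.5):
    """Toy rule-based classifier for joint-attention levels.

    Only Levels 1, 4, and 5 are described explicitly in the summary
    above; the rule for the 'active but in different scenes' case and
    the 0.5 overlap threshold are illustrative assumptions.
    """
    if not both_active:
        return 1  # Level 1: at least one device inactive
    if not same_scene:
        return 2  # assumed: devices active, but in different scenes
    if fov_overlap >= overlap_threshold:
        return 5  # Level 5: same scene, fields of view overlap
    return 4      # Level 4: same scene, attending to different places
```

In the actual study, such moment-by-moment levels would then be aggregated over time to characterize each group's joint-attention profile.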
The analysis yielded the following findings. First, no significant difference was found between low- and high-performing groups in task completion time or in the total duration of joint attention. High-performing groups tended to engage in higher levels of joint attention (especially Levels 4 and 5) during problem solving, though this difference did not reach statistical significance. Furthermore, different levels of joint attention were related to the group's conceptual interaction in different ways. For example, Level 4 joint attention (learners in the same scene but looking at different places) was positively correlated with the introduction of new knowledge and negatively correlated with knowledge confusion, while Level 5 joint attention (looking at the same place in the same scene) was positively correlated with the confirmation of knowledge. Additionally, vignette analysis of the knowledge construction process showed that in high-performing groups, joint attention and conceptual discussion interacted more effectively, promoting shared understanding and the adoption of new ideas, whereas in low-performing groups, a lack of joint attention, learners acting independently, and disagreements over knowledge and opinions were frequently observed. ONA results revealed that as the task progressed, high-performing groups engaged in high-level joint attention (Level 5) and conceptual discussion more than low-performing groups did. These results point to the importance of task design, tool use, the integration of multiple data sources, and analysis at different granularities in collaborative learning. For example, the authors suggest designing tasks that promote individual exploration and the integration of diverse perspectives in the early stages, and providing resources and prompts that support the transition from individual exploration to group discussion.
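Since ONA may be unfamiliar, its core idea, that the order of coded events matters, can be sketched as counting directed code pairs within a sliding window. This is a minimal sketch of the ordered-connection idea only: the window size and code labels are my own illustrative choices, and the actual method further involves normalization and dimensionality reduction to produce network models.

```python
from collections import Counter

def ordered_connections(codes, window=3):
    """Count directed pairs (earlier_code -> later_code) within a
    sliding window over a sequence of coded events. Unlike an
    undirected co-occurrence count, (A, B) and (B, A) are distinct,
    which is what makes the analysis 'ordered'."""
    counts = Counter()
    for i, later in enumerate(codes):
        # look back at the preceding events inside the window
        for earlier in codes[max(0, i - window + 1):i]:
            if earlier != later:
                counts[(earlier, later)] += 1
    return counts
```

For instance, applied to a coded utterance sequence, this distinguishes confirmation following new knowledge from the reverse, which a plain co-occurrence count cannot do.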
The following are my thoughts after reading this paper.
This study adopted a mixed-methods approach combining learning process analysis using ONA with qualitative analysis, which I found very interesting. In particular, the approach of analyzing at different granularities, such as the macro level (the learning process across all tasks), the meso level (per task), and the micro level (per episode, such as the vignettes), could be applied to other research as well. For example, while my previous research analyzed verbal learning activities as a whole, I felt I could apply this study's approach to conduct finer-grained analyses, for instance comparing learning behaviors for compound versus simple verbs, or examining different granularities down to individual words. On the other hand, while this study provides important insights into how joint attention in AR/VR collaborative learning environments can be elucidated, I felt that the specific findings regarding CSCL and joint attention itself were somewhat limited. In future research, I look forward to validation with larger samples, application in different learning contexts, and more concrete implications for the design and support of collaborative learning.