Yamada Laboratory, Kyushu University

Integrating LLMs into EFL Writing Instruction: Effects on Writing Scores, Self-Regulated Learning Strategies, and Motivation

February 26, 2026

Hello, everyone. I am Tanaka, a first-year doctoral student. I would like to introduce a paper I read in a recent English Literature Seminar.

  • Paper Title: Integrating large language models into EFL writing instruction: effects on performance, self-regulated learning strategies, and motivation

  • Authors: Ze-Min Liu, Gwo-Jen Hwang, Chuang-Qi Chen, Xiang-Dong Chen & Xin-Dong Ye

  • Journal: Computer Assisted Language Learning

  • Year of Publication: 2024

[Introduction]

Writing is one of the most difficult skills for learners of English as a Foreign Language (EFL) to acquire. In recent years, instruction based on Self-Regulated Learning (SRL) strategies has been introduced (Teng, 2022). SRL is defined as “the process by which learners autonomously control their motivation, cognition, metacognition, and behavior toward achieving their own goals” (Zimmerman, 2000).

The Cognitive Academic Language Learning Approach (CALLA) has been widely used as a strategy instruction model to promote the development of SRL. However, it has been pointed out that CALLA places a heavy burden on teachers and makes it difficult to provide individualized support to each learner. This study therefore developed a new model, “CALLA-LLM,” which integrates Large Language Models (LLMs) into CALLA, and tested its educational effectiveness.

[Literature Review]

Previous research has shown that SRL strategies are an important factor in enhancing learners’ writing ability and motivation in EFL writing. During the writing process, learners can reduce cognitive load and improve learning efficiency by using SRL strategies such as goal setting, self-monitoring, reflection, and use of feedback. These strategies are also said to increase learners’ confidence and interest.

On the other hand, SRL support still lacks adaptability and individualization. Existing support tools struggle to provide flexible support aligned with the learner’s actual SRL process, so more individualized support needs to be developed.

Furthermore, motivation, particularly self-efficacy and intrinsic interest, is closely tied to writing quality and to persistence in learning. Learners with high self-efficacy tend to engage proactively even with difficult tasks, and higher interest leads to higher-quality output. Because SRL strategies and motivation are interrelated, an educational design that supports both simultaneously is necessary. LLMs can provide individualized feedback and support strategy modeling in real time, so their effectiveness has been discussed in recent years, and the “Humans in the Loop” framework, which aims for collaboration between AI and teachers, is also drawing attention.

[Research Questions]

This study sets the following three research questions:

  1. How does CALLA-LLM affect English writing ability compared with conventional CALLA?

  2. How does CALLA-LLM affect the use of SRL strategies?

  3. How does CALLA-LLM affect motivation (self-efficacy and interest) toward writing?

[Research Method]

The participants were 65 sixth-grade elementary school students in Taiwan, randomly assigned to a CALLA-LLM group (n = 32) or a control group (n = 33). Both groups received 10 sessions of process writing instruction over five weeks, with the CALLA-LLM group receiving support through a web application built on GPT-4.

Lessons followed the CALLA framework, which consists of five phases: preparation, presentation, practice, evaluation, and expansion. For the CALLA-LLM group, prompts and APIs corresponding to each phase were set up. The measures were writing tasks (administered three times), a questionnaire on SRL strategies, and a questionnaire on writing motivation (self-efficacy and interest), collected at three time points: T0 (pre-instruction), T1 (immediately post-instruction), and T2 (one month later). Statistical analysis was performed in jamovi. After checking normality, homogeneity of variance, and initial group differences, a linear mixed-effects model (LMM) was used as the primary analytical method.
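As a rough picture of what “prompts corresponding to each phase” could look like, here is a minimal Python sketch. Only the five phase names come from the paper as summarized above. The prompt wording, the `PHASE_PROMPTS` table, and the `build_prompt` helper are hypothetical and invented for illustration; they are not the authors' implementation, and no actual LLM API is called.

```python
from enum import Enum


class Phase(Enum):
    """The five CALLA phases named in the summary above."""
    PREPARATION = "preparation"
    PRESENTATION = "presentation"
    PRACTICE = "practice"
    EVALUATION = "evaluation"
    EXPANSION = "expansion"


# Hypothetical prompt templates: the wording is invented for illustration only.
PHASE_PROMPTS = {
    Phase.PREPARATION: "Ask the student what they already know about: {topic}",
    Phase.PRESENTATION: "Model one writing strategy for the topic: {topic}",
    Phase.PRACTICE: "Give one hint on this draft without rewriting it: {draft}",
    Phase.EVALUATION: "Help the student check this draft against their goal: {draft}",
    Phase.EXPANSION: "Suggest a new topic where the same strategy applies, related to: {topic}",
}


def build_prompt(phase: Phase, topic: str = "", draft: str = "") -> str:
    """Fill in the template for the given phase (hypothetical helper)."""
    return PHASE_PROMPTS[phase].format(topic=topic, draft=draft)


if __name__ == "__main__":
    # The resulting string would be sent to the LLM by the web application.
    print(build_prompt(Phase.PRACTICE, draft="I go to school yesterday."))
```

Keeping one template per phase in a single table would let a teacher review or adjust the wording for each phase without touching the rest of the application, which fits the teacher-in-the-loop idea discussed in the paper.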

[Results]

The CALLA-LLM group outperformed the control group on all measures: writing scores, frequency of SRL strategy use, and motivation. At T2, its writing scores were on average 5.13 points higher than those of the control group, with particularly significant improvements in grammar/vocabulary, structure, and coherence. For SRL strategies, scores improved significantly from T0 to T2 in all areas: planning, generating, revising, and monitoring.

Regarding motivation, the self-efficacy of the CALLA-LLM group improved significantly at T1 and T2. Interest, by contrast, increased at T1 but was not maintained at T2. This suggests that while immediate support from an LLM contributes to short-term gains in motivation, it does not easily sustain intrinsic motivation over the long term.
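The group and time comparisons above come from the linear mixed-effects model mentioned in the method section. This summary does not give the authors' exact specification, so the following is only one plausible form for a two-group, three-time-point design. All symbols are introduced here for illustration: y_it is the outcome of learner i at time t, G_i is 1 for the CALLA-LLM group and 0 for the control group, T1_t and T2_t are indicator variables with T0 as the reference, and u_i is a random intercept for each learner.

```latex
y_{it} = \beta_0 + \beta_1 G_i + \beta_2 \mathrm{T1}_t + \beta_3 \mathrm{T2}_t
       + \beta_4 (G_i \times \mathrm{T1}_t) + \beta_5 (G_i \times \mathrm{T2}_t)
       + u_i + \varepsilon_{it},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{it} \sim \mathcal{N}(0, \sigma^2)
```

Under this assumed coding, the model-implied difference between the groups at T2 would be \beta_1 + \beta_5, and the interaction terms \beta_4 and \beta_5 capture how much more the CALLA-LLM group changed from T0 than the control group did.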

[Discussion]

The CALLA-LLM model made support tailored to learners possible by combining the structured scaffolding of CALLA (temporary support from teachers or other supporters until the learner can perform tasks independently) with the immediacy of LLM support. SRL is a skill fostered within an appropriate instructional environment (Zimmerman, 2000), and the CALLA-LLM model is thought to have promoted its development in this study.

CALLA-LLM can respond flexibly to the complex SRL processes involved in writing, and it is superior to and more adaptive than conventional methods in that it provides real-time support based on the learner’s actual behavior (Lim et al., 2021). The model also delivers step-by-step, individualized support, and it is inferred that the primary SRL strategies (planning, generating, revising, and monitoring) functioned effectively in coordination (Flower & Hayes, 1981).

On the other hand, while LLMs were shown to be effective for improving learners’ self-efficacy, there are limits to how far they can sustain intrinsic interest. Future research will need measures aimed at maintaining intrinsic motivation, such as introducing motivation profiles and using gamification. Furthermore, LLM support is not always appropriate, and some learners reportedly felt confused or dissatisfied. In such situations, cognitive and affective interventions by teachers played an important role. This again highlights the importance of designing support around collaboration between AI and teachers, that is, the “Humans in the Loop” framework.

One limitation of this study is that the generalizability of the results is limited, because the participants came from a single elementary school with relatively high English proficiency. In addition, the study relies only on quantitative questionnaire data, leaving qualitative aspects such as teacher intervention unexamined. Verification in more diverse educational environments and long-term research on the sustainability of motivation are needed.

This study empirically demonstrates the effectiveness of a new educational model that enhances learners’ writing ability, SRL strategies, and motivation through the collaboration of LLMs and teachers, providing insights that open up the possibilities of LLM utilization in future EFL writing education.

[Personal Thoughts]

The reason I selected this paper is that it has many points in common with my own research theme, and I believed it would help me organize the insights and challenges revealed by previous research. Another objective was to cultivate the “ability to read previous research with a critical perspective,” as guided by Professor Yamada.

I considered this study novel in that it developed “CALLA-LLM,” a new model integrating LLMs into CALLA, expanded the conventional framework of SRL strategy instruction, and verified the effectiveness of a new, individually optimized support model. On the other hand, as indicated in the limitations, one weakness is the lack of qualitative data on the learning process, such as learners’ reactions to CALLA-LLM and the nature of teacher interventions. For example, while it is stated that “there were situations where learners felt confusion or dissatisfaction,” there is no detailed description of what specific problems occurred or what kind of cognitive or emotional support the teacher provided. Furthermore, because this study targets elementary school students, it is not clear whether those difficulties stem from age-related factors or from English proficiency.

Toward the realization of “Humans in the Loop,” which this study emphasizes, I believe that qualitatively examining how LLMs, learners, and teachers interact is also an important task for the future. I also felt that the theoretical and practical background, such as why elementary school students were selected as participants and why the CALLA model was used as the foundation, was insufficiently explained.

In the seminar discussion, we also debated questions such as “SRL strategies are measured by questionnaires, but does it not lack the perspective of ‘reflection’ in SRL?” and “What support is necessary to sustain intrinsic interest in writing?”

I hope to apply the insights and challenges gained from this study to my own future research.

By: Tanaka
