Potential and challenges of a rubric-based assessment framework for academic writing

Accepted Paper

Jaeho Lee (Waseda University), Ikuko Ijuin (Tokyo University of Foreign Studies)

Paper long abstract

This study, supported by a Grant-in-Aid for Scientific Research (KAKENHI), aims to develop a system that supports the assessment, teaching, and learning of academic writing in Japanese. Opportunities for Japanese language teachers to gain experience in writing assessment during the teacher preparation stage are limited, and they are expected to develop this competence through practice in actual educational settings as they grow professionally. However, because learner texts exhibit a wide range of developmental characteristics, many teachers express concerns about the appropriateness of their own assessments.

In this study, we used a writing assessment rubric developed through empirical research on how university instructors assess essays, and examined the reliability and inter-rater agreement of ratings assigned by two Japanese language teachers. The teachers rated 147 opinion essays drawn from the Livable Country Corpus, which consists of texts written by learners of Japanese in Europe, on three aspects of content organization (clarity of claim, support, and logical structure) and two aspects of linguistic expression (accuracy and appropriateness), each on a five-point scale. These ratings were analyzed together with features of the learner texts.

To verify the reliability of the assessment, Cronbach’s alpha was calculated, yielding a high internal consistency (α = .937). However, inter-rater agreement varied widely among the five aspects: clarity of claim (31.2%), support (45.6%), logical structure (46.3%), accuracy (42.9%), and appropriateness (27.9%). Spearman’s correlation coefficients showed overall moderate correlations, with relatively higher correlations for content-related aspects and lower correlations for language-related aspects. A closer examination of the two aspects with the lowest agreement (clarity of claim and appropriateness) revealed that Rater A tended to give more lenient scores on clarity of claim, whereas Rater B tended to be more lenient on appropriateness, suggesting an influence of rater beliefs. Additionally, analyses by proficiency level indicated that score variability was particularly pronounced at the intermediate level.
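The three statistics reported above (agreement rate, Spearman's correlation, and Cronbach's alpha) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the scores are invented, all function names are hypothetical, "agreement" is assumed to mean exact agreement (both raters giving the identical score), and the abstract does not state which score columns entered the alpha calculation.

```python
from statistics import mean, variance


def exact_agreement(a, b):
    """Share of essays on which both raters gave the identical score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(a, b):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def cronbach_alpha(items):
    """items: k score columns, each holding one score per essay."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Invented five-point scores from two raters on eight essays (one aspect).
rater_a = [3, 4, 2, 5, 3, 4, 1, 3]
rater_b = [3, 3, 2, 4, 4, 4, 2, 3]

print("exact agreement:", exact_agreement(rater_a, rater_b))
print("Spearman's rho:", round(spearman(rater_a, rater_b), 3))
print("Cronbach's alpha:", round(cronbach_alpha([rater_a, rater_b]), 3))
```

The sketch also shows why the two kinds of result can diverge: exact agreement counts only identical scores, whereas a correlation or alpha can stay high when one rater is consistently one point more lenient than the other.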

This presentation discusses the implications of these findings for the effectiveness and challenges of rubric-based assessment, as well as insights for pedagogical practices in writing instruction.

*This work was supported by JSPS KAKENHI Grant Numbers 24K00078 and 23K21939.

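Abstract in Japanese (English translation):

Potential and challenges of an assessment rubric for academic writing

Jaeho Lee (Waseda University), Ikuko Ijuin (Tokyo University of Foreign Studies)

This study, supported by KAKENHI, aims to build an environment that supports the assessment, teaching, and learning of academic writing in Japanese. Opportunities to experience writing assessment during teacher training are limited, and teachers are expected to learn it on their own through classroom practice as they grow as Japanese language teachers. However, because learner texts show diverse characteristics at each developmental stage, many teachers are unsure whether their own assessments are appropriate.

Using a rubric developed from empirical research on essay assessment by university instructors, we examined the reliability and agreement of ratings, based on data in which two Japanese language teachers rated five aspects on a five-point scale. Specifically, 147 essays drawn from the Livable Country Corpus (「住みやすい国コーパス」), which contains opinion essays by learners of Japanese in Europe, were rated on three aspects of content organization (claim, support, and logical structure) and two aspects of linguistic expression (accuracy and appropriateness), and the ratings were analyzed together with features of the essays.

First, to verify the reliability of the ratings, we checked Cronbach's α and obtained a high internal consistency (α = .937). Agreement, however, differed considerably by aspect: claim 31.2%, support 45.6%, logical structure 46.3%, accuracy 42.9%, and appropriateness 27.9%. Spearman's correlation coefficients were moderate overall, tending to be relatively high for content organization and low for linguistic expression. A closer look at "claim" and "appropriateness," the aspects with low agreement, confirmed that Rater A tended to rate more leniently on claim and Rater B on appropriateness, suggesting an influence of rater beliefs. The analysis by proficiency level showed that variability was especially large at the intermediate level. For aspects on which ratings diverge, teachers will likely need to calibrate their ratings with one another and make the criteria explicit to learners.

Based on these results, this presentation discusses the effectiveness and challenges of rubric-based assessment, as well as implications for writing instruction. (799 characters in the Japanese original)

*Acknowledgment: This work was supported by JSPS KAKENHI Grant Numbers 24K00078 and 23K21939.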

Contribution AJE002
Association of Japanese Language Education: 2
Session 1 Friday 28 August, 2026, 11:00-12:30