OPTIMIZING SCORING RELIABILITY FOR CREATIVE MATHEMATICAL PROBLEM-SOLVING ASSESSMENTS: A GENERALIZABILITY THEORY APPROACH

Authors

  • AUTTAPON PALADPHOM
  • PRAKITTIYA TUKSINO
  • ANUCHA SOMABUT

Keywords:

Generalizability Theory, creative problem-solving, constructed-response test, inter-rater reliability, essay test.

Abstract

Rater-induced error significantly challenges the scoring reliability of creative mathematical problem-solving assessments. This study applied Generalizability Theory to analyze score variance from 140 students and 3 raters across three scoring designs. The Generalizability (G) study revealed the person-by-rater interaction as the largest error source (35.50–35.90%), highlighting inconsistent rater judgments. A Decision (D) study showed that increasing raters from one to three substantially improved reliability (relative G-coefficient: .45 to .71). Notably, a design where each rater specializes in scoring specific items (p × (i:r)) yielded the highest absolute reliability (.69). These findings provide empirical guidance for designing effective scoring procedures to enhance the reliability of complex skill assessments.
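
The D-study projection reported above can be illustrated with a minimal sketch. It assumes a simplified situation in which all relative error variance is averaged over the number of raters, so that the relative G-coefficient is person variance divided by person variance plus relative error variance over n_r. The abstract does not report the full set of variance components, so the sketch works backwards from the reported single-rater coefficient of .45. The function names are hypothetical, and the study's actual designs also involve an item facet that this simplification ignores.

```python
def error_ratio_from_g(g_single):
    """Relative-error variance per unit of person variance implied by a
    single-rater relative G-coefficient (simplifying assumption)."""
    return (1.0 - g_single) / g_single


def relative_g(var_person, var_rel_error_single, n_raters):
    """Relative G-coefficient, assuming all relative error variance is
    averaged over n_raters (items facet ignored in this sketch)."""
    return var_person / (var_person + var_rel_error_single / n_raters)


def project_relative_g(g_single, n_raters):
    """Project the coefficient for n_raters from a single-rater value,
    with person variance fixed at 1.0 as an arbitrary unit."""
    ratio = error_ratio_from_g(g_single)
    return relative_g(1.0, ratio, n_raters)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(project_relative_g(0.45, n), 2))
```

Under this assumption, moving from one rater to three takes the coefficient from .45 to about .71, which agrees with the values reported in the abstract. The agreement does not show that the authors computed the projection this way.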

How to Cite

PALADPHOM, A., TUKSINO, P., & SOMABUT, A. (2025). OPTIMIZING SCORING RELIABILITY FOR CREATIVE MATHEMATICAL PROBLEM-SOLVING ASSESSMENTS: A GENERALIZABILITY THEORY APPROACH. TPM – Testing, Psychometrics, Methodology in Applied Psychology, 32(2 - June), 1074–1080. Retrieved from https://tpmap.org/submission/index.php/tpm/article/view/1646
