Conundrum and Considerations in Cognitive Diagnostic Assessment for Language Proficiency Evaluation

Muhamad Firdaus Mohd Noh; Mohd Effendi Ewan bin Mohd Matore

doi:10.22610/imbr.v16i2(I).3690

Muhamad Firdaus Mohd Noh, Universiti Kebangsaan Malaysia
Mohd Effendi Ewan bin Mohd Matore, Universiti Kebangsaan Malaysia

Keywords: Cognitive diagnostic assessment, cognitive diagnostic models, cognitive diagnostic approaches, language proficiency evaluation, language assessment

Abstract

Since its first appearance in the field of language testing, cognitive diagnostic assessment (CDA) has attracted attention for its ability to reveal fine-grained information about students' cognitive abilities. However, limited research has discussed the issues that arise in implementing CDA. This article therefore offers an overview of CDA's implementation in language proficiency evaluation. It discusses the conundrums and considerations within CDA, particularly the ongoing debate between distinct classifications of cognitive diagnostic models (CDMs), and elaborates on the distinctions between the models and their implications for assessment depth and diagnostic insight. The article also examines the tension between retrofitting existing items and developing new diagnostic items, highlighting the strategic considerations in each approach. In addition, it examines the contentious issue of validating Q-matrices, a crucial step in CDA, and contrasts expert-based with empirical validation methods. These persistent challenges have implications for both theoretical frameworks and practical applications. The theoretical debate influences our understanding of cognitive processes and shapes how the extraction of diagnostic information is conceptualized. In practical terms, decisions about item development, retrofitting strategies, and Q-matrix validation methods directly affect how effectively CDA can support targeted interventions and personalized learning strategies in real-world educational contexts. Future research directions are also presented, emphasizing the need for further development of entirely new diagnostic items, hybrid CDMs, and adaptive cognitive diagnostic assessments. Practical recommendations are provided for practitioners, encouraging a strategic approach based on specific assessment goals.
