TY - GEN
T1 - Assessment in Conversational Intelligent Tutoring Systems
T2 - 24th International Conference on Artificial Intelligence in Education, AIED 2023
AU - Carmon, Colin M.
AU - Hu, Xiangen
AU - Graesser, Arthur C.
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023/6/30
Y1 - 2023/6/30
AB - This research investigates the ability of semantic text models to assess student responses during tutoring compared with expert human judges. Recent interest in text similarity has led to a proliferation of models that can potentially be used for assessing student responses; however, whether these models perform as well as traditional distributional semantic models like Latent Semantic Analysis for student response assessment in automatic short answer grading is unclear. We assessed 5166 response pairings of 219 participants across 118 electronics questions and scored each with 13 different computational text models, including models that use regular expressions, distributional semantics, word embeddings, contextual embeddings, and combinations of these features. We show a few semantic text models performing comparably to Latent Semantic Analysis, and in some cases outperforming the model. Furthermore, combination models outperformed individual models in agreement with human judges. Choosing appropriate computational techniques and optimizing the text model may continue to improve the accuracy, recall, and weighted agreement, and therefore the effectiveness, of conversational ITSs.
KW - Agents
KW - Automatic short answer grading
KW - AutoTutor
KW - Computational linguistics
KW - Context embeddings
KW - Dialogue
KW - Distributional semantics
KW - Embeddings
KW - Intelligent tutoring systems
KW - Natural language processing
UR - https://www.scopus.com/pages/publications/85164922428
U2 - 10.1007/978-3-031-36336-8_19
DO - 10.1007/978-3-031-36336-8_19
M3 - Conference article published in proceeding or book
AN - SCOPUS:85164922428
SN - 9783031363351
T3 - Communications in Computer and Information Science
SP - 121
EP - 129
BT - Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky - 24th International Conference, AIED 2023, Proceedings
A2 - Wang, Ning
A2 - Rebolledo-Mendez, Genaro
A2 - Dimitrova, Vania
A2 - Matsuda, Noboru
A2 - Santos, Olga C.
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 3 July 2023 through 7 July 2023
ER -