Quality Assurance of Generative Dialog Models in an Evolving Conversational Agent Used for Swedish Language Practice

Title	Quality Assurance of Generative Dialog Models in an Evolving Conversational Agent Used for Swedish Language Practice
Publication Type	Conference Paper
Year of Publication	2022
Authors	Borg, Markus, Bengtsson, Johan, Österling, Harald, Hagelborn, Alexander, Gagner, Isabella, Tomaszewski, Piotr
Conference Name	2022 IEEE/ACM 1st International Conference on AI Engineering – Software Engineering for AI (CAIN)
Keywords	action research, AI quality, Context modeling, conversational agent, conversational agents, generative dialog model, Human Behavior, Interviews, machine learning, Metrics, natural language processing, pubcrawl, quality assurance, requirements engineering, Scalability, software engineering, Software Testing, Testing
Abstract	Due to the migration megatrend, efficient and effective second-language acquisition is vital. One proposed solution involves AI-enabled conversational agents for person-centered interactive language practice. We present results from ongoing action research targeting quality assurance of proprietary generative dialog models trained for virtual job interviews. The action team elicited a set of 38 requirements, and we designed automated test cases for the 15 of particular interest to the evolving solution. Our results show that six of the test case designs can detect meaningful differences between candidate models. While quality assurance of natural language processing applications is complex, we provide initial steps toward an automated framework for machine learning model selection in the context of an evolving conversational agent. Future work will focus on model selection in an MLOps setting.
DOI	10.1145/3522664.3528592
Citation Key	borg_quality_2022
