TY - JOUR
T1 - Web-Questionnaire-Based Corpus Creation Under Assumption of Human as Speech Targets
AU - Shima, Kazuaki
AU - She, Jinhua
AU - Obuchi, Yasunari
AU - Iliyasu, Abdullah M.
N1 - Publisher Copyright: © Fuji Technology Press Ltd.
PY - 2022/7
Y1 - 2022/7
N2 - This paper presents a method that uses a web questionnaire to create a corpus containing spontaneous utterances of natural ideas, which may contain grammatical mistakes. In an experimental implementation of the method, the subjects were informed that they were receiving nursing care from a person, and they were required to answer a web-based questionnaire in which their responses were recorded as speech utterances. Compared to the Wizard of Oz approach and interview-based corpus-creation methods, the presented method simplifies the collection of utterances. Furthermore, we conducted a two-fold assessment to verify the effectiveness of the presented method. First, the approach exhibited a significant reduction in workload compared to interview-style utterance collection. Second, we compared the variety of expressions collected when subjects were informed that they were talking to a person with those collected when they were informed that they were communicating with a nursing robot. The results indicate that, although the number of utterances was larger for a robot than for a person, other metrics, such as the time efficiency index, the total number of morphemes, the average number of morphemes per utterance, the number of unique morphemes, and the coefficient of variation, were larger for a human speech target than for a robot.
AB - This paper presents a method that uses a web questionnaire to create a corpus containing spontaneous utterances of natural ideas, which may contain grammatical mistakes. In an experimental implementation of the method, the subjects were informed that they were receiving nursing care from a person, and they were required to answer a web-based questionnaire in which their responses were recorded as speech utterances. Compared to the Wizard of Oz approach and interview-based corpus-creation methods, the presented method simplifies the collection of utterances. Furthermore, we conducted a two-fold assessment to verify the effectiveness of the presented method. First, the approach exhibited a significant reduction in workload compared to interview-style utterance collection. Second, we compared the variety of expressions collected when subjects were informed that they were talking to a person with those collected when they were informed that they were communicating with a nursing robot. The results indicate that, although the number of utterances was larger for a robot than for a person, other metrics, such as the time efficiency index, the total number of morphemes, the average number of morphemes per utterance, the number of unique morphemes, and the coefficient of variation, were larger for a human speech target than for a robot.
KW - corpus
KW - morpheme
KW - natural utterance
KW - spontaneity
KW - web questionnaire
UR - http://www.scopus.com/inward/record.url?scp=85135255219&partnerID=8YFLogxK
U2 - 10.20965/jaciii.2022.p0513
DO - 10.20965/jaciii.2022.p0513
M3 - Article
AN - SCOPUS:85135255219
SN - 1343-0130
VL - 26
SP - 513
EP - 520
JO - Journal of Advanced Computational Intelligence and Intelligent Informatics
JF - Journal of Advanced Computational Intelligence and Intelligent Informatics
IS - 4
ER -