Web-Questionnaire-Based Corpus Creation Under Assumption of Human as Speech Targets

Kazuaki Shima, Jinhua She, Yasunari Obuchi, Abdullah M. Iliyasu

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

This paper presents a method that uses a web questionnaire to create a corpus of spontaneous utterances expressing natural ideas, which may contain grammatical mistakes. In an experimental implementation of the method, the subjects were told that they were receiving nursing care from a person and were asked to answer a web-based questionnaire in which their responses were recorded as speech utterances. Compared to the Wizard of Oz approach and interview-based corpus-creation methods, the presented method simplifies the collection of utterances. We conducted a two-fold assessment to verify the effectiveness of the presented method. First, the approach significantly reduced the workload compared to interview-style utterance collection. Second, we compared the variety of expressions collected when subjects were told that they were talking to a person with those collected when they were told that they were communicating with a nursing robot. The results indicate that, although the number of utterances was larger for a robot than for a person, the other metrics (the time efficiency index, the total number of morphemes, the average number of morphemes per utterance, the number of unique morphemes, and the coefficient of variation) were larger for a human speech target than for a robot.
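As a rough illustration of the morpheme-based metrics named in the abstract, the following minimal sketch computes them for a small set of utterances. It rests on several assumptions that the abstract does not confirm: the utterances are already segmented into morphemes (the segmentation tool is not specified here), the coefficient of variation is taken over the number of morphemes per utterance using the population standard deviation, and the function name `morpheme_metrics` and its dictionary keys are hypothetical. The time efficiency index is omitted because its definition is not given in the abstract.

```python
from statistics import mean, pstdev


def morpheme_metrics(utterances):
    """Summarize a list of utterances, each given as a list of morpheme strings.

    Assumption: morphological segmentation has already been done elsewhere.
    """
    # Number of morphemes in each utterance.
    lengths = [len(u) for u in utterances]
    if not lengths:
        return {
            "utterances": 0,
            "total_morphemes": 0,
            "mean_per_utterance": 0.0,
            "unique_morphemes": 0,
            "cv": 0.0,
        }

    avg = mean(lengths)
    # Assumption: coefficient of variation = population std / mean of
    # morphemes per utterance; the paper may define it differently.
    cv = pstdev(lengths) / avg if avg else 0.0

    return {
        "utterances": len(utterances),
        "total_morphemes": sum(lengths),
        "mean_per_utterance": avg,
        "unique_morphemes": len({m for u in utterances for m in u}),
        "cv": cv,
    }
```

Calling `morpheme_metrics` once on the utterances collected under the human-target condition and once on those from the robot-target condition would give two dictionaries whose entries can be compared side by side, in the spirit of the comparison described above.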

Original language: English
Pages (from-to): 513-520
Number of pages: 8
Journal: Journal of Advanced Computational Intelligence and Intelligent Informatics
Volume: 26
Issue number: 4
State: Published - Jul 2022

Keywords

  • corpus
  • morpheme
  • natural utterance
  • spontaneity
  • web questionnaire
