Matching of Japanese Listening Test Dialogues and Anime Scene Dialogues based on Zero-shot Attribute Classification


Information

Paper title: Matching of Japanese Listening Test Dialogues and Anime Scene Dialogues based on Zero-shot Attribute Classification

Authors: Yangdi Ni, Junjie Shan, and Yoko Nishihara

Abstract: Zero-shot classification methods require no additional training, but their classification effectiveness depends on the label set supplied as input. This paper proposes a method that finds the label sets with the best classification effectiveness and then uses them to classify and match dialogues by zero-shot classification. We investigated two classification methods: (1) text embedding-based cosine similarity and (2) an end-to-end pre-trained zero-shot model. We collected 250 listening test dialogues from each level of past Japanese Language Proficiency Test (JLPT) exams and manually classified them by three attributes: (1) dialogue location, (2) speakers' relationship, and (3) dialogue style. We used these listening test dialogues to test the effectiveness of zero-shot classification under different input label sets. After comparing 212 label sets by RMSE (Root Mean Square Error), we identified the seven label sets with the best classification effectiveness. In the evaluation experiment, 314,930 anime scenes were classified with these seven label sets. We matched anime dialogue scenes to past listening test dialogues by their labels, varying the zero-shot classification method and the number of attributes used. We then calculated the word cover rate and the text similarity between the matched anime dialogue scenes and listening dialogues. The results show that, compared with the random-sampling baseline, the proposed method using text embedding-based cosine similarity reduces the number of anime scene candidates to 18.7% and yields a 0.82% increase in word cover rate and a 0.0285 increase in text similarity. In contrast, the end-to-end zero-shot model reduces the anime scene candidates to 15.2% and increases the word cover rate and text similarity by 2.13% and 0.0054, respectively.
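As an illustration of the first method named in the abstract, the following is a minimal sketch of zero-shot classification by embedding cosine similarity, followed by label-based matching. It is not the authors' implementation: the three-dimensional vectors stand in for real text embeddings, and the label names, scene identifiers, and function names are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify_zero_shot(text_vec, label_vecs):
    """Return the label whose embedding is closest to the text embedding."""
    scores = {
        label: cosine_similarity(text_vec, vec)
        for label, vec in label_vecs.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical toy embeddings for one attribute (dialogue location).
# A real system would obtain these vectors from a text embedding model.
LOCATION_LABELS = {
    "school": [0.9, 0.1, 0.0],
    "restaurant": [0.1, 0.9, 0.1],
    "office": [0.0, 0.2, 0.9],
}

listening_dialogue_vec = [0.8, 0.2, 0.1]
anime_scene_vecs = {
    "scene_a": [0.7, 0.3, 0.0],
    "scene_b": [0.1, 0.1, 0.8],
}

# Classify the listening dialogue, then keep only the anime scenes
# that receive the same label for this attribute.
target_label, _ = classify_zero_shot(listening_dialogue_vec, LOCATION_LABELS)
matched_scenes = [
    scene_id
    for scene_id, vec in anime_scene_vecs.items()
    if classify_zero_shot(vec, LOCATION_LABELS)[0] == target_label
]
print(target_label, matched_scenes)
```

In this sketch, matching on more attributes would mean repeating the same filter with one label set per attribute and keeping only the scenes that agree on all of them, which narrows the candidate pool further.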

Bibliographic information: KES2024

Presentation date: September 11, 2024