Human and Machine Speaker Recognition Based on Short Trivial Events


    Trivial events such as coughs, laughs and sniffs are ubiquitous in human-to-human conversations. Compared to regular speech, these events are usually short and unclear, so they are generally regarded as not speaker discriminative and are largely ignored by present speaker recognition research. However, they are highly valuable in particular circumstances such as forensic examination: because they are less subject to intentional change, they can be used to discover the genuine speaker from disguised speech. In this paper, we collect a trivial event speech database that involves 75 speakers and 6 types of events, and report preliminary speaker recognition results on this database by both human listeners and machines. In particular, the deep feature learning technique recently proposed by our group is used to analyze and recognize the trivial events, which leads to acceptable equal error rates (EERs) despite the extremely short durations (0.2-0.5 seconds) of these events. Comparing different types of events, ‘hmm’ seems more speaker discriminative than the others.
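
For readers unfamiliar with the metric, the equal error rate mentioned in the abstract is the operating point at which the false acceptance rate (different-speaker trials accepted) equals the false rejection rate (same-speaker trials rejected). The following is only a minimal sketch of how an EER could be approximated from verification scores; it is not the authors' evaluation code, and the function name `equal_error_rate` and the scores are made up for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores of same-speaker trials (higher = more similar).
    nontarget_scores: scores of different-speaker trials.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical scores, not taken from the paper.
target = [0.9, 0.8, 0.7, 0.4]
nontarget = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(target, nontarget))
```

With only a handful of trials the result is coarse; on a real trial list the threshold sweep is usually followed by interpolation between the two nearest operating points.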

    Human and Machine Speaker Recognition Based on Short Trivial Events
    by Miao Zhang, Xiaofei Kang, Yanqing Wang, Lantian Li, Zhiyuan Tang, Haisheng Dai, Dong Wang
    https://arxiv.org/pdf/1711.05443v2.pdf
