Blind diarization of recorded calls with arbitrary number of speakers
Abstract:
In a method of diarization of audio data, audio data is segmented into a plurality of utterances. Each utterance is represented as an utterance model representative of a plurality of feature vectors. The utterance models are clustered. A plurality of speaker models are constructed from the clustered utterance models. A hidden Markov model is constructed of the plurality of speaker models. A sequence of identified speaker models is decoded.
Information query
Patent Agency Ranking
0/0