SPEAKER DIARIZATION SUPPORTING EPISODICAL CONTENT

    Publication number: US20240160849A1

    Publication date: 2024-05-16

    Application number: US18550429

    Application date: 2022-04-27

    CPC classification number: G06F40/30

    Abstract: Embodiments are disclosed for speaker diarization supporting episodical content. In an embodiment, a method comprises: receiving media data including one or more utterances; dividing the media data into a plurality of blocks; identifying segments of each block of the plurality of blocks associated with a single speaker; extracting embeddings for the identified segments in accordance with a machine learning model, wherein extracting embeddings for identified segments further comprises statistically combining extracted embeddings for identified segments that correspond to a respective continuous utterance associated with a single speaker; clustering the embeddings for the identified segments into clusters; and assigning a speaker label to each of the embeddings for the identified segments in accordance with a result of the clustering. In some embodiments, a voiceprint is used to identify a speaker and the speaker identity for a speaker label.
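
    Illustrative sketch: The steps named in the abstract (dividing into blocks, combining embeddings per continuous utterance, clustering, and labeling) can be sketched as below. This is a minimal toy illustration, not the patented implementation. The abstract does not specify the block length, the combining statistic, the clustering algorithm, or the label format, so the fixed-length blocks, the arithmetic mean, the greedy cosine-threshold clustering, the 0.8 threshold, and the "SPEAKER_n" labels are all assumptions. The machine learning model is not reproduced here; the embeddings are hand-written two-dimensional stand-ins, and every function name is hypothetical.

```python
import math
from collections import defaultdict


def split_into_blocks(total_seconds, block_seconds):
    """Divide the timeline [0, total_seconds) into consecutive blocks.

    Fixed-length blocks are an assumption; the abstract only says "blocks".
    """
    blocks = []
    start = 0.0
    while start < total_seconds:
        end = min(start + block_seconds, total_seconds)
        blocks.append((start, end))
        start = end
    return blocks


def mean_vector(vectors):
    """Element-wise arithmetic mean of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combine_by_utterance(segments):
    """Statistically combine segment embeddings per continuous utterance.

    `segments` is a list of (utterance_id, embedding) pairs. The mean is
    an assumed choice of statistic.
    """
    grouped = defaultdict(list)
    for utterance_id, embedding in segments:
        grouped[utterance_id].append(embedding)
    return {uid: mean_vector(embs) for uid, embs in grouped.items()}


def cluster_greedy(embeddings, threshold=0.8):
    """Greedy clustering (a stand-in technique, not taken from the source).

    Each embedding joins the first cluster whose centroid has cosine
    similarity >= threshold; otherwise it starts a new cluster.
    """
    members = []  # one list of member vectors per cluster
    assignment = {}
    for key, embedding in embeddings.items():
        for index, vectors in enumerate(members):
            if cosine(embedding, mean_vector(vectors)) >= threshold:
                vectors.append(embedding)
                assignment[key] = index
                break
        else:
            members.append([embedding])
            assignment[key] = len(members) - 1
    return assignment


def assign_speaker_labels(assignment):
    """Map cluster indices to label strings (label format is assumed)."""
    return {key: f"SPEAKER_{index}" for key, index in assignment.items()}


# Toy input: utterance "u1" spans two single-speaker segments.
toy_segments = [
    ("u1", [1.0, 0.0]),
    ("u1", [0.9, 0.1]),
    ("u2", [0.0, 1.0]),
    ("u3", [0.95, 0.05]),
]
blocks = split_into_blocks(25.0, 10.0)
utterance_embeddings = combine_by_utterance(toy_segments)
labels = assign_speaker_labels(cluster_greedy(utterance_embeddings))
print(blocks)
print(labels)
```

    In this toy input, "u1" and "u3" point in nearly the same direction and share a label, while "u2" is nearly orthogonal to them and gets its own. A real system would obtain the embeddings from a trained speaker-embedding model and would likely use a more robust clustering method.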
