Speaker separation based on real-time latent speaker state characterization

Granted invention patent

US11790921B2 Speaker separation based on real-time latent speaker state characterization (in force)

Patent title: Speaker separation based on real-time latent speaker state characterization
Application number: US17169843

Filing date: 2021-02-08
Publication number: US11790921B2

Publication date: 2023-10-17
Inventors: Valentin Alain Jean Perret, Nándor Kedves, Nicolas Lucien Perony
Applicant: OTO Systems Inc.
Applicant address: US NY New York
Patent holder: OTO Systems Inc.
Current patent holder: OTO Systems Inc.
Current patent holder address: US NY New York
Agent: Schwegman Lundberg & Woessner, P.A.
Main classification: G10L17/06
IPC classifications: G10L17/06; G10L17/02; G10L17/04; G10L17/18; G06N3/04; G06N3/08; G06N3/049; G10L21/0272; G06N3/045

Abstract:

Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.
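The flow described in the abstract (chunk the incoming stream, attach an identity embedding to each chunk, then group chunks into per-speaker segments) can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how embeddings are computed or how chunks are grouped. The function names, the `toy_embedding` stand-in (a normalized magnitude spectrum in place of a learned speaker model), the greedy cosine-similarity assignment, and the `threshold` value are all assumptions made for illustration.

```python
import numpy as np


def chunk_stream(samples, chunk_size):
    """Yield (start_index, chunk) pairs of fixed-size chunks, in arrival order."""
    for start in range(0, len(samples) - chunk_size + 1, chunk_size):
        yield start, samples[start:start + chunk_size]


def toy_embedding(chunk):
    """Hypothetical stand-in for an identity embedding: unit-norm magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(chunk))
    norm = np.linalg.norm(spectrum)
    return spectrum / norm if norm > 0 else spectrum


def assign_speaker(embedding, centroids, threshold):
    """Assumed grouping rule: reuse the most similar known speaker, else open a new one."""
    if centroids:
        # Embeddings are unit-norm, so dividing by the centroid norm gives cosine similarity.
        sims = [float(embedding @ c / (np.linalg.norm(c) + 1e-12)) for c in centroids]
        best = int(np.argmax(sims))
        if sims[best] >= threshold:
            centroids[best] = centroids[best] + embedding  # running sum of member embeddings
            return best
    centroids.append(embedding.copy())
    return len(centroids) - 1


def segment_stream(samples, sample_rate, chunk_size, threshold=0.7):
    """Return a list of {'speaker', 'start', 'end'} segments, with times in seconds."""
    centroids = []
    segments = []
    for start, chunk in chunk_stream(samples, chunk_size):
        speaker = assign_speaker(toy_embedding(chunk), centroids, threshold)
        t0 = start / sample_rate
        t1 = (start + chunk_size) / sample_rate
        if segments and segments[-1]["speaker"] == speaker:
            segments[-1]["end"] = t1  # extend the current segment
        else:
            segments.append({"speaker": speaker, "start": t0, "end": t1})
    return segments


if __name__ == "__main__":
    # Two synthetic "speakers": one second of a 200 Hz tone, then one second of a 1000 Hz tone.
    sr = 8000
    t = np.arange(sr) / sr
    audio = np.concatenate([np.sin(2 * np.pi * 200 * t), np.sin(2 * np.pi * 1000 * t)])
    for seg in segment_stream(audio, sr, chunk_size=2000):
        print(seg)
```

In this sketch a chunk is assigned a speaker as soon as it arrives, so the segment list is available incrementally. A real system would presumably replace `toy_embedding` with a trained speaker-embedding network and might revise earlier assignments as more audio is observed.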

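IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L17/00	Speaker identification or verification
G10L17/06	. Decision-making techniques; pattern matching strategies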