Intermediate data for inter-device speech processing

Granted invention patent

US11721347B1 Intermediate data for inter-device speech processing (in force)

Patent title: Intermediate data for inter-device speech processing
Application number: US17362301

Filing date: 2021-06-29
Publication number: US11721347B1

Publication date: 2023-08-08
Inventors: Stanislaw Ignacy Pasko, Pawel Zelazko, Cagdas Bak, Eli Joshua Fidler, Michal Kowalczuk, Andrew Oberlin, Ariya Rastrow
Applicant: Amazon Technologies, Inc.
Applicant address: US WA Seattle
Patentee: Amazon Technologies, Inc.
Current patentee: Amazon Technologies, Inc.
Current patentee address: US WA Seattle
Agency: Pierce Atwood LLP
Main classification: G10L17/26
IPC classification: G10L17/26; G10L15/183; G10L15/34; G10L15/22

Abstract:

Some speech processing systems may handle some commands on-device rather than sending the audio data to a second device or system for processing. The first device may have limited speech processing capabilities sufficient for handling common language and/or commands, while the second device (e.g., an edge device and/or a remote system) may call on additional language models, entity libraries, skill components, etc. to perform additional tasks. An intermediate data generator may facilitate dividing speech processing operations between devices by generating a stream of data that includes a first-pass ASR output (e.g., a word or sub-word lattice) and other characteristics of the audio data such as whisper detection, speaker identification, media signatures, etc. The second device can perform the additional processing using the data stream, e.g., without using the audio data. Thus, privacy may be enhanced by processing the audio data locally without sending it to other devices/systems.
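
The division of work described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `LatticeArc`, `IntermediateData`, `build_intermediate_data`, `to_wire`, `from_wire`, and `best_path`, the JSON wire format, and all field values are assumptions made for illustration. The first device produces a small word lattice plus audio characteristics, and the second device works only from that payload, never from the audio itself.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class LatticeArc:
    """One arc of a word (or sub-word) lattice from a first-pass ASR run."""
    start_node: int
    end_node: int
    token: str
    score: float  # hypothetical log-probability from the first pass


@dataclass
class IntermediateData:
    """Hypothetical payload sent from the first device to the second."""
    lattice: List[LatticeArc] = field(default_factory=list)
    whisper_detected: bool = False
    speaker_id: Optional[str] = None
    media_signature: Optional[str] = None


def build_intermediate_data(audio: bytes) -> IntermediateData:
    """Stand-in for on-device processing; returns fixed illustrative values.

    A real first device would run first-pass ASR and classifiers on `audio`.
    The audio bytes are deliberately not copied into the result.
    """
    return IntermediateData(
        lattice=[
            LatticeArc(0, 1, "play", -0.1),
            LatticeArc(0, 1, "pay", -1.2),
            LatticeArc(1, 2, "music", -0.2),
            LatticeArc(1, 2, "muse", -1.5),
        ],
        whisper_detected=False,
        speaker_id="speaker-1",
        media_signature="sig-abc",
    )


def to_wire(data: IntermediateData) -> str:
    """Serialize the payload (JSON is an assumption, not from the source)."""
    return json.dumps(asdict(data))


def from_wire(payload: str) -> IntermediateData:
    """Rebuild the payload on the second device."""
    raw = json.loads(payload)
    arcs = [LatticeArc(**arc) for arc in raw.pop("lattice")]
    return IntermediateData(lattice=arcs, **raw)


def best_path(arcs: List[LatticeArc]) -> List[str]:
    """Pick the highest-scoring token sequence from node 0 to the last node.

    Assumes every arc satisfies start_node < end_node, so visiting arcs in
    order of start_node processes the lattice in topological order.
    """
    final_node = max(arc.end_node for arc in arcs)
    best = {0: (0.0, [])}  # node -> (total score, tokens so far)
    for arc in sorted(arcs, key=lambda a: a.start_node):
        if arc.start_node not in best:
            continue
        score, tokens = best[arc.start_node]
        candidate = (score + arc.score, tokens + [arc.token])
        if arc.end_node not in best or candidate[0] > best[arc.end_node][0]:
            best[arc.end_node] = candidate
    return best[final_node][1]
```

In this sketch the second device would call `from_wire` on the received payload and then `best_path` on the lattice. A fuller second pass could rescore the arcs with a larger language model before choosing a path, which is the kind of additional processing the abstract attributes to the second device.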

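IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L17/00	Speaker identification or verification
G10L17/26	. Recognition of special voice characteristics, e.g. for use in lie detectors; recognition of animal voices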