Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Cagdas Bak"

1.

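Invention publication
Intermediate data for inter-device speech processing (under examination, published)

Publication (announcement) number: US20240029743A1

Publication (announcement) date: 2024-01-25

Application number: US18206231

Application date: 2023-06-06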

Applicant: Amazon Technologies, Inc.

Inventor: Stanislaw Ignacy Pasko, Pawel Zelazko, Cagdas Bak, Eli Joshua Fidler, Michal Kowalczuk, Andrew Oberlin, Ariya Rastrow

IPC: G10L17/26, G10L15/183, G10L15/34, G10L15/22

CPC classification number: G10L17/26, G10L15/183, G10L15/34, G10L15/22

Abstract: Some speech processing systems may handle some commands on-device rather than sending the audio data to a second device or system for processing. The first device may have limited speech processing capabilities sufficient for handling common language and/or commands, while the second device (e.g., an edge device and/or a remote system) may call on additional language models, entity libraries, skill components, etc. to perform additional tasks. An intermediate data generator may facilitate dividing speech processing operations between devices by generating a stream of data that includes a first-pass ASR output (e.g., a word or sub-word lattice) and other characteristics of the audio data such as whisper detection, speaker identification, media signatures, etc. The second device can perform the additional processing using the data stream; e.g., without using the audio data. Thus, privacy may be enhanced by processing the audio data locally without sending it to other devices/systems.
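To make the division of work described in the abstract concrete, here is a minimal sketch of what an intermediate data record might look like. It is an illustration only, not the implementation claimed in the patent: the names `LatticeArc`, `IntermediateData`, `to_stream_record`, `from_stream_record`, and `best_path` are hypothetical, as are the JSON encoding, the use of log-probability scores, and the assumption that lattice nodes are numbered in topological order. The first device packages a first-pass ASR word lattice together with audio characteristics; the second device works from that record alone, with no audio field present.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class LatticeArc:
    """One arc of a hypothetical first-pass word lattice."""
    start_node: int
    end_node: int
    token: str
    score: float  # assumed to be a log-probability (higher is better)


@dataclass
class IntermediateData:
    """Hypothetical record sent instead of raw audio."""
    lattice: List[LatticeArc]
    whisper_detected: bool
    speaker_id: Optional[str] = None
    media_signature: Optional[str] = None


def to_stream_record(data: IntermediateData) -> str:
    """First device: serialize the record (JSON is an assumption)."""
    return json.dumps(asdict(data))


def from_stream_record(payload: str) -> IntermediateData:
    """Second device: rebuild the record from the stream."""
    raw = json.loads(payload)
    arcs = [LatticeArc(**arc) for arc in raw.pop("lattice")]
    return IntermediateData(lattice=arcs, **raw)


def best_path(arcs: List[LatticeArc], start: int = 0) -> List[str]:
    """Return the highest-scoring token sequence through the lattice.

    Assumes every arc goes from a lower to a higher node number, so
    visiting arcs in order of start_node is a valid topological order.
    """
    end = max(arc.end_node for arc in arcs)
    best = {start: (0.0, [])}  # node -> (score so far, tokens so far)
    for arc in sorted(arcs, key=lambda a: a.start_node):
        if arc.start_node not in best:
            continue  # node not reachable from the start node
        score, tokens = best[arc.start_node]
        candidate = (score + arc.score, tokens + [arc.token])
        if arc.end_node not in best or candidate[0] > best[arc.end_node][0]:
            best[arc.end_node] = candidate
    return best[end][1]


# First device: build and serialize the record; no audio samples included.
record = IntermediateData(
    lattice=[
        LatticeArc(0, 1, "play", -0.1),
        LatticeArc(0, 1, "pay", -1.2),
        LatticeArc(1, 2, "jazz", -0.3),
        LatticeArc(1, 2, "chess", -0.9),
    ],
    whisper_detected=False,
    speaker_id="speaker-1",
)
payload = to_stream_record(record)

# Second device: further processing uses only the data stream.
received = from_stream_record(payload)
print(best_path(received.lattice))
```

In this sketch the second device could rescore the lattice with a larger language model before choosing a path; the simple best-path search stands in for that step.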

2.

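Granted invention patent
Intermediate data for inter-device speech processing (in force)

Publication (announcement) number: US11721347B1

Publication (announcement) date: 2023-08-08

Application number: US17362301

Application date: 2021-06-29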

Applicant: Amazon Technologies, Inc.

Inventor: Stanislaw Ignacy Pasko, Pawel Zelazko, Cagdas Bak, Eli Joshua Fidler, Michal Kowalczuk, Andrew Oberlin, Ariya Rastrow

IPC: G10L17/26, G10L15/183, G10L15/34, G10L15/22

CPC classification number: G10L17/26, G10L15/183, G10L15/22, G10L15/34

Abstract: Some speech processing systems may handle some commands on-device rather than sending the audio data to a second device or system for processing. The first device may have limited speech processing capabilities sufficient for handling common language and/or commands, while the second device (e.g., an edge device and/or a remote system) may call on additional language models, entity libraries, skill components, etc. to perform additional tasks. An intermediate data generator may facilitate dividing speech processing operations between devices by generating a stream of data that includes a first-pass ASR output (e.g., a word or sub-word lattice) and other characteristics of the audio data such as whisper detection, speaker identification, media signatures, etc. The second device can perform the additional processing using the data stream; e.g., without using the audio data. Thus, privacy may be enhanced by processing the audio data locally without sending it to other devices/systems.
