-
Publication No.: US11562739B2
Publication date: 2023-01-24
Application No.: US16786629
Filing date: 2020-02-10
Applicant: Amazon Technologies, Inc.
Inventor: Andrew Smith, Christopher Schindler, Karthik Ramakrishnan, Rohit Prasad, Michael George, Rafal Kuklinski
IPC: G10L15/20, G10L13/033, G10L13/10, G10L15/18
Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.
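Illustration: the conformity check described in this abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the `Content` class, the `conform` function, the `session` dictionary, and the string labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Content:
    """Hypothetical container for output produced by a speechlet or skill."""
    text: str
    quality: str  # how the output is rendered, e.g. "normal" or "whispered"


def conform(content: Content, indicator: str) -> Content:
    """Pass content through if it already matches the persisted indicator;
    otherwise return a copy rendered in the indicated quality."""
    if content.quality == indicator:
        return content  # conforms: output without further manipulation
    return Content(text=content.text, quality=indicator)  # manipulate to conform


# Hypothetical stand-in for the memory in which the indicator is persisted.
session = {"speech_quality": "whispered"}

skill_output = Content(text="It is 7 pm.", quality="normal")
final_output = conform(skill_output, session["speech_quality"])
```

In this sketch the skill never sees the indicator; the check runs on the system side after the skill returns, which mirrors the abstract's point that conformity holds even when the skill ignores the speech quality.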
-
Publication No.: US09218806B1
Publication date: 2015-12-22
Application No.: US13892167
Filing date: 2013-05-10
Applicant: Amazon Technologies, Inc.
IPC: G10L15/02
Abstract: Features are disclosed for selecting and using multiple transforms associated with a particular remote device for use in automatic speech recognition (“ASR”). Each transform may be based on statistics that have been generated from processing utterances that share some characteristic (e.g., acoustic characteristics, time frame within which the utterances were processed, etc.). When an utterance is received from the remote device, a particular transform or set of transforms may be selected for use in speech processing based on data obtained from the remote device, speech processing of a portion of the utterance, speech processing of prior utterances, etc. The transform or transforms used in processing the utterances may then be updated based on the results of the speech processing.
-
Publication No.: US10600408B1
Publication date: 2020-03-24
Application No.: US15933676
Filing date: 2018-03-23
Applicant: Amazon Technologies, Inc.
Inventor: Andrew Smith, Christopher Schindler, Karthik Ramakrishnan, Rohit Prasad, Michael George, Rafal Kuklinski
IPC: G10L15/20, G10L13/033, G10L13/10, G10L15/18
Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.
-
Publication No.: US12111162B1
Publication date: 2024-10-08
Application No.: US17845439
Filing date: 2022-06-21
Applicant: Amazon Technologies, Inc.
Inventor: Phillip Oliver Kriett, Quico Pepijn Spaen, Georgios Patsakis, Diwakar Tiwari, Akhand Pratap Singh, Ivan Borges Oliveira, Andrew V. Goldberg, Philip Mark Kaminsky, Karthik Ramakrishnan, Manik Kumar
IPC: G01C21/34
CPC classification number: G01C21/3415, G01C21/3446
Abstract: One challenge for middle-mile route planning is that the set of loads changes significantly between daily planning and execution. Systems and methods are provided for optimizing a transportation plan for a transportation network based on these load changes. The disclosed system re-optimizes a solution by starting from a previously existing plan and previously generated columns (e.g., candidate routes). The disclosed techniques significantly improve the compute time of the system to generate transportation plans that are optimized according to an optimization parameter. The system takes into account the current execution status associated with a given entry of the plan to determine whether the entry should be re-optimized. Entries corresponding to tours that have already commenced may be at least partially ignored for re-optimization consideration. The disclosed techniques enable state-aware, adaptive re-optimization even for tours that are in progress or have been tendered.
-
Publication No.: US20200251104A1
Publication date: 2020-08-06
Application No.: US16786629
Filing date: 2020-02-10
Applicant: Amazon Technologies, Inc.
Inventor: Andrew Smith, Christopher Schindler, Karthik Ramakrishnan, Rohit Prasad, Michael George, Rafal Kuklinski
IPC: G10L15/20, G10L15/18, G10L13/10, G10L13/033
Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.
-
Publication No.: US08996372B1
Publication date: 2015-03-31
Application No.: US13664363
Filing date: 2012-10-30
Applicant: Amazon Technologies, Inc.
Inventor: Hugh Secker-Walker, Bjorn Hoffmeister, Ryan Thomas, Stan Salvador, Karthik Ramakrishnan
CPC classification number: G10L15/34
Abstract: Speech recognition may be improved using data derived from an utterance. In some embodiments, audio data is received by a user device. Adaptation data may be retrieved from a data store accessible by the user device. The audio data and the adaptation data may be transmitted to a server device. The server device may use the audio data to calculate second adaptation data. The second adaptation data may be transmitted to the user device. Synchronously or asynchronously, the server device may perform speech recognition using the audio data and the second adaptation data and transmit speech recognition results back to the user device.
-
Publication No.: US12254548B1
Publication date: 2025-03-18
Application No.: US18082709
Filing date: 2022-12-16
Applicant: Amazon Technologies, Inc.
Inventor: Gourav Datta, Vivek Yadav, Yue Wu, Ayush Jaiswal, Rajiv M Reddy, Prateek Singhal, Karthik Ramakrishnan, Premkumar Natarajan
Abstract: A system configured to perform style-aware listener animation. By representing different listening styles (e.g., facial expressions) using an embedding space, a single model can be trained to generate unique facial animations for a number of distinct listeners. Thus, individual listening styles can be associated with a listener identifier, enabling the system to (i) animate a plurality of different listeners with unique nonverbal behavior and/or (ii) select a particular listener identifier or desired type of listener style with which to animate. This enables the model to be generalized to new listeners to generate additional listener facial responses without needing training data for each new listener. The model may process a listener representation style or listener identifier, along with input data corresponding to a speaker talking, to generate unique facial animation responsive to the speech.
-
Publication No.: US20230290346A1
Publication date: 2023-09-14
Application No.: US18098235
Filing date: 2023-01-18
Applicant: Amazon Technologies, Inc.
Inventor: Andrew Smith, Christopher Schindler, Karthik Ramakrishnan, Rohit Prasad, Michael George, Rafal Kuklinski
IPC: G10L15/20, G10L13/033, G10L13/10, G10L15/18
CPC classification number: G10L15/20, G10L13/033, G10L13/10, G10L15/1807
Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.