Content output management based on speech quality

    Publication number: US11562739B2

    Publication date: 2023-01-24

    Application number: US16786629

    Filing date: 2020-02-10

    Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.
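
The conformity check described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `OutputContent`, `SpeechQualityStore`, and `conform`, the session-keyed dictionary, and the `"normal"` default are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputContent:
    """Content returned by a speechlet or skill (hypothetical shape)."""
    text: str
    quality: str  # e.g. "normal", "whispered", "shouted"


class SpeechQualityStore:
    """Persists the speech quality indicator per session (assumed keying)."""

    def __init__(self):
        self._by_session = {}

    def persist(self, session_id, indicator):
        self._by_session[session_id] = indicator

    def lookup(self, session_id, default="normal"):
        return self._by_session.get(session_id, default)


def conform(content, indicator):
    """Return content unchanged if it already matches the indicator,
    otherwise return a copy re-tagged with the indicator."""
    if content.quality == indicator:
        return content  # already conforms, no manipulation needed
    return OutputContent(text=content.text, quality=indicator)
```

In this sketch a skill that ignores a whispered request would return `quality="normal"`, and `conform` would re-tag the response as `"whispered"` before it reaches the output stage.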

    Generation and use of multiple speech processing transforms
    Type: granted invention patent (status: in force)

    Publication number: US09218806B1

    Publication date: 2015-12-22

    Application number: US13892167

    Filing date: 2013-05-10

    CPC classification number: G10L15/02 G10L15/30 G10L15/32

    Abstract: Features are disclosed for selecting and using multiple transforms associated with a particular remote device for use in automatic speech recognition (“ASR”). Each transform may be based on statistics that have been generated from processing utterances that share some characteristic (e.g., acoustic characteristics, time frame within which the utterances were processed, etc.). When an utterance is received from the remote device, a particular transform or set of transforms may be selected for use in speech processing based on data obtained from the remote device, speech processing of a portion of the utterance, speech processing of prior utterances, etc. The transform or transforms used in processing the utterances may then be updated based on the results of the speech processing.
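
One way to picture the select-then-update cycle is the sketch below. It assumes, purely for illustration, that each transform is a mean-subtraction over feature frames and that the device supplies a condition label such as `"noisy"`. The abstract does not specify the transform type or the selection signal, and `Transform` and `select_transform` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Transform:
    """A per-device transform backed by running statistics (assumed: a mean)."""
    label: str
    mean: list
    count: int = 0

    def apply(self, frame):
        # Normalize a feature frame by subtracting the stored mean.
        return [x - m for x, m in zip(frame, self.mean)]

    def update(self, frame):
        # Incremental running-mean update from a newly processed frame.
        self.count += 1
        self.mean = [m + (x - m) / self.count for m, x in zip(self.mean, frame)]


def select_transform(transforms, hint, fallback="default"):
    """Pick the transform matching the device-supplied hint, else the fallback."""
    return transforms.get(hint, transforms[fallback])
```

A device would hold several such transforms (for example one per acoustic condition), select one per utterance, and update only the one that was used.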

    Content output management based on speech quality

    Publication number: US10600408B1

    Publication date: 2020-03-24

    Application number: US15933676

    Filing date: 2018-03-23

    Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.

    Techniques for reactive route planning

    Publication number: US12111162B1

    Publication date: 2024-10-08

    Application number: US17845439

    Filing date: 2022-06-21

    CPC classification number: G01C21/3415 G01C21/3446

    Abstract: One challenge for middle-mile route planning is that the set of loads changes significantly between daily planning and execution. Systems and methods are provided for optimizing a transportation plan for a transportation network based on these load changes. The disclosed system re-optimizes a solution by starting from a previously existing plan and previously generated columns (e.g., candidate routes). The disclosed techniques significantly improve the compute time of the system to generate transportation plans that are optimized according to an optimization parameter. The system takes into account the current execution status associated with a given entry of the plan to determine whether the entry should be re-optimized. Entries corresponding to tours that have already commenced may be at least partially ignored for re-optimization consideration. The disclosed techniques enable state-aware, adaptive re-optimization even for tours that are in progress or have been tendered.
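
The two ideas in this abstract, filtering plan entries by execution status and warm-starting from previously generated columns, might be sketched as below. The status strings, `PlanEntry`, `split_for_reoptimization`, and `warm_start_columns` are assumptions for illustration. The sketch freezes commenced tours entirely, whereas the abstract allows them to be only partially ignored, and the optimizer itself is out of scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEntry:
    tour_id: str
    status: str  # assumed values: "planned", "tendered", "in_progress"


def split_for_reoptimization(plan, frozen_statuses=frozenset({"in_progress"})):
    """Separate entries kept as-is from entries eligible for re-optimization."""
    frozen = [e for e in plan if e.status in frozen_statuses]
    eligible = [e for e in plan if e.status not in frozen_statuses]
    return frozen, eligible


def warm_start_columns(previous_columns, new_columns):
    """Merge earlier candidate routes with new ones, dropping duplicates
    while preserving order, so the solver starts from the old column pool."""
    seen = set()
    merged = []
    for route in list(previous_columns) + list(new_columns):
        key = tuple(route)
        if key not in seen:
            seen.add(key)
            merged.append(key)
    return merged
```

Passing a different `frozen_statuses` set (for example one that also includes `"tendered"`) changes which entries are held fixed without touching the rest of the logic.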

    CONTENT OUTPUT MANAGEMENT BASED ON SPEECH QUALITY

    Publication number: US20200251104A1

    Publication date: 2020-08-06

    Application number: US16786629

    Filing date: 2020-02-10

    Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.

    Using adaptation data with cloud-based speech recognition
    Type: granted invention patent (status: in force)

    Publication number: US08996372B1

    Publication date: 2015-03-31

    Application number: US13664363

    Filing date: 2012-10-30

    CPC classification number: G10L15/34

    Abstract: Speech recognition may be improved using data derived from an utterance. In some embodiments, audio data is received by a user device. Adaptation data may be retrieved from a data store accessible by the user device. The audio data and the adaptation data may be transmitted to a server device. The server device may use the audio data to calculate second adaptation data. The second adaptation data may be transmitted to the user device. Synchronously or asynchronously, the server device may perform speech recognition using the audio data and the second adaptation data and transmit speech recognition results back to the user device.
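
The round trip of adaptation data between device and server could look roughly like the sketch below. It assumes the adaptation data is a running mean over audio samples, which the abstract does not state. `compute_second_adaptation` and `UserDevice` are hypothetical names, and the speech recognition step itself is left out.

```python
def compute_second_adaptation(audio, adaptation):
    """Server side: fold the new audio into the received adaptation data
    (assumed here to be a sample count and a running mean)."""
    n = adaptation["count"] + len(audio)
    total = adaptation["mean"] * adaptation["count"] + sum(audio)
    return {"count": n, "mean": total / n if n else 0.0}


class UserDevice:
    """Client side: keeps adaptation data in a local store between utterances."""

    def __init__(self):
        self.adaptation = {"count": 0, "mean": 0.0}

    def handle_utterance(self, audio, server=compute_second_adaptation):
        # Send audio plus current adaptation data, then store what comes back.
        self.adaptation = server(audio, self.adaptation)
        return self.adaptation
```

Because the device always sends its stored adaptation data along with the audio, the server in this sketch can stay stateless with respect to individual devices.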

    Listener animation
    Type: granted invention patent

    Publication number: US12254548B1

    Publication date: 2025-03-18

    Application number: US18082709

    Filing date: 2022-12-16

    Abstract: A system configured to perform style-aware listener animation. By representing different listening styles (e.g., facial expressions) using an embedding space, a single model can be trained to generate unique facial animations for a number of distinct listeners. Thus, individual listening styles can be associated with a listener identifier, enabling the system to (i) animate a plurality of different listeners with unique nonverbal behavior and/or (ii) select a particular listener identifier or desired type of listener style with which to animate. This enables the model to be generalized to new listeners to generate additional listener facial responses without needing training data for each new listener. The model may process a listener representation style or listener identifier, along with input data corresponding to a speaker talking, to generate unique facial animation responsive to the speech.

    CONTENT OUTPUT MANAGEMENT BASED ON SPEECH QUALITY

    Publication number: US20230290346A1

    Publication date: 2023-09-14

    Application number: US18098235

    Filing date: 2023-01-18

    CPC classification number: G10L15/20 G10L13/033 G10L13/10 G10L15/1807

    Abstract: Techniques for ensuring content output to a user conforms to a quality of the user's speech, even when a speechlet or skill ignores the speech's quality, are described. When a system receives speech, the system determines an indicator of the speech's quality (e.g., whispered, shouted, fast, slow, etc.) and persists the indicator in memory. When the system receives output content from a speechlet or skill, the system checks whether the output content is in conformity with the speech quality indicator. If the content conforms to the speech quality indicator, the system may cause the content to be output to the user without further manipulation. But, if the content does not conform to the speech quality indicator, the system may manipulate the content to render it in conformity with the speech quality indicator and output the manipulated content to the user.
