Dialog management system
    2.
    发明授权

    公开(公告)号:US11804225B1

    公开(公告)日:2023-10-31

    申请号:US17375458

    申请日:2021-07-14

    CPC classification number: G10L15/22 G10L15/1815 G10L15/30 G10L2015/223

    Abstract: Techniques for conversation recovery in a dialog management system are described. A system may determine, using dialog models, that a predicted action to be performed by a skill component is likely to result in an undesired response or that the skill component is unable to respond to a user input of a dialog session. Rather than informing the user that the skill component is unable to respond, the system may send data to the skill component to enable the skill component to determine a correct action responsive to the user input. The data may include an indication of the predicted action and/or entity data corresponding to the user input. The system may receive, from the skill component, response data corresponding to the user input, and may use the response data to update a dialog context for the dialog session and an inference engine of the dialog management system.

    VOICE PROFILE UPDATING
    3.
    发明申请

    公开(公告)号:US20210304774A1

    公开(公告)日:2021-09-30

    申请号:US17228950

    申请日:2021-04-13

    Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.

    User presence detection
    5.
    发明授权

    公开(公告)号:US10121494B1

    公开(公告)日:2018-11-06

    申请号:US15474603

    申请日:2017-03-30

    Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.

    DEEP MULTI-CHANNEL ACOUSTIC MODELING
    10.
    发明申请

    公开(公告)号:US20200349928A1

    公开(公告)日:2020-11-05

    申请号:US16932049

    申请日:2020-07-17

    Abstract: Techniques for speech processing using a deep neural network (DNN) based acoustic model front-end are described. A new modeling approach directly models multi-channel audio data received from a microphone array using a first model (e.g., multi-channel DNN) that takes in raw signals and produces a first feature vector that may be used similarly to beamformed features generated by an acoustic beamformer. A second model (e.g., feature extraction DNN) processes the first feature vector and transforms it to a second feature vector having a lower dimensional representation. A third model (e.g., classification DNN) processes the second feature vector to perform acoustic unit classification and generate text data. These three models may be jointly optimized for speech processing (as opposed to individually optimized for signal enhancement), enabling improved performance despite a reduction in microphones and a reduction in bandwidth consumption during real-time processing.

Patent Agency Ranking