LANGUAGE MODEL BIASING MODULATION
    41.
    Patent application

    Publication No.: US20190237063A1

    Publication date: 2019-08-01

    Application No.: US16381167

    Filing date: 2019-04-11

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters are selected based at least on the likely context associated with the user. A context confidence score associated with the likely context is determined based on at least a portion of the context data. The one or more language model biasing parameters are adjusted based at least on the context confidence score. A baseline language model is biased based at least on one or more of the adjusted language model biasing parameters. The baseline language model is provided for use by an automated speech recognizer (ASR).
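
    The adjust-then-bias sequence in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the confidence score modifies the parameters, so scaling a single log-boost weight by the confidence is an assumption, as are the function names `adjust_biasing_weight` and `bias_language_model` and the toy phrase table.

```python
import math


def adjust_biasing_weight(base_weight: float, context_confidence: float) -> float:
    """Scale a biasing weight by the context confidence, clamped to [0, 1].

    Assumption: linear scaling. The abstract only says the parameters are
    adjusted based on the confidence score.
    """
    confidence = min(max(context_confidence, 0.0), 1.0)
    return base_weight * confidence


def bias_language_model(baseline: dict, boosted_phrases: set, weight: float) -> dict:
    """Return a renormalized copy of the baseline with boosted phrases up-weighted.

    `baseline` maps phrase -> probability; boosted phrases are multiplied by
    exp(weight), then the table is renormalized to sum to 1.
    """
    scores = {
        phrase: prob * math.exp(weight) if phrase in boosted_phrases else prob
        for phrase, prob in baseline.items()
    }
    total = sum(scores.values())
    return {phrase: score / total for phrase, score in scores.items()}


# Hypothetical toy model: two equally likely phrases, one favored by the context.
baseline = {"call mom": 0.5, "call tom": 0.5}
weight = adjust_biasing_weight(base_weight=2.0, context_confidence=0.5)
biased = bias_language_model(baseline, {"call tom"}, weight)
```

    With a confidence of 0 the weight collapses to 0 and the baseline is returned unchanged, so an uncertain context guess leaves recognition unbiased in this sketch.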

    Automatic speech recognition (ASR) utilizing GPS and sensor data

    Publication No.: US10360910B2

    Publication date: 2019-07-23

    Application No.: US15687228

    Filing date: 2017-08-25

    Abstract: An automatic speech recognition (ASR) system is disclosed that compensates for different noise environments and types of speech. The ASR system may be implemented as part of an action camera that collects status data, such as geographic location data and/or sensor data. The ASR system may perform speech recognition using an acoustic model and a speech recognition model, which are trained for operation in specific noise environments and/or for specific types of speech. The computing device may categorize a current status of the action camera, as indicated by the status data, into an action profile, which may represent a particular activity (e.g., running, cycling, etc.) or state of the computing device. The computing device may dynamically switch the acoustic model and/or the speech recognition model to compensate for anticipated changes in the noise environment and speech based upon the action profile to facilitate the recognition of various action camera functions.
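
    The profile-based switching described above might look like the following Python sketch. Everything specific in it is hypothetical: the abstract gives no thresholds, profile list, or model names, so the speed cut-offs, the `StatusData` type, and the entries in `MODELS_BY_PROFILE` are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusData:
    """Hypothetical status sample; here only a GPS-derived speed."""
    speed_kmh: float


# Hypothetical mapping: action profile -> (acoustic model, speech recognition model).
MODELS_BY_PROFILE = {
    "stationary": ("acoustic_quiet", "speech_normal"),
    "running": ("acoustic_footfall", "speech_breathless"),
    "cycling": ("acoustic_wind", "speech_shouted"),
}


def categorize(status: StatusData) -> str:
    """Map status data to an action profile using assumed speed thresholds."""
    if status.speed_kmh < 1.0:
        return "stationary"
    if status.speed_kmh < 15.0:
        return "running"
    return "cycling"


def select_models(status: StatusData) -> tuple:
    """Pick the model pair trained for the current action profile."""
    return MODELS_BY_PROFILE[categorize(status)]


acoustic_model, speech_model = select_models(StatusData(speed_kmh=25.0))
```

    A real device would presumably combine several sensors and smooth the profile over time; a single speed threshold is used here only to keep the switching step visible.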

    Speech/Dialog Enhancement Controlled by Pupillometry

    Publication No.: US20190057694A1

    Publication date: 2019-02-21

    Application No.: US15998796

    Filing date: 2018-08-16

    Inventor: Arijit Biswas

    Abstract: The present disclosure relates to methods for processing a decoded audio signal and for selectively applying speech/dialog enhancement to the decoded audio signal. The present disclosure also relates to a method of operating a headset for computer-mediated reality. A method of processing a decoded audio signal comprises obtaining a measure of a cognitive load of a listener that listens to a rendering of the audio signal, determining whether speech/dialog enhancement shall be applied based on the obtained measure of the cognitive load, and performing speech/dialog enhancement based on the determination. A method of operating a headset for computer-mediated reality comprises obtaining eye-tracking data of a wearer of the headset, determining a measure of a cognitive load of the wearer of the headset based on the eye-tracking data, and outputting an indication of the cognitive load of the wearer of the headset. The present disclosure further relates to corresponding apparatus and systems, and to methods of operating such apparatus and systems.
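
    The measure-then-decide steps can be sketched in Python as below. The abstract does not define the cognitive-load measure or the decision rule, so treating relative pupil dilation over a resting baseline as the measure and comparing it with a fixed threshold are assumptions, as are the names `cognitive_load` and `should_enhance`.

```python
from statistics import mean


def cognitive_load(pupil_diameters_mm: list, baseline_mm: float) -> float:
    """Assumed measure: relative pupil dilation over a resting baseline, floored at 0."""
    if not pupil_diameters_mm:
        raise ValueError("at least one pupil sample is required")
    if baseline_mm <= 0:
        raise ValueError("baseline diameter must be positive")
    dilation = (mean(pupil_diameters_mm) - baseline_mm) / baseline_mm
    return max(0.0, dilation)


def should_enhance(load: float, threshold: float = 0.1) -> bool:
    """Decide whether to apply speech/dialog enhancement (threshold is hypothetical)."""
    return load >= threshold


load = cognitive_load([5.0, 5.0], baseline_mm=4.0)
apply_enhancement = should_enhance(load)
```

    The enhancement itself is left out on purpose, since the abstract does not describe how it is performed.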

    OPTIMUM CONTROL METHOD BASED ON MULTI-MODE COMMAND OF OPERATION-VOICE, AND ELECTRONIC DEVICE TO WHICH SAME IS APPLIED

    Publication No.: US20190019515A1

    Publication date: 2019-01-17

    Application No.: US16134511

    Filing date: 2018-09-18

    Applicant: VTOUCH CO., LTD.

    Abstract: A control method for allowing a user to specify an electronic device and switch it to a speech recognition mode is provided. With the optimum control method and the electronic device utilizing the method, a voice command may be transmitted to the electronic device more quickly and effectively regardless of the circumstances, and the electronic device may be specified through gesture recognition to enable transmission of the voice command, so that the voice command may be effectively executed without requiring the user to learn or memorize a name or the like of the electronic device in advance for speech recognition. Further, it is possible to more accurately recognize a gesture that is a preliminary step for transmitting a voice command to the electronic device, thereby improving the recognition rate and preventing malfunction.

    Dynamic language model
    49.
    Granted patent

    Publication No.: US10140362B2

    Publication date: 2018-11-27

    Application No.: US15231066

    Filing date: 2016-08-08

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for speech recognition. One of the methods includes receiving a base language model for speech recognition including a first word sequence having a base probability value; receiving a voice search query associated with a query context; determining that a customized language model is to be used when the query context satisfies one or more criteria associated with the customized language model; obtaining the customized language model, the customized language model including the first word sequence having an adjusted probability value being the base probability value adjusted according to the query context; and converting the voice search query to a text search query based on one or more probabilities, each of the probabilities corresponding to a word sequence in a group of one or more word sequences, the group including the first word sequence having the adjusted probability value.
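
    A minimal Python sketch of the selection step follows: a customized model is used only when the query context satisfies its criteria, and the transcription is the candidate word sequence with the highest probability. The data layout, the exact-match criteria test, and the names `choose_language_model` and `transcribe` are assumptions for illustration, and the adjusted probabilities are not renormalized here.

```python
def choose_language_model(base_lm: dict, custom_lms: list, query_context: dict) -> dict:
    """Return a customized model if the context meets its criteria, else the base model.

    `custom_lms` is a list of (criteria, adjustments) pairs: `criteria` maps
    context keys to required values, and `adjustments` maps word sequences to
    their adjusted probability values.
    """
    for criteria, adjustments in custom_lms:
        if all(query_context.get(key) == value for key, value in criteria.items()):
            customized = dict(base_lm)  # copy so the base model stays untouched
            customized.update(adjustments)
            return customized
    return base_lm


def transcribe(candidates: list, lm: dict) -> str:
    """Pick the candidate word sequence with the highest model probability."""
    return max(candidates, key=lambda sequence: lm.get(sequence, 0.0))


# Hypothetical data: two acoustically similar candidates for one voice query.
base_lm = {"pizza hut": 0.2, "pizza hot": 0.3}
custom_lms = [({"location": "food court"}, {"pizza hut": 0.6})]
candidates = ["pizza hut", "pizza hot"]

with_context = transcribe(
    candidates, choose_language_model(base_lm, custom_lms, {"location": "food court"})
)
without_context = transcribe(candidates, choose_language_model(base_lm, custom_lms, {}))
```

    In this toy setup the same audio candidates resolve differently depending on whether the query context matches the customized model's criteria.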