-
Publication number: US20190237063A1
Publication date: 2019-08-01
Application number: US16381167
Filing date: 2019-04-11
Applicant: Google LLC
IPC classification: G10L15/07, G10L15/24, G10L15/197, G10L15/183
CPC classification: G10L15/07, G10L15/183, G10L15/197, G10L15/24
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters are selected based at least on the likely context associated with the user. A context confidence score associated with the likely context is determined based on at least a portion of the context data. One or more language model biasing parameters are adjusted based at least on the context confidence score. A baseline language model is biased based at least on one or more of the adjusted language model biasing parameters. The baseline language model is provided for use by an automated speech recognizer (ASR).
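
The flow in the abstract above (select a biasing parameter for the likely context, scale it by the context confidence score, then bias the baseline model) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the patented method: the language model is a plain dictionary of phrase probabilities, the biasing parameter is a single log-domain boost, and the names `adjust_bias` and `bias_language_model` are hypothetical.

```python
import math


def adjust_bias(boost: float, confidence: float) -> float:
    """Scale a log-domain boost by a confidence score clamped to [0, 1]."""
    clamped = min(max(confidence, 0.0), 1.0)
    return boost * clamped


def bias_language_model(baseline, context_terms, boost, confidence):
    """Return a renormalized copy of `baseline` with context terms boosted.

    baseline:      dict mapping phrase -> probability (assumed toy model)
    context_terms: phrases tied to the likely context
    boost:         biasing parameter selected for that context (log domain)
    confidence:    context confidence score in [0, 1]
    """
    factor = math.exp(adjust_bias(boost, confidence))
    scores = {
        phrase: prob * factor if phrase in context_terms else prob
        for phrase, prob in baseline.items()
    }
    total = sum(scores.values())
    return {phrase: score / total for phrase, score in scores.items()}


baseline = {"play music": 0.5, "pay music": 0.5}
biased = bias_language_model(baseline, {"play music"}, math.log(3.0), 1.0)
print(biased)
```

With a confidence of 0 the baseline is returned unchanged, and the boost grows smoothly as confidence rises, which is the modulation the abstract describes.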
-
Publication number: US10360910B2
Publication date: 2019-07-23
Application number: US15687228
Filing date: 2017-08-25
IPC classification: G10L15/00, G10L15/22, H04W52/02, G10L15/18, G10L15/24, G10L15/20, G10L15/06, H04N5/232, H04M1/725, G10L15/08, G06F3/16, G10L15/183, G10L15/02
Abstract: An automatic speech recognition (ASR) system is disclosed that compensates for different noise environments and types of speech. The ASR system may be implemented as part of an action camera that collects status data, such as geographic location data and/or sensor data. The ASR system may perform speech recognition using an acoustic model and a speech recognition model, which are trained for operation in specific noise environments and/or for specific types of speech. The computing device may categorize a current status of the action camera, as indicated by the status data, into an action profile, which may represent a particular activity (e.g., running, cycling, etc.) or state of the computing device. The computing device may dynamically switch the acoustic model and/or the speech recognition model to compensate for anticipated changes in the noise environment and speech based upon the action profile to facilitate the recognition of various action camera functions.
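
As a rough sketch of the profile-driven switching described above: status data is categorized into an action profile, and the profile selects an acoustic model and a speech recognition model. The speed thresholds, profile names, and model identifiers below are all hypothetical placeholders; the abstract does not specify how the status data is categorized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPair:
    acoustic_model: str
    speech_model: str


# Hypothetical profile-to-model table; the names are illustrative only.
PROFILE_MODELS = {
    "stationary": ModelPair("am_quiet", "sr_normal"),
    "running": ModelPair("am_footfall", "sr_breathless"),
    "cycling": ModelPair("am_wind", "sr_raised_voice"),
}


def categorize(speed_kmh: float) -> str:
    """Map status data (here only speed) to an action profile.

    The thresholds are assumptions chosen for the example.
    """
    if speed_kmh < 2.0:
        return "stationary"
    if speed_kmh < 15.0:
        return "running"
    return "cycling"


class Recognizer:
    """Holds the active model pair and switches it when the profile changes."""

    def __init__(self) -> None:
        self.profile = "stationary"
        self.models = PROFILE_MODELS[self.profile]

    def update(self, speed_kmh: float) -> bool:
        """Re-categorize the status; return True if the models were switched."""
        profile = categorize(speed_kmh)
        if profile == self.profile:
            return False
        self.profile = profile
        self.models = PROFILE_MODELS[profile]
        return True


recognizer = Recognizer()
recognizer.update(25.0)
print(recognizer.profile, recognizer.models)
```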
-
Publication number: US20190189122A1
Publication date: 2019-06-20
Application number: US16301058
Filing date: 2017-02-21
Applicant: SONY CORPORATION
Inventor: Saki YOKOYAMA
CPC classification: G10L15/22, G06F3/013, G06F3/0488, G06F3/04883, G06F3/16, G06F16/00, G06F17/2223, G06F17/24, G06F17/2765, G06F17/2863, G10L15/00, G10L15/02, G10L15/24, G10L15/265, G10L15/30, G10L2015/223
Abstract: [Object] To provide an information processing device and information processing method that are capable of emending a sentence by inputting voice. [Solution] The information processing device includes: a transmission unit configured to transmit voice information including an emendatory command and an emendation target of a sentence; and a reception unit configured to receive a process result based on the emendatory command and the emendation target.
-
Publication number: US20190057694A1
Publication date: 2019-02-21
Application number: US15998796
Filing date: 2018-08-16
Inventor: Arijit Biswas
IPC classification: G10L15/22, G10L25/78, G10L15/24, G10L21/0364, G06F3/01
Abstract: The present disclosure relates to methods for processing a decoded audio signal and for selectively applying speech/dialog enhancement to the decoded audio signal. The present disclosure also relates to a method of operating a headset for computer-mediated reality. A method of processing a decoded audio signal comprises obtaining a measure of a cognitive load of a listener that listens to a rendering of the audio signal, determining whether speech/dialog enhancement shall be applied based on the obtained measure of the cognitive load, and performing speech/dialog enhancement based on the determination. A method of operating a headset for computer-mediated reality comprises obtaining eye-tracking data of a wearer of the headset, determining a measure of a cognitive load of the wearer of the headset based on the eye-tracking data, and outputting an indication of the cognitive load of the wearer of the headset. The present disclosure further relates to corresponding apparatus and systems, and to methods of operating such apparatus and systems.
-
Publication number: US20190019515A1
Publication date: 2019-01-17
Application number: US16134511
Filing date: 2018-09-18
Applicant: VTOUCH CO., LTD.
Inventors: Seokjoong KIM, Chunghoon KIM, So Yeon KIM
Abstract: A control method for allowing a user to specify an electronic device and switch it to a speech recognition mode is provided. With the optimum control method and the electronic device utilizing the method, a voice command may be transmitted to the electronic device more quickly and effectively regardless of the circumstances, and the electronic device may be specified through gesture recognition to enable transmission of the voice command, so that the voice command may be effectively executed without requiring the user to learn or memorize a name or the like of the electronic device in advance for speech recognition. Further, it is possible to more accurately recognize a gesture that is a preliminary step for transmitting a voice command to the electronic device, thereby improving the recognition rate and preventing malfunction.
-
Publication number: US10176807B2
Publication date: 2019-01-08
Application number: US15604358
Filing date: 2017-05-24
Inventors: Manuel Roman, Mara Clair Segal, Dwipal Desai, Andrew E. Rubin
Abstract: A home assistant device assisting with the setup of devices is described. An assistant device can determine setup instructions for devices. The setup instructions for one device can be determined to include a step requesting information related to the setup of another device. The setup of the devices can be ordered based on that determination. The setup instructions can then be provided.
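
One way to picture the ordering step above is as a dependency sort: if a setup step for one device requests information from another device's setup, the other device is set up first. The abstract does not say which direction the ordering takes or how it is computed, so the topological sort below is an assumed technique, and the `needs_info_from` field and the device names are hypothetical.

```python
from graphlib import TopologicalSorter


def order_setup(instructions: dict[str, list[dict]]) -> list[str]:
    """Order devices so that any device whose setup is referenced comes first.

    instructions maps a device name to its setup steps. A step may carry a
    hypothetical "needs_info_from" key naming another device whose setup
    supplies the requested information.
    """
    graph: dict[str, set[str]] = {device: set() for device in instructions}
    for device, steps in instructions.items():
        for step in steps:
            other = step.get("needs_info_from")
            if other is not None:
                graph[device].add(other)
    # static_order() yields each device after all devices it depends on.
    return list(TopologicalSorter(graph).static_order())


instructions = {
    "thermostat": [{"text": "Enter Wi-Fi password", "needs_info_from": "wifi_router"}],
    "wifi_router": [{"text": "Choose a network name"}],
    "smart_bulb": [{"text": "Join the network", "needs_info_from": "wifi_router"}],
}
print(order_setup(instructions))
```

If two devices each request information from the other, `static_order()` raises `graphlib.CycleError`, so a real system would need a fallback for that case.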
-
Publication number: US10170111B2
Publication date: 2019-01-01
Application number: US15422024
Filing date: 2017-02-01
IPC classification: G06F15/00, G10L15/22, G10L15/24, H04N21/414, B60K35/00, G06F3/048, H04N21/422, H04N21/442, H04N21/45
Abstract: A vehicular infotainment system, a vehicle and a method of controlling interaction between a vehicle driver and an infotainment system. A multimedia device, a human-machine interface and sensors are used to collect in-vehicle driver characteristic data and extra-vehicle driving conditions. The system additionally includes (or is otherwise coupled to) a computer to convert one or both of traffic pattern data and vehicular positional data into a driver elevated cognitive load profile. In addition, the computer converts the driver characteristic data into a driver mood profile. The system can process these profiles to selectively adjust one or both of the amount of time needed to accept audio commands from a driver and the amount of time needed to provide an audio response to the driver in situations where the system determines the presence of at least one of the elevated cognitive load and a driver mood.
-
Publication number: US20180374477A1
Publication date: 2018-12-27
Application number: US15634863
Filing date: 2017-06-27
Applicant: Google Inc.
CPC classification: G10L15/22, G06T7/60, G10L15/00, G10L15/02, G10L15/063, G10L15/24, G10L25/78, G10L2015/226, G10L2015/227
Abstract: An example method includes receiving, by a computing system, an indication of one or more audible sounds that are detected by a first sensing device, the one or more audible sounds originating from a user; determining, by the computing system and based at least in part on an indication of one or more signals detected by a second sensing device, a distance between the user and the second sensing device; determining, by the computing system and based at least in part on the indication of the one or more audible sounds, one or more acoustic features that are associated with the one or more audible sounds; and determining, by the computing system, and based at least in part on the one or more acoustic features and the distance between the user and the second sensing device, one or more words that correspond to the audible sounds.
-
Publication number: US10140362B2
Publication date: 2018-11-27
Application number: US15231066
Filing date: 2016-08-08
Applicant: Google LLC
IPC classification: G06K9/00, G06F17/30, G10L15/26, G10L15/197, G10L15/00, G10L15/14, G10L15/24, G10L15/22, G10L15/08, G10L15/06
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for speech recognition. One of the methods includes receiving a base language model for speech recognition including a first word sequence having a base probability value; receiving a voice search query associated with a query context; determining that a customized language model is to be used when the query context satisfies one or more criteria associated with the customized language model; obtaining the customized language model, the customized language model including the first word sequence having an adjusted probability value being the base probability value adjusted according to the query context; and converting the voice search query to a text search query based on one or more probabilities, each of the probabilities corresponding to a word sequence in a group of one or more word sequences, the group including the first word sequence having the adjusted probability value.
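
The method above can be sketched as two steps: pick a customized language model when the query context meets its criteria, then choose the candidate word sequence with the highest probability. This is a simplified illustration under stated assumptions: models are dictionaries of word-sequence probabilities, criteria are exact key/value matches on the context, the adjustment is a multiplicative factor on the base probability (left unnormalized), and the function names are hypothetical.

```python
def select_language_model(base, customized_models, query_context):
    """Return a customized copy of `base` if a criteria set matches, else `base`.

    customized_models: list of (criteria, factors) pairs, where criteria is a
    dict the query context must match and factors maps a word sequence to a
    multiplier applied to its base probability (an assumed adjustment rule).
    """
    for criteria, factors in customized_models:
        if all(query_context.get(key) == value for key, value in criteria.items()):
            model = dict(base)
            for sequence, factor in factors.items():
                model[sequence] = base.get(sequence, 0.0) * factor
            return model
    return base


def transcribe(candidates, model):
    """Pick the candidate word sequence with the highest model probability."""
    return max(candidates, key=lambda sequence: model.get(sequence, 0.0))


base = {"new york pizza": 0.4, "new work pizza": 0.6}
customized = [({"location": "nyc"}, {"new york pizza": 2.0})]
candidates = ["new york pizza", "new work pizza"]

model = select_language_model(base, customized, {"location": "nyc"})
print(transcribe(candidates, model))
```

Without a matching context the base model is used as-is, so the same audio can resolve to a different text query depending on where or how the query was issued.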
-
Publication number: US20180329679A1
Publication date: 2018-11-15
Application number: US16044114
Filing date: 2018-07-24
CPC classification: G06F3/167, G06F3/005, G06F3/017, G06F3/0304, G06F2203/0381, G06N99/005, G10L15/04, G10L15/1822, G10L15/22, G10L15/24, G10L15/265, G10L2015/227
Abstract: A method and system are provided. The method includes receiving, by a microphone and camera, user utterances indicative of user commands and associated user gestures for the user utterances. The method further includes parsing, by a hardware-based recognizer, sample utterances and the user utterances into verb parts and noun parts. The method also includes recognizing, by a hardware-based recognizer, the user utterances and the associated user gestures based on the sample utterances and descriptions of associated supporting gestures for the sample utterances. The recognizing step includes comparing the verb parts and the noun parts from the user utterances individually and as pairs to the verb parts and the noun parts of the sample utterances. The method additionally includes selectively performing a given one of the user commands responsive to a recognition result.
-