-
Publication No.: US11776530B2
Publication date: 2023-10-03
Application No.: US16650161
Application date: 2017-11-15
Applicant: INTEL CORPORATION
Inventor: Gabriel Amores , Guillermo Perez , Moshe Wasserblat , Michael Deisher , Loic Dufresne de Virel
IPC: G10L15/06 , G10L15/16 , G10L15/183 , G10L15/07 , G10L15/065
CPC classification number: G10L15/063 , G10L15/065 , G10L15/075 , G10L15/16 , G10L15/183 , G10L2015/0631 , G10L2015/0635
Abstract: An apparatus for speech model with personalization via ambient context harvesting, is described herein. The apparatus includes a microphone, context harvesting module, confidence module, and training module. The context harvesting module is to determine a context associated with the captured audio signals. A confidence module is to determine a confidence of the context as applied to the audio signals. A training module is to train a neural network in response to the confidence being above a predetermined threshold.
-
Publication No.: US10418028B2
Publication date: 2019-09-17
Application No.: US15813322
Application date: 2017-11-15
Applicant: Intel Corporation
Inventor: Oren Shamir , Oren Pereg , Moshe Wasserblat , Jonathan Mamou , Michel Assayag
Abstract: Technologies for detecting an end of a sentence in automatic speech recognition are disclosed. An automatic speech recognition device may acquire speech data, and identify phonemes and words of the speech data. The automatic speech recognition device may perform a syntactic parse based on the recognized words, and determine an end of a sentence based on the syntactic parse. For example, if the syntactic parse indicates that a certain set of consecutive recognized words form a syntactically complete and correct sentence, the automatic speech recognition device may determine that there is an end of a sentence at the end of that set of words.
-
Publication No.: US20180075841A1
Publication date: 2018-03-15
Application No.: US15813322
Application date: 2017-11-15
Applicant: Intel Corporation
Inventor: Oren Shamir , Oren Pereg , Moshe Wasserblat , Jonathan Mamou , Michel Assayag
IPC: G10L15/04 , G10L15/18 , G10L25/39 , G10L25/87 , G10L15/187
CPC classification number: G10L15/04 , G10L15/18 , G10L15/1822 , G10L15/187 , G10L25/39 , G10L25/87 , G10L2015/025
Abstract: Technologies for detecting an end of a sentence in automatic speech recognition are disclosed. An automatic speech recognition device may acquire speech data, and identify phonemes and words of the speech data. The automatic speech recognition device may perform a syntactic parse based on the recognized words, and determine an end of a sentence based on the syntactic parse. For example, if the syntactic parse indicates that a certain set of consecutive recognized words form a syntactically complete and correct sentence, the automatic speech recognition device may determine that there is an end of a sentence at the end of that set of words.
-
Publication No.: US20240038218A1
Publication date: 2024-02-01
Application No.: US18447846
Application date: 2023-08-10
Applicant: Intel Corporation
Inventor: Gabriel Amores , Guillermo Perez , Moshe Wasserblat , Michael Deisher , Loic Dufresne de Virel
IPC: G10L15/06 , G10L15/16 , G10L15/183 , G10L15/07 , G10L15/065
CPC classification number: G10L15/063 , G10L15/16 , G10L15/183 , G10L15/075 , G10L15/065 , G10L2015/0631 , G10L2015/0635
Abstract: An apparatus for speech model with personalization via ambient context harvesting, is described herein. The apparatus includes a microphone, context harvesting module, confidence module, and training module. The context harvesting module is to determine a context associated with the captured audio signals. A confidence module is to determine a confidence of the context as applied to the audio signals. A training module is to train a neural network in response to the confidence being above a predetermined threshold.
-
Publication No.: US20180082680A1
Publication date: 2018-03-22
Application No.: US15272078
Application date: 2016-09-21
Applicant: Intel Corporation
Inventor: Oren Pereg , Moshe Wasserblat , Jonathan Mamou , Michel Assayag
IPC: G10L15/197 , G10L15/18 , G06F17/27
CPC classification number: G10L15/197 , G06F17/274 , G10L15/02 , G10L15/1822 , G10L15/19 , G10L15/20 , G10L15/22
Abstract: A system and method for syntactic re-ranking of possible transcriptions generated by automatic speech recognition are disclosed. A computer system accesses acoustic data for a recorded spoken language and generates a plurality of potential transcriptions for the acoustic data. The computer system scores the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions. For a particular potential transcription in the plurality of transcriptions, the computer system generates a syntactical likelihood score. The computer system creates an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription.
-
Publication No.: US09837069B2
Publication date: 2017-12-05
Application No.: US14979142
Application date: 2015-12-22
Applicant: Intel Corporation
Inventor: Oren Shamir , Oren Pereg , Moshe Wasserblat , Jonathan Mamou , Michel Assayag
IPC: G10L15/04 , G10L15/187 , G10L15/18 , G10L25/87 , G10L25/39
CPC classification number: G10L15/04 , G10L15/18 , G10L15/1822 , G10L15/187 , G10L25/39 , G10L25/87 , G10L2015/025
Abstract: Technologies for detecting an end of a sentence in automatic speech recognition are disclosed. An automatic speech recognition device may acquire speech data, and identify phonemes and words of the speech data. The automatic speech recognition device may perform a syntactic parse based on the recognized words, and determine an end of a sentence based on the syntactic parse. For example, if the syntactic parse indicates that a certain set of consecutive recognized words form a syntactically complete and correct sentence, the automatic speech recognition device may determine that there is an end of a sentence at the end of that set of words.
-
Publication No.: US20160182940A1
Publication date: 2016-06-23
Application No.: US14581548
Application date: 2014-12-23
Applicant: Intel Corporation
Inventor: Michel Assayag , Shahar Taite , Moshe Wasserblat , Tomer Rider , Oren Pereg , Alexander Sivak
IPC: H04N21/431 , H04N21/45 , H04N21/4223 , H04N21/61 , H04N21/81 , H04N21/258 , H04N21/414
CPC classification number: H04N21/4312 , H04N21/25841 , H04N21/41407 , H04N21/4223 , H04N21/4524 , H04N21/4722 , H04N21/6131 , H04N21/6181 , H04N21/6582 , H04N21/8133
Abstract: Various systems and methods for providing a repositionable video display on a mobile device, to emulate the effect of user-controlled binoculars, are described herein. In one example, one or more high resolution video sources (such as UltraHD video cameras) obtain video that is wirelessly broadcasted to mobile devices. The mobile device processes the broadcast based on the approximate location of the spectator's mobile device, relative to a scene within the field of view of the mobile device. The location of the mobile device may be derived from a combination of network monitoring, camera inputs, object recognition, and the like. Accordingly, the spectator can obtain a virtual magnification of a scene from an external video source displayed on the spectator's mobile device.
-
Publication No.: US10242670B2
Publication date: 2019-03-26
Application No.: US15272078
Application date: 2016-09-21
Applicant: Intel Corporation
Inventor: Oren Pereg , Moshe Wasserblat , Jonathan Mamou , Michel Assayag
Abstract: A system and method for syntactic re-ranking of possible transcriptions generated by automatic speech recognition are disclosed. A computer system accesses acoustic data for a recorded spoken language and generates a plurality of potential transcriptions for the acoustic data. The computer system scores the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions. For a particular potential transcription in the plurality of transcriptions, the computer system generates a syntactical likelihood score. The computer system creates an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription.
-
Publication No.: US09858923B2
Publication date: 2018-01-02
Application No.: US14864456
Application date: 2015-09-24
Applicant: INTEL CORPORATION
Inventor: Moshe Wasserblat , Oren Pereg , Michel Assayag , Alexander Sivak , Shahar Taite , Tomer Rider
CPC classification number: G10L15/1815 , G10L15/063 , G10L15/10 , G10L15/14 , G10L15/183 , G10L15/32 , G10L2015/0631 , G10L2015/085
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for adaptation of language models and semantic tracking to improve automatic speech recognition (ASR). A system for recognizing phrases of speech from a conversation may include an ASR circuit configured to transcribe a user's speech to a first estimated text sequence, based on a generalized language model. The system may also include a language model matching circuit configured to analyze the first estimated text sequence to determine a context and to select a personalized language model (PLM), from a plurality of PLMs, based on that context. The ASR circuit may further be configured to re-transcribe the speech based on the selected PLM to generate a lattice of paths of estimated text sequences, wherein each of the paths of estimated text sequences comprise one or more words and an acoustic score associated with each of the words.
-
Publication No.: US20160379630A1
Publication date: 2016-12-29
Application No.: US14750757
Application date: 2015-06-25
Applicant: Intel Corporation
Inventor: Michel Assayag , Moshe Wasserblat , Oren Pereg , Shahar Taite , Alexander Sivak , Tomer Rider
CPC classification number: G10L15/22 , G10L15/30 , G10L2015/223 , G10L2015/227
Abstract: Various systems and methods for providing speech recognition services are described herein. A user device for providing speech recognition services includes a speech module to maintain a speech recognition model of a user of the user device; a user interaction module to detect an initiation of an interaction between the user and a target device; and a transmission module to transmit the speech recognition model to the target device, the target device to use the speech recognition model to enhance a speech recognition process executed by the target device during the interaction between the user and the target device.