Method and apparatus for recognizing continuous speech using search space restriction based on phoneme recognition
    1.
    Granted patent
    Status: In force

    Publication No.: US08032374B2

    Publication date: 2011-10-04

    Application No.: US11950130

    Filing date: 2007-12-04

    IPC classification: G10L15/04

    CPC classification: G10L15/187 G10L2015/025

    Abstract: Provided are an apparatus and method for recognizing continuous speech using search space restriction based on phoneme recognition. In the apparatus and method, a search space can be primarily reduced by restricting connection words to be shifted at a boundary between words based on the phoneme recognition result. In addition, the search space can be secondarily reduced by rapidly calculating a degree of similarity between the connection word to be shifted and the phoneme recognition result using a phoneme code and shifting the corresponding phonemes to only connection words having degrees of similarity equal to or higher than a predetermined reference value. Therefore, the speed and performance of the speech recognition process can be improved in various speech recognition services.
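
    The similarity-based pruning described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not define the phoneme code, so the sketch assumes a bitmask over a phoneme inventory and a Jaccard overlap as the similarity measure. The names phoneme_code, similarity, and restrict_connection_words, and the toy lexicon, are hypothetical.

```python
from typing import Dict, List


def phoneme_code(phonemes: List[str], index: Dict[str, int]) -> int:
    """Pack the set of phonemes in a sequence into one integer bitmask.

    Assumption: the 'phoneme code' is modeled here as a set bitmask.
    """
    code = 0
    for p in phonemes:
        code |= 1 << index[p]
    return code


def similarity(code_a: int, code_b: int) -> float:
    """Jaccard overlap of two bitmask codes, computed with bit operations."""
    union = code_a | code_b
    if union == 0:
        return 0.0
    return bin(code_a & code_b).count("1") / bin(union).count("1")


def restrict_connection_words(
    recognized: List[str],
    lexicon: Dict[str, List[str]],
    index: Dict[str, int],
    threshold: float,
) -> List[str]:
    """Keep only candidate words whose similarity to the phoneme
    recognition result is equal to or higher than the reference value."""
    target = phoneme_code(recognized, index)
    return [
        word
        for word, phones in lexicon.items()
        if similarity(phoneme_code(phones, index), target) >= threshold
    ]


# Toy data, invented for illustration only.
INDEX = {"k": 0, "ae": 1, "t": 2, "d": 3, "ao": 4, "g": 5, "b": 6}
LEXICON = {"cat": ["k", "ae", "t"], "cab": ["k", "ae", "b"], "dog": ["d", "ao", "g"]}
kept = restrict_connection_words(["k", "ae", "t"], LEXICON, INDEX, threshold=0.5)
```

    With these toy values, "dog" shares no phonemes with the recognized sequence and is pruned before the decoder expands it, which is the point of the restriction: fewer word transitions to score at each word boundary.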

    METHOD AND APPARATUS FOR RECOGNIZING CONTINUOUS SPEECH USING SEARCH SPACE RESTRICTION BASED ON PHONEME RECOGNITION
    2.
    Patent application
    Status: In force

    Publication No.: US20080133239A1

    Publication date: 2008-06-05

    Application No.: US11950130

    Filing date: 2007-12-04

    IPC classification: G10L15/04

    CPC classification: G10L15/187 G10L2015/025

    Abstract: Provided are an apparatus and method for recognizing continuous speech using search space restriction based on phoneme recognition. In the apparatus and method, a search space can be primarily reduced by restricting connection words to be shifted at a boundary between words based on the phoneme recognition result. In addition, the search space can be secondarily reduced by rapidly calculating a degree of similarity between the connection word to be shifted and the phoneme recognition result using a phoneme code and shifting the corresponding phonemes to only connection words having degrees of similarity equal to or higher than a predetermined reference value. Therefore, the speed and performance of the speech recognition process can be improved in various speech recognition services.

    System and method for recognizing environmental sound
    3.
    Granted patent
    Status: In force

    Publication No.: US09443511B2

    Publication date: 2016-09-13

    Application No.: US13285971

    Filing date: 2011-10-31

    CPC classification: G10L15/10 G10L15/20 G10L25/00

    Abstract: A method for recognizing an environmental sound in a client device in cooperation with a server is disclosed. The client device includes a client database having a plurality of sound models of environmental sounds and a plurality of labels, each of which identifies at least one sound model. The client device receives an input environmental sound and generates an input sound model based on the input environmental sound. At the client device, a similarity value is determined between the input sound model and each of the sound models to identify one or more sound models from the client database that are similar to the input sound model. A label is selected from labels associated with the identified sound models, and the selected label is associated with the input environmental sound based on a confidence level of the selected label.
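
    The client-side matching step in the abstract can be sketched as below. This is a hedged illustration only: the abstract does not say how sound models are represented or how the confidence level is computed, so the sketch assumes fixed-length feature vectors, cosine similarity, a top-k shortlist, and a confidence equal to the winning label's share of the shortlisted similarity. The function names, the toy database, and the choice to return None when confidence is low are all hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_label(
    input_model: Sequence[float],
    client_db: List[Tuple[Sequence[float], str]],
    top_k: int = 3,
    min_confidence: float = 0.6,
) -> Tuple[Optional[str], float]:
    """Pick a label from the most similar stored models.

    Returns (label, confidence); label is None when confidence is below
    min_confidence (assumed here to mean the client cannot decide alone).
    """
    # Similarity between the input model and every stored sound model.
    scored = sorted(
        ((cosine(input_model, model), label) for model, label in client_db),
        reverse=True,
    )[:top_k]
    # Accumulate similarity per label among the shortlisted models.
    totals: Dict[str, float] = defaultdict(float)
    for sim, label in scored:
        totals[label] += max(sim, 0.0)
    mass = sum(totals.values())
    if mass == 0.0:
        return None, 0.0
    best = max(totals, key=lambda name: totals[name])
    confidence = totals[best] / mass
    return (best if confidence >= min_confidence else None), confidence


# Toy client database, invented for illustration only.
CLIENT_DB = [([1.0, 0.0], "traffic"), ([0.9, 0.1], "traffic"), ([0.0, 1.0], "cafe")]
label, confidence = select_label([1.0, 0.0], CLIENT_DB)
```

    The low-confidence branch is where cooperation with a server could plausibly take over, but the abstract does not describe that exchange, so the sketch stops at returning None.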

    SOUND RECOGNITION METHOD AND SYSTEM
    4.
    Patent application
    Status: In force

    Publication No.: US20120226497A1

    Publication date: 2012-09-06

    Application No.: US13371966

    Filing date: 2012-02-13

    IPC classification: G10L15/00

    CPC classification: G10L15/08

    Abstract: A method for generating an anti-model of a sound class is disclosed. A plurality of candidate sound data is provided for generating the anti-model. A plurality of similarity values between the plurality of candidate sound data and a reference sound model of a sound class is determined. An anti-model of the sound class is generated based on at least one candidate sound data having the similarity value within a similarity threshold range.
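
    The selection step in the abstract can be sketched as below. This is an illustration under stated assumptions, not the claimed method: the reference sound model is reduced to a single feature vector, similarity is taken as 1 / (1 + Euclidean distance), and the anti-model is the mean of the selected candidates. The name build_anti_model and all numeric values are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def similarity(sample: Sequence[float], reference: Sequence[float]) -> float:
    """Map Euclidean distance to a similarity in (0, 1]; 1.0 means identical."""
    return 1.0 / (1.0 + math.dist(sample, reference))


def build_anti_model(
    candidates: List[Sequence[float]],
    reference: Sequence[float],
    low: float,
    high: float,
) -> Optional[List[float]]:
    """Average the candidates whose similarity to the reference model lies
    within [low, high]; return None if no candidate qualifies."""
    selected = [c for c in candidates if low <= similarity(c, reference) <= high]
    if not selected:
        return None
    dims = len(reference)
    return [sum(c[d] for c in selected) / len(selected) for d in range(dims)]


# Toy data, invented for illustration only.
REFERENCE = [0.0, 0.0]
CANDIDATES = [[0.1, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
anti_model = build_anti_model(CANDIDATES, REFERENCE, low=0.3, high=0.8)
```

    In this toy run the range excludes both the near-identical candidate and the very distant one, so the anti-model is built from sounds that are close enough to be confusable with the class. How the threshold range is chosen is not stated in the abstract.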

    TEXT REGION DETECTION SYSTEM AND METHOD
    5.
    Patent application
    Status: In force

    Publication No.: US20120224765A1

    Publication date: 2012-09-06

    Application No.: US13324282

    Filing date: 2011-12-13

    IPC classification: G06K9/62 G06K9/34

    Abstract: A method for detecting a text region in an image is disclosed. The method includes detecting a candidate text region from an input image. A set of oriented gradient images is generated from the candidate text region, and one or more detection window images of the candidate text region are captured. A sum of oriented gradients is then calculated for a region in one of the oriented gradient images. It is classified whether each detection window image contains text by comparing the associated sum of oriented gradients and a threshold. Based on the classifications of the detection window images, it is determined whether the candidate text region is a true text region.
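
    The window classification described in the abstract can be sketched as below. This is a simplified illustration under several assumptions that the abstract does not state: gradients come from central differences on a grayscale image, orientations are unsigned and split into four bins, windows slide horizontally across the full height of the candidate region, each window is judged by a single orientation bin, and the region is accepted when at least half of the windows are classified as text. All function names and parameter values are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]


def oriented_gradient_images(img: Image, bins: int = 4) -> List[Image]:
    """Return one image per orientation bin holding the gradient magnitude
    of the pixels whose gradient direction falls in that bin."""
    h, w = len(img), len(img[0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(bins)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation
            b = min(int(angle / math.pi * bins), bins - 1)
            out[b][y][x] = mag
    return out


def region_sum(img: Image, top: int, left: int, height: int, width: int) -> float:
    """Sum of one oriented gradient image over a rectangular region."""
    return sum(sum(row[left:left + width]) for row in img[top:top + height])


def is_text_region(img: Image, window: int, step: int, bin_index: int,
                   threshold: float, min_fraction: float = 0.5) -> bool:
    """Classify each sliding window by comparing its oriented gradient sum
    with a threshold, then decide on the region from the window votes."""
    plane = oriented_gradient_images(img)[bin_index]
    votes = [
        region_sum(plane, 0, left, len(img), window) > threshold
        for left in range(0, len(img[0]) - window + 1, step)
    ]
    return bool(votes) and sum(votes) / len(votes) >= min_fraction


# Toy 8x8 inputs: vertical stripes (strong horizontal gradients) and a flat patch.
STRIPES = [[255.0 if x % 4 < 2 else 0.0 for x in range(8)] for _ in range(8)]
FLAT = [[128.0] * 8 for _ in range(8)]
stripes_is_text = is_text_region(STRIPES, window=4, step=2, bin_index=0, threshold=1000.0)
flat_is_text = is_text_region(FLAT, window=4, step=2, bin_index=0, threshold=1000.0)
```

    A practical detector would likely combine several regions and orientation bins per window; the single-bin threshold test here only mirrors the comparison named in the abstract.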

    METHOD AND APPARATUS FOR GROUPING CLIENT DEVICES BASED ON CONTEXT SIMILARITY
    6.
    Patent application
    Status: Under examination (published)

    Publication No.: US20120224711A1

    Publication date: 2012-09-06

    Application No.: US13371057

    Filing date: 2012-02-10

    IPC classification: G06F15/16 H04B3/00

    Abstract: A method for grouping a plurality of client devices is disclosed. The method includes receiving sound descriptors from the plurality of client devices. The sound descriptors are extracted from the environmental sound. Each of the sound descriptors is transmitted to a server, which determines a similarity of the sound descriptors received from the client devices. The server groups the plurality of client devices into at least one similar context group based on the similarity of the sound descriptors.
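
    The server-side grouping in the abstract can be sketched as below. This is a minimal illustration with assumptions the abstract does not confirm: sound descriptors are fixed-length vectors, similarity is cosine similarity, and devices are grouped as connected components over all pairs whose similarity reaches a threshold. The device identifiers, the threshold, and the name group_devices are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_devices(descriptors: Dict[str, Sequence[float]],
                  threshold: float) -> List[List[str]]:
    """Group device ids whose descriptors are pairwise linked by a
    similarity at or above the threshold (union-find over device ids)."""
    ids = list(descriptors)
    parent = {d: d for d in ids}

    def find(d: str) -> str:
        # Follow parent links to the root, compressing the path on the way.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if cosine(descriptors[a], descriptors[b]) >= threshold:
                parent[find(b)] = find(a)

    groups: Dict[str, List[str]] = {}
    for d in ids:
        groups.setdefault(find(d), []).append(d)
    return list(groups.values())


# Toy descriptors, invented for illustration only.
DESCRIPTORS = {"phone_a": [1.0, 0.0], "phone_b": [0.9, 0.1], "phone_c": [0.0, 1.0]}
groups = group_devices(DESCRIPTORS, threshold=0.9)
```

    In this sketch a device with no similar neighbour ends up in a group of its own; whether such a device counts as a context group is not specified in the abstract.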

    SYSTEM AND METHOD FOR RECOGNIZING ENVIRONMENTAL SOUND
    7.
    Patent application
    Status: In force

    Publication No.: US20120224706A1

    Publication date: 2012-09-06

    Application No.: US13285971

    Filing date: 2011-10-31

    IPC classification: H04R29/00

    CPC classification: G10L15/10 G10L15/20 G10L25/00

    Abstract: A method for recognizing an environmental sound in a client device in cooperation with a server is disclosed. The client device includes a client database having a plurality of sound models of environmental sounds and a plurality of labels, each of which identifies at least one sound model. The client device receives an input environmental sound and generates an input sound model based on the input environmental sound. At the client device, a similarity value is determined between the input sound model and each of the sound models to identify one or more sound models from the client database that are similar to the input sound model. A label is selected from labels associated with the identified sound models, and the selected label is associated with the input environmental sound based on a confidence level of the selected label.

    Text region detection system and method
    9.
    Granted patent
    Status: In force

    Publication No.: US08867828B2

    Publication date: 2014-10-21

    Application No.: US13324282

    Filing date: 2011-12-13

    IPC classification: G06K9/62 G06K9/32

    Abstract: A method for detecting a text region in an image is disclosed. The method includes detecting a candidate text region from an input image. A set of oriented gradient images is generated from the candidate text region, and one or more detection window images of the candidate text region are captured. A sum of oriented gradients is then calculated for a region in one of the oriented gradient images. It is classified whether each detection window image contains text by comparing the associated sum of oriented gradients and a threshold. Based on the classifications of the detection window images, it is determined whether the candidate text region is a true text region.

    Sound recognition method and system
    10.
    Granted patent
    Status: In force

    Publication No.: US09224388B2

    Publication date: 2015-12-29

    Application No.: US13371966

    Filing date: 2012-02-13

    IPC classification: G10L15/00 G10L15/08

    CPC classification: G10L15/08

    Abstract: A method for generating an anti-model of a sound class is disclosed. A plurality of candidate sound data is provided for generating the anti-model. A plurality of similarity values between the plurality of candidate sound data and a reference sound model of a sound class is determined. An anti-model of the sound class is generated based on at least one candidate sound data having the similarity value within a similarity threshold range.
