Analyzing documents using stored templates
    1.
    Granted invention patent
    Analyzing documents using stored templates (status: lapsed)

    Publication number: US08422786B2

    Publication date: 2013-04-16

    Application number: US12732278

    Filing date: 2010-03-26

    IPC classification: G06K9/34 G06K9/00

    Abstract: A method, a system and a computer program product for analyzing a document are disclosed. In response to receiving the document, the document is partitioned into a plurality of segments using a set of pre-defined attributes. The plurality of segments of the document is mapped with corresponding segments of at least one template selected from a set of stored templates. A first template from the set of stored templates is selected and a group of segments in the first template is identified by computing at least one of a structural similarity and a textual similarity associated with the group of segments compared with the plurality of segments of the document. A subset of segments from the group of segments is aligned with corresponding segments from the plurality of segments of the document. A set of scores is computed using a set of pre-defined criteria, in response to the mapping. The document is analyzed based on the computed set of scores.
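
The flow in this abstract (partition, map to template segments, score, pick a template) can be illustrated with a minimal sketch. It is not the patented method: splitting on blank lines stands in for the unspecified "pre-defined attributes", `difflib.SequenceMatcher` stands in for the textual similarity measure, the mean best-match similarity is one hypothetical scoring criterion, and the names `partition`, `map_segments`, `score_template`, `best_template` and `TEMPLATES` are invented for illustration.

```python
from difflib import SequenceMatcher


def partition(text):
    """Split a document into segments. Blank lines are an assumed
    stand-in for the abstract's 'pre-defined attributes'."""
    return [seg.strip() for seg in text.split("\n\n") if seg.strip()]


def textual_similarity(a, b):
    """Similarity in [0, 1] from difflib; an assumed stand-in measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def map_segments(doc_segments, template_segments):
    """Pair each document segment with its most similar template segment.

    Returns a list of (doc_index, template_index, similarity) tuples.
    """
    mapping = []
    for i, seg in enumerate(doc_segments):
        scores = [textual_similarity(seg, t) for t in template_segments]
        j = max(range(len(scores)), key=scores.__getitem__)
        mapping.append((i, j, scores[j]))
    return mapping


def score_template(doc_segments, template_segments):
    """Mean best-match similarity (a hypothetical scoring criterion)."""
    if not doc_segments or not template_segments:
        return 0.0
    mapping = map_segments(doc_segments, template_segments)
    return sum(sim for _, _, sim in mapping) / len(mapping)


def best_template(text, templates):
    """Return (name, score) of the stored template that fits the document best."""
    doc_segments = partition(text)
    scored = {name: score_template(doc_segments, segs)
              for name, segs in templates.items()}
    name = max(scored, key=scored.get)
    return name, scored[name]


# Hypothetical stored templates, each a list of expected segment texts.
TEMPLATES = {
    "invoice": ["Invoice number", "Bill to", "Total amount due"],
    "resume": ["Work experience", "Education", "Skills"],
}

doc = "Invoice number: 4711\n\nBill to: ACME Corp\n\nTotal amount due: 120 USD"
print(best_template(doc, TEMPLATES))
```

The sketch covers only textual similarity. The structural similarity and the alignment of a subset of segments mentioned in the abstract would need layout information that is not modeled here.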

    ANALYZING DOCUMENTS USING STORED TEMPLATES
    2.
    Invention patent application
    ANALYZING DOCUMENTS USING STORED TEMPLATES (status: lapsed)

    Publication number: US20110235909A1

    Publication date: 2011-09-29

    Application number: US12732278

    Filing date: 2010-03-26

    IPC classification: G06K9/34

    Abstract: A method, a system and a computer program product for analyzing a document are disclosed. In response to receiving the document, the document is partitioned into a plurality of segments using a set of pre-defined attributes. The plurality of segments of the document is mapped with corresponding segments of at least one template selected from a set of stored templates. A first template from the set of stored templates is selected and a group of segments in the first template is identified by computing at least one of a structural similarity and a textual similarity associated with the group of segments compared with the plurality of segments of the document. A subset of segments from the group of segments is aligned with corresponding segments from the plurality of segments of the document. A set of scores is computed using a set of pre-defined criteria, in response to the mapping. The document is analyzed based on the computed set of scores.

    COMPUTER-IMPLEMENTED INFORMATION REUSE
    3.
    Invention patent application
    COMPUTER-IMPLEMENTED INFORMATION REUSE (status: in force)

    Publication number: US20130103682A1

    Publication date: 2013-04-25

    Application number: US13277307

    Filing date: 2011-10-20

    IPC classification: G06F17/30

    Abstract: Embodiments of the present invention relate to an approach for reusing information/knowledge. Specifically, embodiments of the present invention provide an approach for retrieving previously stored data to satisfy queries (e.g., jobs/tickets) for solutions to problems while maintaining privacy/security of the data as well as ensuring the quality of the results. In a typical embodiment, a query for a solution to a problem is received and details are extracted therefrom. Using the details, a search is performed on a set of data stored in at least one computer storage device. Based on the search, a set of results will be generated and classified into a set of categories. In any event, the quality of each of the set of results will be assessed based on the usefulness of the set of results.

    Computer-implemented information reuse
    4.
    Granted invention patent
    Computer-implemented information reuse (status: in force)

    Publication number: US08768921B2

    Publication date: 2014-07-01

    Application number: US13277307

    Filing date: 2011-10-20

    IPC classification: G06F17/30 G06F11/07

    Abstract: Embodiments of the present invention relate to an approach for reusing information/knowledge. Specifically, embodiments of the present invention provide an approach for retrieving previously stored data to satisfy queries (e.g., jobs/tickets) for solutions to problems while maintaining privacy/security of the data as well as ensuring the quality of the results. In a typical embodiment, a query for a solution to a problem is received and details are extracted therefrom. Using the details, a search is performed on a set of data stored in at least one computer storage device. Based on the search, a set of results will be generated and classified into a set of categories. In any event, the quality of each of the set of results will be assessed based on the usefulness of the set of results.

    Automatic Speech and Concept Recognition
    5.
    Invention patent application
    Automatic Speech and Concept Recognition (status: lapsed)

    Publication number: US20130046539A1

    Publication date: 2013-02-21

    Application number: US13210471

    Filing date: 2011-08-16

    IPC classification: G10L15/22

    CPC classification: G10L15/197 G10L15/193

    Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
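
As a rough illustration of the cost adjustment described in this abstract, the sketch below raises the transition cost of any language model word that resembles a grammar word. Everything concrete in it is an assumption: spelling similarity from `difflib.SequenceMatcher` is only a proxy for acoustic similarity (a real system would compare phoneme sequences or acoustic models), and the additive penalty, `weight` and `threshold` parameters are hypothetical.

```python
from difflib import SequenceMatcher


def acoustic_similarity(word_a, word_b):
    """Spelling-based proxy in [0, 1]; an assumed stand-in for a real
    acoustic or phonetic comparison."""
    return SequenceMatcher(None, word_a, word_b).ratio()


def penalize_confusable_words(lm_costs, grammar_words, weight=2.0, threshold=0.5):
    """Return a copy of lm_costs with higher costs for words that
    resemble a grammar word.

    lm_costs maps each language model word to its transition cost
    (for example a negative log probability).
    """
    modified = dict(lm_costs)
    for word, cost in lm_costs.items():
        # Strongest similarity to any grammar word other than the word itself.
        similarity = max(
            (acoustic_similarity(word, g) for g in grammar_words if g != word),
            default=0.0,
        )
        if similarity >= threshold:
            # Hypothetical additive penalty that grows with the similarity.
            modified[word] = cost + weight * similarity
    return modified


lm_costs = {"fourteen": 3.0, "weather": 2.5}
grammar_words = ["forty"]
print(penalize_confusable_words(lm_costs, grammar_words))
```

With these inputs, "fourteen" is close enough to "forty" to be penalized while "weather" keeps its original cost. The input dictionary is left unchanged, so the original and modified models can be compared.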

    System and a Method for Generating Semantically Similar Sentences for Building a Robust SLM
    6.
    Invention patent application
    System and a Method for Generating Semantically Similar Sentences for Building a Robust SLM (status: in force)

    Publication number: US20130018649A1

    Publication date: 2013-01-17

    Application number: US13181923

    Filing date: 2011-07-13

    IPC classification: G06F17/27

    Abstract: A system and method are described for generating semantically similar sentences for a statistical language model. A semantic class generator determines for each word in an input utterance a set of corresponding semantically similar words. A sentence generator computes a set of candidate sentences each containing at most one member from each set of semantically similar words. A sentence verifier grammatically tests each candidate sentence to determine a set of grammatically correct sentences semantically similar to the input utterance. Also note that the generated semantically similar sentences are not restricted to be selected from an existing sentence database.
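
The three components named in this abstract (semantic class generator, sentence generator, sentence verifier) can be sketched as follows. This is a toy under stated assumptions, not the patented system: the `SIMILAR` table is a hand-written stand-in for a real semantic class generator, the sentence generator takes exactly one member from each set (which satisfies "at most one"), and the verifier checks a single rule, rejecting the article "a" before a vowel-initial word, where a real verifier would use a parser or grammar checker.

```python
from itertools import product

# Hypothetical semantic classes; each list includes the word itself.
SIMILAR = {
    "pay": ["pay", "settle"],
    "bill": ["bill", "invoice"],
}


def semantic_classes(utterance):
    """For each word of the utterance, return its list of similar words."""
    return [SIMILAR.get(word, [word]) for word in utterance.split()]


def candidate_sentences(utterance):
    """Combine one member of each word's class, keeping the word order."""
    return [" ".join(words) for words in product(*semantic_classes(utterance))]


def is_grammatical(sentence):
    """Toy verifier: reject 'a' directly before a vowel-initial word."""
    words = sentence.split()
    return not any(
        first == "a" and second[0] in "aeiou"
        for first, second in zip(words, words[1:])
    )


def similar_sentences(utterance):
    """Candidates that pass the verifier."""
    return [s for s in candidate_sentences(utterance) if is_grammatical(s)]


print(candidate_sentences("pay a bill"))
print(similar_sentences("pay a bill"))
```

For "pay a bill" the generator produces four candidates, and the verifier drops the two containing "a invoice". Because candidates are built by combination, none of them has to exist in a sentence database beforehand, which matches the last sentence of the abstract.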

    System and a method for generating semantically similar sentences for building a robust SLM
    7.
    Granted invention patent
    System and a method for generating semantically similar sentences for building a robust SLM (status: in force)

    Publication number: US09135237B2

    Publication date: 2015-09-15

    Application number: US13181923

    Filing date: 2011-07-13

    IPC classification: G06F17/27 G10L15/26 G06F17/28

    Abstract: A system and method are described for generating semantically similar sentences for a statistical language model. A semantic class generator determines for each word in an input utterance a set of corresponding semantically similar words. A sentence generator computes a set of candidate sentences each containing at most one member from each set of semantically similar words. A sentence verifier grammatically tests each candidate sentence to determine a set of grammatically correct sentences semantically similar to the input utterance. Also note that the generated semantically similar sentences are not restricted to be selected from an existing sentence database.

    Automatic speech and concept recognition
    8.
    Granted invention patent
    Automatic speech and concept recognition (status: lapsed)

    Publication number: US08676580B2

    Publication date: 2014-03-18

    Application number: US13210471

    Filing date: 2011-08-16

    CPC classification: G10L15/197 G10L15/193

    Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
