DOCUMENT UNDERSTANDING USING CONDITIONAL RANDOM FIELDS

    Publication number: US20180129944A1

    Publication date: 2018-05-10

    Application number: US15344771

    Filing date: 2016-11-07

    CPC classification number: G06F16/93 G06F16/9024 G06N7/005 G06N20/00

    Abstract: A multi-page document is represented as a graph in which extracted page objects of the document, such as text blocks, are represented by nodes that are connected by intra-page edges and/or cross-page edges. The nodes and edges of the graph are associated with respective sets of features, the edge features distinguishing between intra-page and cross-page edges. A trained first model jointly predicts class labels for page objects, based on node and edge features. Page labels for the pages may be predicted, based on the page object predictions, optionally enforcing a constraint, such as a maximum of one class label for a given class, per page. The pages can be assigned a respective category, based on the predicted classes of the page objects and respective features. Information based on the predictions is output, such as one or more of the page object class labels, the page labels, and information based thereon.
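The graph construction and the optional per-page constraint described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the edge rule (neighbouring objects on a page, same position on adjacent pages), the `PageObject` type, the `fallback` label and the per-object score dictionaries are all assumptions made for illustration, and the scores stand in for the output of a trained model.

```python
from dataclasses import dataclass


@dataclass
class PageObject:
    page: int    # page index within the document
    index: int   # reading-order position on the page (hypothetical feature)
    text: str


def build_edges(objects):
    """Return (i, j, kind) triples, where kind is 'intra' or 'cross'.

    Assumed rule: neighbours in reading order on one page get an intra-page
    edge; objects at the same position on adjacent pages get a cross-page edge.
    """
    edges = []
    for i, a in enumerate(objects):
        for j in range(i + 1, len(objects)):
            b = objects[j]
            if a.page == b.page and abs(a.index - b.index) == 1:
                edges.append((i, j, "intra"))
            elif abs(a.page - b.page) == 1 and a.index == b.index:
                edges.append((i, j, "cross"))
    return edges


def enforce_one_per_page(objects, scores, unique_classes, fallback="other"):
    """Keep at most one object per page for each class in unique_classes.

    scores[i] maps class label -> score for object i (placeholder for the
    model's predictions). Losing objects are relabelled with `fallback`.
    """
    labels = [max(s, key=s.get) for s in scores]
    best = {}
    for i, (obj, label) in enumerate(zip(objects, labels)):
        if label not in unique_classes:
            continue
        key = (obj.page, label)
        if key not in best or scores[i][label] > scores[best[key]][label]:
            best[key] = i
    for i, (obj, label) in enumerate(zip(objects, labels)):
        if label in unique_classes and best[(obj.page, label)] != i:
            labels[i] = fallback
    return labels
```

A joint model such as a conditional random field would score labels over the whole graph; here the constraint is applied greedily after the fact, which is only one of several possible ways to enforce it.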

    2. PREDICTING TRANSLATIONAL PREFERENCES

    Type: Invention application (status: in force)

    Publication number: US20170046333A1

    Publication date: 2017-02-16

    Application number: US14825652

    Filing date: 2015-08-13

    CPC classification number: G06F17/2818 G06F17/289

    Abstract: A system and method predict an optimal machine translation system for a first of a set of users. The method includes, for each of the users, providing a respective user profile which includes rankings for at least some machine translation systems from a set of machine translation systems. The user profile of the first user is updated, based on the user profiles of at least a subset of the other users. The updating includes generating at least one missing ranking. An optimal translation system for the first user from the set of machine translation systems is predicted, based on the updated user profile computed for the first user.
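The profile-completion step in this abstract resembles collaborative filtering, and a minimal sketch along those lines follows. The abstract does not say how missing rankings are generated, so the similarity-weighted average used here is an assumption, as are the function names and the convention that profiles map a system name to a score where higher is better.

```python
def complete_profile(target, others):
    """Fill in systems missing from `target` using other users' profiles.

    Each profile maps a machine translation system name to a score
    (assumed convention: higher means preferred).
    """
    def similarity(a, b):
        shared = set(a) & set(b)
        if not shared:
            return 0.0
        # 1 / (1 + mean absolute difference) over the systems both users scored
        diff = sum(abs(a[s] - b[s]) for s in shared) / len(shared)
        return 1.0 / (1.0 + diff)

    updated = dict(target)
    missing = set().union(*others) - set(target)
    for system in missing:
        num = den = 0.0
        for other in others:
            if system in other:
                weight = similarity(target, other)
                num += weight * other[system]
                den += weight
        if den > 0:
            updated[system] = num / den   # generated ranking for a missing system
    return updated


def best_system(profile):
    """Predict the preferred system from an (updated) profile."""
    return max(profile, key=profile.get)
```

Calling `best_system(complete_profile(first_user, other_users))` then mirrors the last step of the abstract: the prediction is made from the updated profile rather than from the original, incomplete one.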

    3. CONFIDENTIALITY PRESERVING DOCUMENT ANALYSIS SYSTEM AND METHOD

    Type: Invention application (status: in force)

    Publication number: US20140101456A1

    Publication date: 2014-04-10

    Application number: US13648462

    Filing date: 2012-10-10

    CPC classification number: G06F17/2258 G06F17/2247 G06F17/2252

    Abstract: A method and system for document processing allow a service provider to process a document without having access to the textual content of the document. The system includes memory which receives an encoded source document from an associated client system. The encoded source document includes structural information and encoded content information. The encoded content information includes a plurality of encoded tokens generated by individually encoding each of a plurality of text tokens of the source document. The structural information includes location information for each of the plurality of text tokens. A processing module processes the encoded document to generate a modified document, without decoding the encoded tokens. A transmission module transmits the modified document to an associated client system whereby the client system is able to generate a transformed document based on the modified document and the plurality of text tokens.
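The division of work between client and service provider in this abstract can be sketched as follows. The encoding scheme is an assumption: the abstract only says each token is encoded individually, so a truncated keyed HMAC is used here for illustration, and the service-side operation (sorting tokens into reading order by location) is a hypothetical stand-in for whatever structural processing the service performs.

```python
import hashlib
import hmac


def encode_document(tokens, key):
    """Client side: encode each text token individually.

    tokens is a list of (text, (line, column)) pairs. Returns the encoded
    document sent to the service and a codebook that stays with the client.
    """
    codebook = {}
    encoded = []
    for text, location in tokens:
        # Assumed encoding: truncated HMAC-SHA256 of the token text
        code = hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
        codebook[code] = text
        encoded.append({"code": code, "location": location})
    return encoded, codebook


def service_sort_reading_order(encoded):
    """Service side: reorder tokens using only their locations, never decoding."""
    return sorted(encoded, key=lambda token: token["location"])


def decode_document(modified, codebook):
    """Client side: rebuild the transformed document from the modified one."""
    return [codebook[token["code"]] for token in modified]
```

The service only ever handles opaque codes plus layout information, which is the property the abstract relies on; the client recovers readable text by combining the modified document with the tokens it kept.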

    4. ENFORCING POLICIES OVER LINKED XML RESOURCES

    Type: Invention application (status: lapsed)

    Publication number: US20140101203A1

    Publication date: 2014-04-10

    Application number: US13645957

    Filing date: 2012-10-05

    CPC classification number: G06F17/30908

    Abstract: A system and method generate an ontology of linked resources. The method includes providing a policy comprising at least one logical rule which is to hold across an ontology of linked resources, and initializing a set of resources with an initial subset of the set of resources, each resource in the initial subset being identified by a respective link. Each of the resources in the subset is processed, which includes: populating the ontology with a corresponding member of a resource class; for a resource that is valid against a schema, asserting the member's class as a class specific to the schema of the validated resource in the ontology; and providing a dependency specification for extracting links within the resource, each extracted link identifying one of the set of resources. A link property is asserted in the ontology for a link between the resource of the subset containing an extracted link and the resource identified by the extracted link, and the ontology is populated with a member of the resource class for each newly identified resource. A verification that the at least one logical rule holds across the set of resources in the ontology is performed.

    5. Detection of numbered captions

    Type: Granted invention patent (status: in force)

    Publication number: US09008425B2

    Publication date: 2015-04-14

    Application number: US13752722

    Filing date: 2013-01-29

    CPC classification number: G06K9/00 G06K9/00463

    Abstract: A method of detection of numbered captions in a document includes receiving a document including a sequence of document pages and identifying illustrations on pages of the document. For each identified illustration, associated text is identified. An imitation page is generated for each of the identified illustrations, each imitation page comprising a single illustration and its associated text. For a sequence of the imitation pages, a sequence of terms is identified. Each term is derived from a text fragment of the associated text of a respective imitation page. The terms of the sequence comply with at least one predefined numbering scheme, which defines a form and an incremental state of the terms in a sequence. The terms of the identified sequence of terms are construed as being at least a part of a numbered caption for a respective illustration in the document.
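The idea of a numbering scheme that fixes both the form of a term and how it increments can be illustrated with a small sketch. It covers only one hypothetical scheme ("Fig." or "Figure" followed by an integer that increases by one) and takes the text of each imitation page as given; the `SCHEME` pattern and function name are assumptions, and the actual method may search over many candidate fragments and schemes.

```python
import re

# Hypothetical numbering scheme: "Fig.", "Fig" or "Figure" followed by an integer
SCHEME = re.compile(r"\b(Fig(?:ure)?\.?)\s*(\d+)", re.IGNORECASE)


def find_caption_sequence(imitation_texts):
    """Return one caption term per imitation page, or [] if no valid sequence.

    imitation_texts holds the text associated with each illustration, in
    document order. A sequence is valid when every page yields a term, all
    terms share the same form, and the numbers increase by exactly one.
    """
    terms = []
    for text in imitation_texts:
        match = SCHEME.search(text)
        if match is None:
            return []
        form = match.group(1).lower().rstrip(".")
        terms.append((form, int(match.group(2)), match.group(0)))
    same_form = len({form for form, _, _ in terms}) == 1
    incremental = all(b[1] - a[1] == 1 for a, b in zip(terms, terms[1:]))
    return [raw for _, _, raw in terms] if same_form and incremental else []
```

Checking the whole sequence, rather than each page in isolation, is what lets the terms be treated as parts of numbered captions instead of incidental mentions of a figure in the surrounding text.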

    6. SYSTEM AND METHOD FOR UPDATING AN ELECTRONIC CALENDAR

    Type: Invention application (status: under examination, published)

    Publication number: US20130073662A1

    Publication date: 2013-03-21

    Application number: US13677584

    Filing date: 2012-11-15

    CPC classification number: G06F15/16 G06Q10/107

    Abstract: A computer implemented system and method are disclosed for updating an electronic calendar. The method includes receiving an electronic message in a natural language in which a change in role is expressed and, with a natural language processor implemented by a computer processor, automatically detecting the change in role within the email message, optionally storing the change in role in a contacts database, and proposing updates for entries in an electronic calendar based on the detected change in role.

    7. PREDICTING THE QUALITY OF AUTOMATIC TRANSLATION OF AN ENTIRE DOCUMENT

    Type: Invention application (status: in force)

    Publication number: US20160124944A1

    Publication date: 2016-05-05

    Application number: US14532238

    Filing date: 2014-11-04

    CPC classification number: G06F17/2854

    Abstract: A system and method predict the translation quality of a translated input document. The method includes receiving an input document pair composed of a plurality of sentence pairs, each sentence pair including a source sentence in a source language and a machine translation of the source language sentence to a target language sentence. For each of the sentence pairs, a representation of the sentence pair is generated, based on a set of features extracted for the sentence pair. Using a generative model, a representation of the input document pair is generated, based on the sentence pair representations. A translation quality of the translated input document is computed, based on the representation of the input document pair.

    8. PRIVACY-PRESERVING EVIDENCE IN ALPR APPLICATIONS

    Type: Invention application (status: in force)

    Publication number: US20150172056A1

    Publication date: 2015-06-18

    Application number: US14108477

    Filing date: 2013-12-17

    Abstract: A system and method for preserving privacy of evidence are provided. In the method, an encrypted first image is generated by encrypting a first image acquired at a first location with a symmetric cryptographic key that is based on first information such as a license plate number extracted from the first image and first metadata associated with the first image, such as a time at which the first image was acquired. When a link is established between a second image and the first image, for example, through visual signature matching, the symmetric cryptographic key can be reconstructed, without having access to the first image, but based instead on the first metadata and information extracted from the second image. The reconstructed symmetric cryptographic key can then be used for decryption of the encrypted image to establish evidence that the license plate number was indeed extracted from the first image.
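The key-reconstruction idea in this abstract can be shown with a toy sketch: the key depends only on the plate number and the first image's metadata, so anyone who later reads the same plate from a second image can rebuild it. Everything concrete here is an assumption for illustration, including the SHA-256 key derivation, the function names, and the XOR keystream cipher, which is not secure and merely stands in for a real symmetric cipher.

```python
import hashlib


def derive_key(plate_number, timestamp):
    """Derive a symmetric key from the plate number and the image metadata."""
    material = f"{plate_number}|{timestamp}".encode("utf-8")
    return hashlib.sha256(material).digest()


def xor_cipher(data, key):
    """Toy symmetric cipher: XOR with a SHAKE-256 keystream (illustration only).

    Applying it twice with the same key returns the original bytes.
    """
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


# First location: encrypt the image; only ciphertext and metadata are retained.
first_metadata = "2013-12-17T10:00:00"
encrypted = xor_cipher(b"raw bytes of first image", derive_key("AB123CD", first_metadata))

# Later: the plate read from a matching second image plus the stored metadata
# is enough to rebuild the key and decrypt the first image as evidence.
recovered = xor_cipher(encrypted, derive_key("AB123CD", first_metadata))
```

Without a matching plate read the key cannot be rebuilt, so the stored first image stays unreadable; a production system would use an authenticated cipher and a proper key-derivation function rather than the placeholders above.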

    9. DETECTION OF NUMBERED CAPTIONS

    Type: Invention application (status: in force)

    Publication number: US20140212038A1

    Publication date: 2014-07-31

    Application number: US13752722

    Filing date: 2013-01-29

    CPC classification number: G06K9/00 G06K9/00463

    Abstract: A method of detection of numbered captions in a document includes receiving a document including a sequence of document pages and identifying illustrations on pages of the document. For each identified illustration, associated text is identified. An imitation page is generated for each of the identified illustrations, each imitation page comprising a single illustration and its associated text. For a sequence of the imitation pages, a sequence of terms is identified. Each term is derived from a text fragment of the associated text of a respective imitation page. The terms of the sequence comply with at least one predefined numbering scheme, which defines a form and an incremental state of the terms in a sequence. The terms of the identified sequence of terms are construed as being at least a part of a numbered caption for a respective illustration in the document.

    10. System and method for updating an electronic calendar

    Type: Granted invention patent (status: in force)

    Publication number: US09436649B2

    Publication date: 2016-09-06

    Application number: US13677584

    Filing date: 2012-11-15

    CPC classification number: G06F15/16 G06Q10/107

    Abstract: A computer implemented system and method are disclosed for updating an electronic calendar. The method includes receiving an electronic message in a natural language in which a change in role is expressed and, with a natural language processor implemented by a computer processor, automatically detecting the change in role within the email message, optionally storing the change in role in a contacts database, and proposing updates for entries in an electronic calendar based on the detected change in role.
