SYSTEMS AND METHODS FOR INFORMATION INTEGRATION THROUGH CONTEXT-BASED ENTITY DISAMBIGUATION
    1.
    Patent application
    Status: Under examination (published)

    Publication No.: US20110106807A1

    Publication date: 2011-05-05

    Application No.: US12917384

    Filing date: 2010-11-01

    IPC classification: G06F17/30

    CPC classification: G06F16/288

    Abstract: Described herein are systems and methods for disambiguating entities by extracting information from multiple documents to generate a set of entity profiles, determining equivalence within the set of entity profiles using similarity matching algorithms, and integrating the information in the correlated entity profiles. Also described are systems and methods for representing entities in a document in the Resource Description Framework (RDF) and leveraging the resulting features to determine the similarity between a plurality of entities. An entity may include a person, place, location, or other entity type.
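    The abstract names three stages: building entity profiles, matching them by similarity, and integrating the matched profiles. The following is a minimal sketch of those stages, not the method claimed in the application. The `EntityProfile` class, the Jaccard measure, the exact-name check, and the 0.5 threshold are all illustrative assumptions; the abstract only says "similarity matching algorithms". Features are stored as (predicate, value) pairs, loosely mirroring RDF statements about one subject.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class EntityProfile:
    """Hypothetical profile: a name plus RDF-like (predicate, value) features."""
    name: str
    doc_ids: Set[str] = field(default_factory=set)
    features: Set[Tuple[str, str]] = field(default_factory=set)


def jaccard(a: Set[Tuple[str, str]], b: Set[Tuple[str, str]]) -> float:
    """Jaccard similarity of two feature sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_entity(p: EntityProfile, q: EntityProfile, threshold: float) -> bool:
    """Treat two profiles as equivalent if names match and contexts overlap enough."""
    return (p.name.lower() == q.name.lower()
            and jaccard(p.features, q.features) >= threshold)


def integrate(profiles: List[EntityProfile], threshold: float = 0.5) -> List[EntityProfile]:
    """Greedily fold each profile into the first equivalent merged profile."""
    merged: List[EntityProfile] = []
    for p in profiles:
        for m in merged:
            if same_entity(m, p, threshold):
                m.features |= p.features
                m.doc_ids |= p.doc_ids
                break
        else:
            # No equivalent profile yet: start a new one from a copy.
            merged.append(EntityProfile(p.name, set(p.doc_ids), set(p.features)))
    return merged
```

    Because the merge is greedy, the result can depend on the order of the input profiles; a clustering step over all pairwise similarities would avoid that at higher cost.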


    CONTEXT AWARE BACK-TRANSLITERATION AND TRANSLATION OF NAMES AND COMMON PHRASES USING WEB RESOURCES
    2.
    Patent application
    Status: Granted

    Publication No.: US20110137636A1

    Publication date: 2011-06-09

    Application No.: US12959309

    Filing date: 2010-12-02

    IPC classification: G06F17/28

    CPC classification: G06F17/2863

    Abstract: Described herein are systems and methods for transliterating and translating non-Romanized source language text strings from a plurality of electronic sources into Romanized target language text strings by converting the source text strings to a standard document encoding format, splitting them into smaller units, transforming the smaller units into entity profiles, processing the entity profiles with data from external databases, translating the entities in the entity profiles into a Romanized target language, and outputting the entities in a plurality of data formats for external systems.
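    The steps listed in the abstract (normalize the encoding, split, build profiles, consult external data, Romanize, export) can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patented method: `KNOWN_NAMES` is a hypothetical stand-in for an external database, `CHAR_MAP` is a tiny, incomplete Cyrillic-to-Latin table chosen for the example, and the choice of JSON and tab-separated output is likewise assumed.

```python
import json
import re
import unicodedata
from typing import Dict, List

# Hypothetical stand-in for an external database of known names.
KNOWN_NAMES: Dict[str, str] = {"москва": "Moscow"}

# Tiny illustrative Cyrillic-to-Latin table; not a complete standard.
CHAR_MAP: Dict[str, str] = {
    "а": "a", "в": "v", "и": "i", "к": "k",
    "м": "m", "н": "n", "о": "o", "с": "s",
}


def to_standard_encoding(raw: bytes, encoding: str) -> str:
    """Decode source bytes and normalize to NFC Unicode text."""
    return unicodedata.normalize("NFC", raw.decode(encoding))


def split_units(text: str) -> List[str]:
    """Split text into word-level units, dropping punctuation."""
    return re.findall(r"\w+", text)


def romanize(unit: str) -> str:
    """Prefer a known translation; otherwise map character by character."""
    key = unit.lower()
    if key in KNOWN_NAMES:
        return KNOWN_NAMES[key]
    return "".join(CHAR_MAP.get(ch, ch) for ch in key).capitalize()


def process(raw: bytes, encoding: str) -> List[Dict[str, str]]:
    """Run the full pipeline and return one simple profile per unit."""
    text = to_standard_encoding(raw, encoding)
    return [{"source": u, "romanized": romanize(u)} for u in split_units(text)]


def to_json(profiles: List[Dict[str, str]]) -> str:
    """Export profiles as a JSON array."""
    return json.dumps(profiles, ensure_ascii=False)


def to_tsv(profiles: List[Dict[str, str]]) -> str:
    """Export profiles as tab-separated source/romanized pairs."""
    return "\n".join(f"{p['source']}\t{p['romanized']}" for p in profiles)
```

    For example, `process("Москва, Иван".encode("koi8_r"), "koi8_r")` resolves the first unit through the lookup table and the second through the character map. Characters missing from `CHAR_MAP` pass through unchanged, which a real system would need to handle.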
