Multilingual document retrieval system and method using semantic vector
matching
    1.
    Type: Granted invention patent
    Status: No longer in force

    Publication number: US6006221A

    Publication date: 1999-12-21

    Application number: US696701

    Filing date: 1996-08-14

    IPC classification: G06F17/30

    Abstract: A document retrieval system where a user can enter a query, including a natural language query, in a desired one of a plurality of supported languages, and retrieve documents from a database that includes documents in at least one other language of the plurality of supported languages. The user need not have any knowledge of the other languages. Each document in the database is subjected to a set of processing steps to generate a language-independent conceptual representation of the subject content of the document. This is normally done before the query is entered. The query is also subjected to a (possibly different) set of processing steps to generate a language-independent conceptual representation of the subject content of the query. The documents and queries can also be subjected to additional analysis to provide additional term-based representations, such as the extraction of information-rich terms and phrases (such as proper nouns). Documents are matched to queries based on the conceptual-level contents of the document and query, and, optionally, on the basis of the term-based representation. The query's representation is then compared to each document's representation to generate a measure of relevance of the document to the query.
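    The matching step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the `CONCEPTS` lexicon, the concept codes, and the use of cosine similarity as the relevance measure are all assumptions made for illustration. The abstract only states that query and document are each reduced to a language-independent conceptual representation and then compared.

```python
import math
from collections import Counter

# Hypothetical lexicon mapping English and French terms to shared,
# language-independent concept codes (illustrative values only).
CONCEPTS = {
    "bank": "FINANCE", "loan": "FINANCE",
    "banque": "FINANCE", "prêt": "FINANCE",
    "election": "POLITICS", "vote": "POLITICS",
    "élection": "POLITICS", "scrutin": "POLITICS",
}

def concept_vector(text):
    """Count the concept codes of all known terms in a text."""
    tokens = text.lower().split()
    return Counter(CONCEPTS[t] for t in tokens if t in CONCEPTS)

def cosine(a, b):
    """Cosine similarity of two sparse concept vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, documents):
    """Score every document against the query, most relevant first."""
    q = concept_vector(query)
    scored = [(cosine(q, concept_vector(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

docs = ["la banque accorde un prêt", "le scrutin de l' élection"]
ranked = rank("bank loan", docs)
```

    Because both texts are compared only through concept codes, the English query ranks the French finance document first even though the two share no surface terms.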


    User interface and other enhancements for natural language information
retrieval system and method
    2.
    Type: Granted invention patent
    Status: No longer in force

    Publication number: US6026388A

    Publication date: 2000-02-15

    Application number: US696702

    Filing date: 1996-08-14

    IPC classification: G06F17/30

    Abstract: Techniques for generating sophisticated representations of the contents of both queries and documents in a retrieval system by using natural language processing (NLP) techniques to represent, index, and retrieve texts at the multiple levels (e.g., the morphological, lexical, syntactic, semantic, discourse, and pragmatic levels) at which humans construe meaning in writing. The user enters a query and the system processes the query to generate an alternative representation, which includes conceptual-level abstraction and representations based on complex nominals (CNs), proper nouns (PNs), single terms, text structure, and logical make-up of the query, including mandatory terms. After processing the query, the system displays query information to the user, indicating the system's interpretation and representation of the content of the query. The user is then given an opportunity to provide input, in response to which the system modifies the alternative representation of the query. Once the user has provided desired input, the possibly modified representation of the query is matched to the relevant document database, and measures of relevance generated for the documents. A set of documents is presented to the user, who is given an opportunity to select some or all of the documents, typically on the basis of such documents being of particular relevance. The user then initiates the generation of a query representation based on the alternative representations of the selected document(s).
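    The last step of the abstract, deriving a new query representation from the documents the user selected, can be sketched as follows. The abstract does not say how the selected documents' representations are combined; averaging sparse concept vectors is an assumption chosen for simplicity, and the function name and concept codes are hypothetical.

```python
from collections import Counter

def query_from_selected(selected_vectors):
    """Build a query vector as the mean of the selected document vectors.

    Averaging is an illustrative assumption, not the combination rule
    stated in the patent abstract.
    """
    if not selected_vectors:
        return {}
    total = Counter()
    for vec in selected_vectors:
        total.update(vec)  # adds counts per concept code
    n = len(selected_vectors)
    return {code: weight / n for code, weight in total.items()}

# Two documents the user marked as particularly relevant (made-up vectors).
selected = [{"FINANCE": 2, "LAW": 1}, {"FINANCE": 4}]
new_query = query_from_selected(selected)
```

    The resulting vector can then be matched against the document database in the same way as a representation built from a typed query.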
