Dictionary and index creating system and document retrieval system
    21.
    Granted invention patent
    Dictionary and index creating system and document retrieval system (status: expired)

    Publication No.: US06169999A

    Publication date: 2001-01-02

    Application No.: US09059567

    Filing date: 1998-04-14

    Applicant: Yuji Kanno

    Inventor: Yuji Kanno

    IPC classification: G06F314

    Abstract: A high-speed document retrieval system creates a regular expression dictionary and a word index from a retrieval-target document and a word dictionary, and then uses both to retrieve the document at high speed. The regular expression dictionary, in which each regular expression represents a set of character strings of the same length, is created from the word dictionary. For each character string in the retrieval-target document that matches a regular expression in the regular expression dictionary, an index element is recorded in the word index only when no other index element exists from which the index element under consideration can be deduced. The result is a word index that supports high-speed full-text retrieval without a noticeable increase in index size. The document retrieval system searches the retrieval-target document using the word dictionary, the regular expression dictionary and the word index, so high-speed full-text retrieval remains possible, without loss of retrieval efficiency, even when the retrieval character string is covered by words that are short and overlap little.
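The indexing idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not define when one index element is "deducible" from another, so the sketch assumes a simple stand-in rule (a match is skipped when a strictly longer dictionary match fully covers it). The function names, the one-alternation-per-length regex layout, and the sample words are all hypothetical.

```python
import re
from collections import defaultdict


def build_regex_dictionary(words):
    """Group dictionary words by length and compile one regex per length.

    Assumption: each regex is a plain alternation of the words of that
    length, so every regex describes strings of one fixed length.
    """
    by_length = defaultdict(list)
    for word in words:
        by_length[len(word)].append(word)
    return {
        length: re.compile("|".join(re.escape(w) for w in sorted(group)))
        for length, group in by_length.items()
    }


def find_matches(document, regex_dict):
    """Return every (start, length) at which a dictionary word occurs."""
    matches = set()
    for length, pattern in regex_dict.items():
        for start in range(len(document) - length + 1):
            if pattern.fullmatch(document, start, start + length):
                matches.add((start, length))
    return matches


def build_word_index(document, regex_dict):
    """Record a match only if no longer match covers it.

    The coverage test is a hypothetical stand-in for the abstract's
    "deducible from a different index element" condition.
    """
    matches = find_matches(document, regex_dict)
    index = defaultdict(list)
    for start, length in sorted(matches):
        covered = any(
            s <= start and start + length <= s + m and m > length
            for s, m in matches
        )
        if not covered:
            index[document[start:start + length]].append(start)
    return dict(index)


regex_dict = build_regex_dictionary(["data", "base", "database", "index"])
word_index = build_word_index("database index", regex_dict)
print(word_index)
```

With the sample input, "data" and "base" are both covered by the longer match "database", so only "database" (position 0) and "index" (position 9) are recorded. This is how skipping deducible entries keeps the index small under the assumed rule.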


    Method and apparatus for expanding similar character strings
    22.
    Granted invention patent
    Method and apparatus for expanding similar character strings (status: expired)

    Publication No.: US5835892A

    Publication date: 1998-11-10

    Application No.: US626108

    Filing date: 1996-04-03

    Applicant: Yuji Kanno

    Inventor: Yuji Kanno

    IPC classification: G06F17/30 G10L9/00

    CPC classification: G06F17/30985

    Abstract: In a similar character string expanding apparatus, a table of derivation elements and a state-transition table indicating the applicable strings of derivation types are produced according to pronunciation expanding rules. Each derivation element is composed of a derived sound, which is derived from a key sound placed at a key position of a question pronunciation character string; the sound position of the derived sound in each character string expanded from the question pronunciation character string; and one or more derivation types indicating how the derived sound at the sound position is derived from the key sound at the key position. In a character string retrieving apparatus, strings of derivation types are produced one by one by arranging the derivation types in the table of derivation elements in order of sound position, and each string of derivation types is checked against the applicable strings to judge whether it satisfies the pronunciation expanding rules. Trademark numbers corresponding to the strings of derivation types that satisfy the pronunciation expanding rules are then retrieved. Because no character string similar in pronunciation to the question pronunciation character string is used directly, the trademark numbers indicating trademarks similar to the question pronunciation character string can be retrieved at high speed.
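The check of derivation-type strings against a state-transition table can be illustrated with a toy Python sketch. Everything concrete in it is an assumption made for illustration: the derivation types ("same", "voiced"), the rule that at most one sound may be voiced, the sample sounds, and the trademark numbers do not come from the patent. For simplicity the sketch also builds each candidate string before the lookup, so it does not reproduce the speed advantage the abstract claims.

```python
from itertools import product

# Hypothetical table of derivation elements:
# sound position -> list of (derived sound, derivation type).
DERIVATION_ELEMENTS = {
    0: [("ka", "same"), ("ga", "voiced")],
    1: [("to", "same"), ("do", "voiced")],
}

# Hypothetical state-transition table encoding the rule
# "at most one sound may be voiced". A missing entry means rejection.
TRANSITIONS = {
    (0, "same"): 0,
    (0, "voiced"): 1,
    (1, "same"): 1,
}
ACCEPTING_STATES = {0, 1}

# Hypothetical trademark register: pronunciation -> trademark numbers.
TRADEMARKS = {"kato": [101], "gato": [205], "gado": [999]}


def is_applicable(derivation_types):
    """Run a string of derivation types through the transition table."""
    state = 0
    for derivation_type in derivation_types:
        state = TRANSITIONS.get((state, derivation_type))
        if state is None:
            return False
    return state in ACCEPTING_STATES


def expand(elements):
    """Enumerate strings of derivation types in order of sound position
    and keep the pronunciations whose type string is applicable."""
    positions = sorted(elements)
    accepted = []
    for combo in product(*(elements[p] for p in positions)):
        if is_applicable([dtype for _, dtype in combo]):
            accepted.append("".join(sound for sound, _ in combo))
    return accepted


def retrieve_trademarks(elements, register):
    """Collect trademark numbers for every accepted pronunciation."""
    numbers = []
    for pronunciation in expand(elements):
        numbers.extend(register.get(pronunciation, []))
    return numbers


print(expand(DERIVATION_ELEMENTS))
print(retrieve_trademarks(DERIVATION_ELEMENTS, TRADEMARKS))
```

With these tables the combination that voices both sounds ("gado") is rejected by the transition table, so its trademark number is never returned. The accepted pronunciations are "kato", "kado" and "gato", and the lookup yields 101 and 205.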
