Character recognition method and apparatus
    1.
    Granted invention patent
    Legal status: lapsed

    Publication number: US5768451A

    Publication date: 1998-06-16

    Application number: US867774

    Filing date: 1997-06-02

    CPC classification: G06K9/723 G06K2209/01

    Abstract: A character recognition method uses linguistic knowledge to correct erroneously recognized characters. In this method, extracting candidate words by searching a word dictionary accounts for a large part of the processing. To speed up candidate word extraction, the method searches the dictionary using a group of candidate characters or a dictionary header word for processing inflected forms of verbs. The method also calculates a word matching cost to make the correction of recognition errors more efficient. The word search uses a "hybrid method" that combines "candidate-character-driven word extraction" and "dictionary-driven word extraction". Moreover, the word dictionary has header words composed of an inflectional ending of a verb and an auxiliary verb or a particle appended to that ending. The method gives more weight to a difference in matching cost for characters with a high overall confidence ratio than to a difference in matching cost for characters with a low overall confidence ratio.
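
The two extraction strategies named in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the recognizer output, the tiny dictionary, the rule for choosing a strategy (compare the number of candidate combinations with the dictionary size), and the toy cost (sum of 1 minus confidence). None of it is taken from the patent's claims.

```python
from itertools import product

# Hypothetical recognizer output: for each character position, a list of
# (candidate character, confidence) pairs. Values are invented for illustration.
candidates = [
    [("c", 0.9), ("e", 0.4)],
    [("a", 0.8), ("o", 0.7)],
    [("t", 0.6), ("l", 0.5)],
]
dictionary = {"cat", "cot", "eat", "dog"}


def candidate_driven(cands, words):
    """Enumerate candidate-character combinations and keep those in the dictionary."""
    found = set()
    for combo in product(*[[ch for ch, _ in pos] for pos in cands]):
        word = "".join(combo)
        if word in words:
            found.add(word)
    return found


def dictionary_driven(cands, words):
    """Scan dictionary words and keep those whose characters are all candidates in place."""
    allowed = [{ch for ch, _ in pos} for pos in cands]
    return {
        w for w in words
        if len(w) == len(allowed) and all(c in a for c, a in zip(w, allowed))
    }


def hybrid(cands, words):
    """Assumed selection rule: use whichever strategy examines fewer items."""
    combos = 1
    for pos in cands:
        combos *= len(pos)
    if combos <= len(words):
        return candidate_driven(cands, words)
    return dictionary_driven(cands, words)


def matching_cost(word, cands):
    """Toy cost (an assumption, not the patent's formula): sum of 1 - confidence."""
    return sum(1.0 - dict(pos)[ch] for ch, pos in zip(word, cands))


matches = hybrid(candidates, dictionary)
best = min(matches, key=lambda w: matching_cost(w, candidates))
```

Both strategies return the same word set; they differ only in what they iterate over, which is why a hybrid can pick the cheaper one per input.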

    Document retrieval system and search server
    2.
    Granted invention patent
    Legal status: in force

    Publication number: US07277881B2

    Publication date: 2007-10-02

    Application number: US09942905

    Filing date: 2001-08-31

    IPC classification: G06F7/00 G06F17/30

    Abstract: Document databases registered in an associative search server are ordered appropriately. In an associative search server capable of performing an associative search by correlating a plurality of document databases, associative search recording table storing means stores the history of associative searches as an associative search recording table. Using this table, display order changing means appropriately sets the display order of the document databases presented by document database selecting means. Alternatively, registration fee calculating means appropriately calculates the registration fees of the document databases registered in the associative search server.

    Search system
    3.
    Granted invention patent
    Legal status: lapsed

    Publication number: US07065518B2

    Publication date: 2006-06-20

    Application number: US10197874

    Filing date: 2002-07-19

    IPC classification: G06F17/30

    Abstract: A system displays the results of a search provided by one of two different search systems and enables searching to continue across them. One search system includes a search takeover data production command used to output search takeover data for articles from the search. The other search system includes a search takeover data reading command used to read search takeover data. A document identifier correspondence table associates the identifiers specified in the search takeover data. When a user clicks a search system transfer instruction button in one search system, the search takeover data production command is executed to produce search takeover data, which is passed to the other search system. The latter search system treats the list of article identifiers passed by the search takeover data reading command as the search results, and thus operates continuously.

    Search system and search method
    4.
    Invention patent application
    Legal status: under examination (published)

    Publication number: US20060179041A1

    Publication date: 2006-08-10

    Application number: US11211729

    Filing date: 2005-08-26

    IPC classification: G06F17/30

    CPC classification: G06F16/367

    Abstract: A first kind of term and a second kind of term are designated when a user wishes to find a relationship between them. Using relations between terms that have been stored in advance, the manner in which the designated terms are correlated is displayed dynamically while nodes and edges are gradually added. In this manner, relations are easily found between concepts (terms) that seem unrelated, and an efficient search can also be performed.
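
One way to picture finding how two designated terms are correlated through stored relations is a breadth-first search over a term graph, where each expansion step corresponds to nodes and edges being added. The sketch below is only an illustration under that assumption: the `relations` table and the function name `find_relation_path` are hypothetical, and the abstract does not state which search strategy the application uses.

```python
from collections import deque

# Hypothetical pre-stored relations between terms (undirected adjacency lists).
relations = {
    "term_a": ["term_b", "term_c"],
    "term_b": ["term_a", "term_d"],
    "term_c": ["term_a"],
    "term_d": ["term_b", "term_e"],
    "term_e": ["term_d"],
}


def find_relation_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of terms, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        term = queue.popleft()
        if term == goal:
            # Walk back through the parents to rebuild the chain.
            path = []
            while term is not None:
                path.append(term)
                term = parent[term]
            return path[::-1]
        for neighbor in graph.get(term, []):
            if neighbor not in parent:
                parent[neighbor] = term
                queue.append(neighbor)
    return None


path = find_relation_path(relations, "term_a", "term_e")
```

Here `path` is the chain `term_a`, `term_b`, `term_d`, `term_e`; a display layer could draw each visited node and edge as the search frontier grows.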

    Data processing system using base sequence-relating data
    5.
    Invention patent application
    Legal status: under examination (published)

    Publication number: US20060041389A1

    Publication date: 2006-02-23

    Application number: US10535407

    Filing date: 2003-11-19

    IPC classification: G06F19/00

    CPC classification: G16B20/00 G16B30/00

    Abstract: A system is constructed for processing information that provides semantic information and/or information associated with the semantic information useful for each individual organism, through effective use of differences in nucleotide sequence-related information among individual organisms. The system performs steps of (a) receiving nucleotide sequence-related information concerning a predetermined individual and (b) identifying, from a memory holding a nucleotide sequence-related information group for each individual, each group including a plurality of sets in which positional information representing a position in a nucleotide sequence is related to the nucleotide sequence-related information corresponding to that positional information, a nucleotide sequence-related information group that includes nucleotide sequence-related information consistent with the received nucleotide sequence-related information.

    Word importance calculation method, document retrieving interface, word dictionary making method
    6.
    Granted invention patent
    Legal status: lapsed

    Publication number: US06850937B1

    Publication date: 2005-02-01

    Application number: US09642771

    Filing date: 2000-08-22

    IPC classification: G06F17/30

    Abstract: Known methods for selecting words (or word sequences), an important aspect of information retrieval, suffer from an inability to eliminate high-frequency common words and from often arbitrary setting of the threshold that separates important from unimportant words. These problems are solved by normalizing the difference between the word distribution in the subset of all documents containing the word to be extracted (or a subset of said document set) and the word distribution in the set of all documents, with the number of words in said subset as a parameter. The accuracy of information retrieval support is thereby enhanced.
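
The idea of comparing a word's "own" document subset against the whole collection can be sketched as follows. This is a rough illustration, not the patented formula: the difference is measured here with Kullback-Leibler divergence, and the normalization by `log(1 + n_words)` is a placeholder chosen only to show where the subset's word count enters as a parameter. The toy `documents` list and the function names are likewise hypothetical.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration.
documents = [
    "the gene encodes a protein",
    "the protein binds the gene",
    "the market fell on the news",
    "the news moved the market",
]


def word_distribution(docs):
    """Return (relative word frequencies, total word count) for a list of documents."""
    counts = Counter(w for d in docs for w in d.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}, total


def importance(word, docs):
    """Divergence of the word's document subset from the whole corpus, size-normalized."""
    subset = [d for d in docs if word in d.split()]
    if not subset:
        return 0.0
    p, n_words = word_distribution(subset)
    q, _ = word_distribution(docs)
    # KL divergence; q[w] > 0 for every w in p because subset is part of docs.
    divergence = sum(pw * math.log(pw / q[w]) for w, pw in p.items())
    # Placeholder normalization (assumption): scale by the subset's word count.
    return divergence / math.log(1 + n_words)


score_gene = importance("gene", documents)
score_the = importance("the", documents)
```

A common word such as "the" occurs in every document, so its subset is the whole corpus and its score is zero, while a topical word such as "gene" selects a subset whose distribution differs from the corpus and scores higher. No threshold has to be picked by hand to separate the two in this toy case.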

    Method and system for predicting functions of compound
    7.
    Invention patent application
    Legal status: under examination (published)

    Publication number: US20060106544A1

    Publication date: 2006-05-18

    Application number: US11072311

    Filing date: 2005-03-07

    IPC classification: G06F19/00

    CPC classification: G16C20/30 G16B5/00 G16C20/50

    Abstract: Features of a compound are predicted using information on interactions between substances. A database of interactions between compounds and genes/proteins is constructed on the basis of information collected from bibliographic databases, gene/protein databases, and disease databases, and an interaction network is prepared by mapping the collected information, thereby enabling prediction of the features of a compound.

    Method that aligns cDNA sequences to genome sequences
    8.
    Invention patent application
    Legal status: under examination (published)

    Publication number: US20050159898A1

    Publication date: 2005-07-21

    Application number: US11011954

    Filing date: 2004-12-15

    CPC classification: G16B30/00 G16B45/00

    Abstract: A method and apparatus for mapping cDNA sequences to genome sequences at high speed are disclosed. A genome sequence is divided into contiguous, non-overlapping partial sequences of K bases (K-mers). These are stored in a table together with the coordinates on the genome sequence where each appears. Using this table, K-mer correspondences are created from perfectly matching pairs of K-mers on the cDNA and K-mers on the genome sequence. Of all the K-mer correspondences, the sets that represent correct mapping rather than accidental coincidence are identified at high speed by a method based on a publicly known technique for extracting a longest increasing subsequence from a numerical sequence. The resulting K-mer correspondences are extended to base-level associations by sequence alignment, and splice sites are then corrected. To allow optimum selection of parameters, an interactive interface capable of real-time response is provided.
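
The first three steps in the abstract (K-mer table, exact K-mer hits, longest increasing subsequence) can be sketched in Python as below. This is a simplified illustration under stated assumptions: the toy sequences, the choice of K = 4, and all function names are hypothetical, and the alignment extension and splice-site correction steps are omitted.

```python
from bisect import bisect_left


def build_kmer_table(genome, k):
    """Index contiguous, non-overlapping K-mers of the genome by their start position."""
    table = {}
    for pos in range(0, len(genome) - k + 1, k):
        table.setdefault(genome[pos:pos + k], []).append(pos)
    return table


def kmer_hits(cdna, table, k):
    """Collect (cDNA position, genome position) pairs for exactly matching K-mers."""
    hits = []
    for i in range(len(cdna) - k + 1):
        for g in table.get(cdna[i:i + k], []):
            hits.append((i, g))
    return hits


def longest_increasing_chain(hits):
    """Keep the largest set of hits whose cDNA and genome positions both strictly increase."""
    # Sorting ties by descending genome position prevents picking two hits
    # that share the same cDNA position.
    hits = sorted(hits, key=lambda h: (h[0], -h[1]))
    tails, tails_idx = [], []          # smallest chain-ending genome position per length
    prev = [None] * len(hits)          # back-pointers for reconstruction
    for idx, (_, g) in enumerate(hits):
        n = bisect_left(tails, g)
        prev[idx] = tails_idx[n - 1] if n > 0 else None
        if n == len(tails):
            tails.append(g)
            tails_idx.append(idx)
        else:
            tails[n] = g
            tails_idx[n] = idx
    chain = []
    idx = tails_idx[-1] if tails_idx else None
    while idx is not None:
        chain.append(hits[idx])
        idx = prev[idx]
    return chain[::-1]


# Toy data: two "exons" separated by an "intron" of G bases.
genome = "ACGTTGCA" + "GGGGGGGG" + "CATGCCTA"
cdna = "ACGTTGCA" + "CATGCCTA"
table = build_kmer_table(genome, 4)
hits = kmer_hits(cdna, table, 4)
chain = longest_increasing_chain(hits + [(6, 40)])  # (6, 40) simulates a chance hit
```

The simulated chance hit `(6, 40)` breaks the increasing order of genome positions and is dropped from `chain`, which is the filtering role the abstract assigns to the longest-increasing-subsequence step.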

    Information search method
    9.
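    Invention patent application
    Legal status: under examination (published)

    Publication number: US20050004900A1

    Publication date: 2005-01-06

    Application number: US10841525

    Filing date: 2004-05-10

    IPC classification: G06F17/30

    CPC classification: G06F16/338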

    Abstract: New information is extracted efficiently and exhaustively to predict the function of genes or proteins. First, known-sequence data highly relevant to a search object sequence or to structure information is obtained using a sequence database. Then, documents relevant to the resulting known-sequence data are retrieved using a document database. Feature words common to a plurality of the retrieved documents are extracted and output.
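
The final step, extracting feature words shared by several retrieved documents, might look like the sketch below. It assumes the earlier steps (sequence search and document retrieval) have already produced a list of document texts; the sample texts, the stopword set, the `min_docs` threshold, and the function name are all hypothetical choices for illustration.

```python
from collections import Counter


def common_feature_words(documents, min_docs=2, stopwords=frozenset()):
    """Return words that occur in at least min_docs documents, most widespread first."""
    doc_freq = Counter()
    for text in documents:
        # Count each word once per document, ignoring stopwords.
        doc_freq.update(set(text.lower().split()) - stopwords)
    shared = [w for w, n in doc_freq.items() if n >= min_docs]
    return sorted(shared, key=lambda w: (-doc_freq[w], w))


# Hypothetical texts standing in for documents retrieved for the known sequences.
retrieved = [
    "kinase activity in yeast",
    "protein kinase binds ATP",
    "yeast kinase signaling",
]
features = common_feature_words(retrieved, min_docs=2, stopwords=frozenset({"in"}))
```

Counting document frequency rather than raw term frequency keeps a word that is repeated many times in a single document from being mistaken for a feature common to the set.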

    Information processing system using base sequence relevant information
    10.
    Invention patent application
    Legal status: under examination (published)

    Publication number: US20060149502A1

    Publication date: 2006-07-06

    Application number: US10543759

    Filing date: 2004-02-25

    IPC classification: G06F15/00

    CPC classification: G16B20/00 G16B30/00

    Abstract: This invention provides a method for processing information that allows the discovery of a correlation between predetermined individual-related information and nucleotide sequence-related information concerning an individual. The method comprises: step (a) of calculating a percentage for each piece of nucleotide sequence-related information using a first occurrence frequency and a second occurrence frequency, wherein the first occurrence frequency is calculated for each possible piece of nucleotide sequence-related information at a given position in a nucleotide sequence based on a predetermined population, and the second occurrence frequency is calculated for each possible piece of nucleotide sequence-related information at the same position based on the population gathered for predetermined individual-related information concerning an individual; and step (b) of associating the percentage calculated in step (a) with positional information representing that position and with the nucleotide sequence-related information, for each predetermined piece of individual-related information.
