DOCUMENT PROCESSING METHOD AND SYSTEM
    21.
    Patent application
    Status: Under examination (published)

    Publication No.: US20130060808A1

    Publication date: 2013-03-07

    Application No.: US13608309

    Application date: 2012-09-10

    IPC classification: G06F17/30

    CPC classification: G06F17/30716 G06F17/30011

    Abstract: A method and system are provided for expanding a document set used as a search data source in the field of business-related search. The present invention provides a method of expanding a seed document in a seed document set. The method includes identifying one or more entity words of the seed document; identifying, based on each entity word, one or more topic words related to that entity word in the seed document where the entity word is located; forming an entity word-topic word pair from each identified topic word and the entity word on the basis of which that topic word was identified; and obtaining one or more expanded documents by using the entity word and the topic word of each entity word-topic word pair together as key words for web searching. A system for executing the above method is also provided.
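
The pairing and query-building steps described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `build_pairs` and `build_queries`, the example words, and the quoted-phrase query format are all assumptions, and the entity and topic words are supplied by hand rather than identified automatically.

```python
def build_pairs(entity_words, topic_words_by_entity):
    """Form (entity word, topic word) pairs.

    Each topic word is paired with the entity word it was identified from.
    """
    pairs = []
    for entity in entity_words:
        for topic in topic_words_by_entity.get(entity, []):
            pairs.append((entity, topic))
    return pairs


def build_queries(pairs):
    """Use both words of a pair together as the key words of one web query.

    The quoted-phrase syntax is an assumed query format.
    """
    return [f'"{entity}" "{topic}"' for entity, topic in pairs]


# Hypothetical entity and topic words taken from one seed document.
entities = ["Acme Corp"]
topics = {"Acme Corp": ["merger", "quarterly earnings"]}

pairs = build_pairs(entities, topics)
queries = build_queries(pairs)
```

Each query string would then be submitted to a web search engine, and the returned pages would serve as the expanded documents.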

    Document processing method and system
    22.
    Granted patent
    Status: In force

    Publication No.: US08359327B2

    Publication date: 2013-01-22

    Application No.: US12786557

    Application date: 2010-05-25

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30716 G06F17/30011

    Abstract: A method and system are provided for expanding a document set used as a search data source in the field of business-related search. The present invention provides a method of expanding a seed document in a seed document set. The method includes identifying one or more entity words of the seed document; identifying, based on each entity word, one or more topic words related to that entity word in the seed document where the entity word is located; forming an entity word-topic word pair from each identified topic word and the entity word on the basis of which that topic word was identified; and obtaining one or more expanded documents through the web by using the entity word and the topic word of each entity word-topic word pair together as key words. A system for executing the above method is also provided.

    DOCUMENT PROCESSING METHOD AND SYSTEM
    23.
    Patent application
    Status: Under examination (published)

    Publication No.: US20130007025A1

    Publication date: 2013-01-03

    Application No.: US13608438

    Application date: 2012-09-10

    IPC classification: G06F17/30

    CPC classification: G06F17/30716 G06F17/30011

    Abstract: A method and system for filtering a candidate document in a candidate document set are provided. The method includes receiving one or more entity word-topic word pairs and identifying one or more entity words and topic words of the candidate document. The method also includes determining whether to add the candidate document to a filtered document set, using the entity words and topic words in the given entity word-topic word pairs together with the entity words and topic words identified in the candidate document. The method further includes adding the candidate document to the filtered document set in response to determining that it should be added.
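
The filtering decision can be sketched as below. The abstract does not state the decision rule, so this sketch assumes a simple one: a candidate is kept when at least one given pair has both its entity word and its topic word among the words identified in the candidate. The function names and the dictionary layout of a candidate document are hypothetical.

```python
def should_keep(given_pairs, doc_entities, doc_topics):
    """Assumed rule: keep the document if some given (entity, topic) pair
    appears in full among the document's identified words."""
    entities, topics = set(doc_entities), set(doc_topics)
    return any(e in entities and t in topics for e, t in given_pairs)


def filter_candidates(given_pairs, candidates):
    """Return the filtered document set for a list of candidate documents."""
    filtered = []
    for doc in candidates:
        if should_keep(given_pairs, doc["entities"], doc["topics"]):
            filtered.append(doc)
    return filtered


# Hypothetical input: one given pair and two candidate documents.
given = [("Acme Corp", "merger")]
candidates = [
    {"id": 1, "entities": ["Acme Corp"], "topics": ["merger", "stock"]},
    {"id": 2, "entities": ["Acme Corp"], "topics": ["recipes"]},
]
kept = filter_candidates(given, candidates)
```

A stricter or softer rule (for example, a similarity score with a threshold) could replace `should_keep` without changing the surrounding loop.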

    METHOD AND APPARATUS FOR IDENTIFIER RETRIEVAL
    24.
    Patent application
    Status: Under examination (published)

    Publication No.: US20120317125A1

    Publication date: 2012-12-13

    Application No.: US13590479

    Application date: 2012-08-21

    IPC classification: G06F17/30

    CPC classification: G06F16/367 G06F16/20

    Abstract: A method for identifier retrieval is provided. The method can include the steps of: extracting candidate identifiers from a data source according to a source identifier; obtaining a profile of the source identifier and profiles of the candidate identifiers from the data source; and selecting, from the candidate identifiers, a target identifier associated with the source identifier according to the profile of the source identifier and the profiles of the candidate identifiers. The method may find a target identifier associated with a source identifier efficiently, accurately and rapidly.
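
The selection step can be illustrated with a small sketch. The abstract does not say how profiles are represented or compared, so this example assumes that a profile is a set of descriptive terms and that the candidate whose profile has the highest Jaccard similarity to the source profile is selected. All names and data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two term collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_target(source_profile, candidate_profiles):
    """Pick the candidate identifier whose profile best matches the source.

    candidate_profiles maps identifier -> profile terms.
    Returns None when there are no candidates.
    """
    if not candidate_profiles:
        return None
    return max(
        candidate_profiles,
        key=lambda ident: jaccard(source_profile, candidate_profiles[ident]),
    )


# Hypothetical profiles obtained from a data source.
source = {"bank", "finance", "new york"}
candidate_profiles = {
    "ACME Bank": {"bank", "finance", "loans"},
    "ACME Foods": {"grocery", "retail"},
}
target = select_target(source, candidate_profiles)
```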

    Index and method for extending and querying index
    25.
    Granted patent
    Status: Lapsed

    Publication No.: US07689574B2

    Publication date: 2010-03-30

    Application No.: US11562495

    Application date: 2006-11-22

    IPC classification: G06F17/00 G06F15/16 G06F3/00

    CPC classification: G06F17/30622

    Abstract: A method, system and program storage device are provided for extending an inverted index that comprises first and second inverted index subfiles, in order to increase the speed of establishing and updating inverted index files. The method includes performing ordered keyword indexing operations that generate an inverted index from data sources: the frequency of occurrence of keywords in each data source is calculated, and each keyword, the data sources, and the frequency of occurrence of each keyword in the corresponding data sources are written to the inverted index. If the number of data sources involved in the indexing operations reaches a first threshold, the contents of the inverted index are written as a smallest grid into the first inverted index subfile. If the number of smallest grids in the first inverted index subfile reaches a second threshold, the smallest grids are merged into a merged grid, which is written into the second inverted index subfile. If the number of merged grids in the second inverted index subfile reaches a third threshold, the merged grids are further merged into a larger merged grid, which is written back into the first inverted index subfile.

    Index and Method for Extending and Querying Index
    26.
    Patent application
    Status: Lapsed

    Publication No.: US20070124277A1

    Publication date: 2007-05-31

    Application No.: US11562495

    Application date: 2006-11-22

    IPC classification: G06F17/30

    CPC classification: G06F17/30622

    Abstract: Disclosed are an index structure and a method of extending an index, the method comprising: (a) performing indexing operations in memory that generate an inverted index for newly inserted data sources; (b) if the number of data sources involved in the indexing operations reaches a first threshold value k1, sequentially writing the generated inverted index into the first index subfile; (c) if the number of smallest grids, or index groups, in the first index subfile reaches a second threshold value k2, merging the k2 grids into a larger grid and sequentially writing it into the second index subfile; and (d) if the number of merged grids in the second index subfile reaches a third threshold value k3, merging the k3 grids into a larger grid and sequentially writing it into the first index subfile. Because index updating mostly occurs in small grids, the number of I/O operations on large grids is reduced, and the speed of index building and updating is therefore increased. In addition, the threshold values k1, k2 and k3 may be adjusted automatically based on the usage of system resources.
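
Steps (a) to (d) can be simulated in memory as follows. This is a rough sketch under stated assumptions, not the disclosed implementation: Python lists stand in for the two index subfiles, a grid is a plain dictionary of postings, the class name `TieredIndex` and its methods are hypothetical, and the automatic adjustment of k1, k2 and k3 is not modeled.

```python
from collections import Counter


class TieredIndex:
    """In-memory stand-in for the two-subfile grid scheme (assumed layout)."""

    def __init__(self, k1=2, k2=2, k3=2):
        self.k1, self.k2, self.k3 = k1, k2, k3
        self.memory = {}          # keyword -> {doc_id: frequency}
        self.docs_in_memory = 0
        self.first = []           # (level, grid); level 0 marks a smallest grid
        self.second = []          # grids merged from k2 smallest grids

    def add_document(self, doc_id, text):
        # (a) index the new data source in memory, with keyword frequencies
        for word, freq in Counter(text.lower().split()).items():
            self.memory.setdefault(word, {})[doc_id] = freq
        self.docs_in_memory += 1
        if self.docs_in_memory >= self.k1:
            self._flush()

    @staticmethod
    def _merge(grids):
        merged = {}
        for grid in grids:
            for word, postings in grid.items():
                merged.setdefault(word, {}).update(postings)
        return merged

    def _flush(self):
        # (b) write the in-memory index as a smallest grid to the first subfile
        self.first.append((0, self.memory))
        self.memory, self.docs_in_memory = {}, 0
        smallest = [grid for level, grid in self.first if level == 0]
        if len(smallest) >= self.k2:
            # (c) merge the smallest grids and write them to the second subfile
            self.first = [(lv, g) for lv, g in self.first if lv != 0]
            self.second.append(self._merge(smallest))
        if len(self.second) >= self.k3:
            # (d) merge the second subfile's grids back into the first subfile
            self.first.append((1, self._merge(self.second)))
            self.second = []

    def lookup(self, word):
        """Collect postings for a keyword from memory and both subfiles."""
        result = dict(self.memory.get(word, {}))
        for _, grid in self.first:
            result.update(grid.get(word, {}))
        for grid in self.second:
            result.update(grid.get(word, {}))
        return result


index = TieredIndex(k1=2, k2=2, k3=2)
for i in range(8):
    index.add_document(i, f"index term{i}")
postings = index.lookup("index")
```

With all three thresholds set to 2, eight documents trigger four flushes, two merges into the second subfile, and one merge back into the first subfile, so most writes touch only small grids.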

    Automatically adjusting a webpage
    27.
    Granted patent
    Status: Lapsed

    Publication No.: US08489985B2

    Publication date: 2013-07-16

    Application No.: US13170778

    Application date: 2011-06-28

    IPC classification: G06F17/00

    CPC classification: G06F17/30905

    Abstract: A solution is provided for automatically adjusting a webpage. According to the method of the present invention, the user's historical browsing behaviors are learned automatically and used to predict which block in the webpage is more likely to interest the user and therefore to be browsed; the display of that block is then adjusted accordingly. Thus, with the present invention, limited screen resources can be used more efficiently to display the content that interests a user when the user browses a webpage. Also provided are a system for automatically adjusting a webpage and a computer readable article of manufacture tangibly embodying non-transitory computer readable instructions which, when executed, cause a computer to carry out the steps of a method for automatically adjusting a webpage.

    AUTOMATICALLY ACQUIRING FEATURE SEGMENTS IN A MUSIC FILE
    28.
    Patent application
    Status: In force

    Publication No.: US20120167748A1

    Publication date: 2012-07-05

    Application No.: US13278406

    Application date: 2011-10-21

    IPC classification: G10H7/00

    Abstract: A method of automatically acquiring a feature segment in a music file includes receiving, with a processing device, a music file; converting the music file into a character string; evaluating at least one character string segment in the character string based on one or more music features; and determining, based on an evaluation result, at least one music segment corresponding to at least one character string segment in the character string as a feature segment.
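
The conversion and evaluation steps can be sketched as below. The abstract does not specify the string encoding or the music features used, so this example makes two assumptions: the music file has already been decoded into MIDI note numbers, each mapped to one character per pitch class, and the only feature evaluated is how often a fixed-length segment repeats. The names `notes_to_string` and `best_segment` are hypothetical.

```python
# One character per semitone; lowercase letters mark the sharp of the
# preceding natural note (assumed encoding).
PITCH_CLASSES = "CcDdEFfGgAaB"


def notes_to_string(midi_notes):
    """Convert a sequence of MIDI note numbers into a pitch-class string."""
    return "".join(PITCH_CLASSES[n % 12] for n in midi_notes)


def count_overlapping(text, segment):
    """Count occurrences of segment in text, allowing overlaps."""
    last_start = len(text) - len(segment)
    return sum(1 for i in range(last_start + 1) if text.startswith(segment, i))


def best_segment(text, length):
    """Return the most repeated segment of the given length and its count.

    Ties go to the earliest segment. Repetition is the assumed feature.
    """
    best, best_count = None, 0
    for i in range(len(text) - length + 1):
        segment = text[i:i + length]
        count = count_overlapping(text, segment)
        if count > best_count:
            best, best_count = segment, count
    return best, best_count


# Hypothetical melody: C D E, C D E, G, C D E.
melody = notes_to_string([60, 62, 64, 60, 62, 64, 67, 60, 62, 64])
segment, repeats = best_segment(melody, 3)
```

The positions at which the winning segment occurs in the string map back to note offsets, which is how a character string segment would identify the corresponding music segment.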

    METHOD AND SYSTEM FOR SEARCHING MULTILINGUAL DOCUMENTS
    29.
    Patent application
    Status: Lapsed

    Publication No.: US20110106805A1

    Publication date: 2011-05-05

    Application No.: US12914012

    Application date: 2010-10-28

    IPC classification: G06F17/30

    CPC classification: G06F17/30675 G06F17/30017

    Abstract: A method, system and computer program product are provided for searching multilingual documents. The method includes the steps of: receiving a search request based on at least one language; searching for a first relevant document using the search request, where the first relevant document (1) is written in a first language and (2) has a first image; finding a second relevant document that is written in a second language and has a second image similar to the first image; and searching for a second relevant document using the search request.

    Method and system for achieving emotional text to speech utilizing emotion tags assigned to text data
    30.
    Granted patent
    Status: In force

    Publication No.: US09117446B2

    Publication date: 2015-08-25

    Application No.: US13221953

    Application date: 2011-08-31

    IPC classification: G10L13/08 G10L13/10

    CPC classification: G10L13/10 G10L13/02 G10L13/08

    Abstract: A method and system are provided for achieving emotional text to speech (TTS). The method includes: receiving text data; generating an emotion tag for the text data by rhythm piece; and performing TTS on the text data according to the emotion tag, where the emotion tag is expressed as a set of emotion vectors and each emotion vector includes a plurality of emotion scores given based on a plurality of emotion categories. A system for the same includes: a text data receiving module; an emotion tag generating module; and a TTS module for achieving TTS, wherein the emotion tag is expressed as a set of emotion vectors, and wherein each emotion vector includes a plurality of emotion scores given based on a plurality of emotion categories.
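
The emotion tag structure can be illustrated with a short sketch. The abstract does not say how scores are produced or which categories are used, so this example assumes that each word is one rhythm piece, that the categories are the four listed in `CATEGORIES`, and that scores come from a tiny hand-written lexicon. All names and values are hypothetical, and speech synthesis itself is not shown.

```python
# Assumed emotion categories; each vector holds one score per category.
CATEGORIES = ("neutral", "happy", "sad", "angry")

# Hypothetical lexicon of per-word emotion scores.
LEXICON = {
    "wonderful": {"happy": 0.9, "neutral": 0.1},
    "terrible": {"sad": 0.5, "angry": 0.4, "neutral": 0.1},
}


def emotion_vector(word):
    """Return one score per category; unknown words are fully neutral."""
    scores = LEXICON.get(word.lower(), {"neutral": 1.0})
    return [scores.get(category, 0.0) for category in CATEGORIES]


def tag_text(text):
    """Generate the emotion tag: one emotion vector per rhythm piece (word)."""
    return [(word, emotion_vector(word)) for word in text.split()]


def dominant(vector):
    """Name of the category with the highest score in a vector."""
    best_index = max(range(len(vector)), key=vector.__getitem__)
    return CATEGORIES[best_index]


tags = tag_text("What a wonderful day")
mood = dominant(tags[2][1])
```

A TTS module could then use each rhythm piece's vector, or just its dominant category, to choose prosody parameters when synthesizing that piece.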
