AUTOMATIC WORD-CLOUD GENERATION
    31.
    Patent application
    AUTOMATIC WORD-CLOUD GENERATION (status: in force)

    Publication No.: US20120303637A1

    Publication date: 2012-11-29

    Application No.: US13113110

    Filing date: 2011-05-23

    IPC classification: G06F17/30 G06F7/00

    Abstract: A method, system, and computer program product for automatic generation of a word cloud for a content item are provided. The method includes: extracting terms from a content item using statistical selection criteria; weighting a term by the probability that the term is used as a tag; and generating a visual representation of the terms in which terms are emphasized according to the weighting. Weighting a term by the probability that it is used as a tag may include determining the relative frequency of the term in a folksonomy of tag terms for a domain.

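    The weighting step in the abstract above can be sketched in Python as follows. This is a minimal illustration, not the patented method: the function names, the stop-word list, the use of raw term frequency as the "statistical selection criteria", and the upper-case display rule are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify the selection criteria.
STOP_WORDS = {"the", "a", "of", "and", "for", "is", "in", "to"}

def tag_probability(term, folksonomy_counts):
    """Relative frequency of `term` among all tag assignments in the folksonomy."""
    total = sum(folksonomy_counts.values())
    return folksonomy_counts.get(term, 0) / total if total else 0.0

def word_cloud_weights(text, folksonomy_tags, top_n=5):
    """Extract terms by frequency, then weight each by its tag probability."""
    folksonomy_counts = Counter(folksonomy_tags)
    terms = [w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS]
    term_freq = Counter(terms)
    weights = {t: f * tag_probability(t, folksonomy_counts) for t, f in term_freq.items()}
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

def render(ranked):
    """Text-only 'cloud': the heaviest term is shown in upper case (assumed display rule)."""
    if not ranked:
        return ""
    top = ranked[0][1]
    return " ".join(t.upper() if w == top and w > 0 else t for t, w in ranked)
```

    Terms that are frequent in the text but never used as tags in the domain receive a weight of zero here, which is the effect the tag-probability weighting is meant to have.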

    Content analysis simulator for improving site findability in information retrieval systems
    32.
    Granted patent
    Content analysis simulator for improving site findability in information retrieval systems (status: in force)

    Publication No.: US08285702B2

    Publication date: 2012-10-09

    Application No.: US12188013

    Filing date: 2008-08-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30867 G06F17/30722

    Abstract: A system and method including a simulator operating in conjunction with a search engine, for improving document and site findability. Users input their content (pages or sites), and the simulator analyzes the site in terms of structure and content. It then gives the user a ranked list of suggestions for improving the site's findability. The user can apply some or all of these suggestions, or any other changes, by virtually modifying the site, and then immediately receives feedback on both how the pages look and the degree of findability improvement. This interactive process allows users to simulate modifications to their site structure and content in order to improve findability. When the user completes the modifications and is satisfied with the site's new findability, the user can replace the current site in the repository with the modified one.

    Method and System for Maintaining Profiles of Information Channels
    33.
    Patent application
    Method and System for Maintaining Profiles of Information Channels (status: lapsed)

    Publication No.: US20110219056A1

    Publication date: 2011-09-08

    Application No.: US13105924

    Filing date: 2011-05-12

    IPC classification: G06F15/16 G06F15/18

    CPC classification: G06F17/30867 H04L69/14

    Abstract: A method and system are provided for maintaining profiles of information channels available on the Web, wherein the information channels are accessed via pull-only protocols. The method includes monitoring one or more channels by a channel pull action at a monitoring rate, wherein the monitoring rate is determined for the one or more channels based on the number of update events in a previous time period. The method may optionally include filtering the update events in the time period by a novelty measure, wherein the filtering disregards events that do not include significant novel information. The monitoring rate is adapted using reinforcement learning, applying iterative learning rules over time.

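    One way to picture the adaptive monitoring rate is the Python sketch below. The abstract does not give the learning rule or the novelty measure, so both are assumptions here: the rate moves a fraction `alpha` of the way toward the number of novel events seen in the last period, and an item counts as novel when it contributes at least `min_new_words` unseen words. All names and default values are hypothetical.

```python
def update_rate(rate, novel_events, alpha=0.5, min_rate=0.1, max_rate=10.0):
    """One iterative step: move the pulls-per-period rate toward the observed event count."""
    new_rate = rate + alpha * (novel_events - rate)
    # Clamp so a quiet channel is still polled and a busy one is not hammered.
    return max(min_rate, min(max_rate, new_rate))

def is_novel(item, seen, min_new_words=2):
    """Assumed novelty measure: the item adds enough words not seen before."""
    words = set(item.lower().split())
    return len(words - seen) >= min_new_words

def monitor(periods, rate=1.0):
    """Simulate several periods; `periods` is a list of lists of item strings pulled per period."""
    seen, history = set(), []
    for items in periods:
        novel = 0
        for item in items:
            if is_novel(item, seen):
                novel += 1
            seen |= set(item.lower().split())
        rate = update_rate(rate, novel)
        history.append(rate)
    return history
```

    In this sketch a repeated item does not raise the rate, because the novelty filter discards it before the update rule is applied.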

    Merging of results in distributed information retrieval
    34.
    Granted patent
    Merging of results in distributed information retrieval (status: in force)

    Publication No.: US07984039B2

    Publication date: 2011-07-19

    Application No.: US11183086

    Filing date: 2005-07-14

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06F17/30864

    Abstract: A method and system are provided for merging results in distributed information retrieval. A search manager is in communication with a plurality of components, wherein a component is a search engine working on a document collection and returning results in the form of a list of documents in response to a search query. The search manager submits a query to the plurality of components; receives results from each component in the form of a list of documents; estimates the success of a component in handling the query to generate a merit score for a component per query; applies the merit score to the results for the component; and merges results from the plurality of components by ranking in order of the applied merit score.

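    The merge step can be sketched in Python as below. The abstract does not say how the merit score is estimated or how it is applied, so this example takes the per-query merit scores as given input and applies them by multiplication, which is an assumption. The function name and data shapes are hypothetical.

```python
def merge_results(component_results, merit_scores):
    """Merge ranked lists from several search components.

    component_results: {component: [(doc_id, score), ...]}
    merit_scores:      {component: merit for this query}
    Returns a list of (doc_id, weighted_score, component), best first.
    """
    merged = []
    for component, results in component_results.items():
        merit = merit_scores.get(component, 0.0)
        for doc_id, score in results:
            # Assumed rule: weight each document score by its component's merit.
            merged.append((doc_id, score * merit, component))
    # Rank by descending weighted score; break ties by document id.
    merged.sort(key=lambda r: (-r[1], r[0]))
    return merged
```

    A component with a high merit score for the query can lift its documents above higher raw scores from a component that handled the query poorly.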

    Personalized Web Feed Views
    35.
    Patent application
    Personalized Web Feed Views (status: under examination, published)

    Publication No.: US20100161547A1

    Publication date: 2010-06-24

    Application No.: US12342090

    Filing date: 2008-12-23

    CPC classification: G06F3/04847 G06F16/9535

    Abstract: A system for generating personalized Web feed views in accordance with pre-defined profile parameters is presented. The system includes a user definition interface for receiving at least one user parameter and sending the parameter to a content provider, and a feed view personalization unit operable to receive the user parameter and customize feed content in accordance with the at least one user parameter for display to the user.

    Information Retrieval with Unified Search Using Multiple Facets
    36.
    Patent application
    Information Retrieval with Unified Search Using Multiple Facets (status: in force)

    Publication No.: US20090327271A1

    Publication date: 2009-12-31

    Application No.: US12164139

    Filing date: 2008-06-30

    IPC classification: G06F7/06 G06F17/30

    CPC classification: G06F17/30675

    Abstract: Information retrieval with unified search between heterogeneous objects is described. The method includes: indexing a first object as a document in a search index; referencing a second object related to the first object in a facet of the document; and storing a relationship strength between the first and second objects in the facet of the document in the search index. Multiple heterogeneous objects can be related to the first object and referenced in multiple facets of the document, each with its relationship strength to the first object. An indirect object can be scored by its indirect relation to a query object by aggregating the relationship strengths between the indirect object and the retrieved objects, each multiplied by the retrieved object's direct score of relationship strength to the query object.

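    The indirect scoring described in the abstract can be sketched in Python as below. Summation is assumed as the aggregation, since the abstract says only "aggregating". The function name and the dictionary layout standing in for document facets are hypothetical.

```python
def indirect_scores(direct_scores, facets):
    """Score objects that are related to the retrieved objects.

    direct_scores: {retrieved_obj: direct score against the query object}
    facets:        {retrieved_obj: {related_obj: relationship_strength}}
    Returns {related_obj: sum of strength * direct score}.
    """
    totals = {}
    for obj, direct in direct_scores.items():
        for related, strength in facets.get(obj, {}).items():
            # Each retrieved object contributes strength x its own direct score.
            totals[related] = totals.get(related, 0.0) + strength * direct
    return totals
```

    For example, a person referenced in the facets of two retrieved documents accumulates a contribution from each, so people strongly tied to highly ranked documents score highest.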

    Analyzing the Ability to Find Textual Content
    37.
    Patent application
    Analyzing the Ability to Find Textual Content (status: in force)

    Publication No.: US20080033971A1

    Publication date: 2008-02-07

    Application No.: US11461464

    Filing date: 2006-08-01

    IPC classification: G06F7/00

    CPC classification: G06F17/30675

    Abstract: A method and system for analyzing a document set (202, 420) are provided. The method includes determining a set of terms (312) from the terms of the document set that minimizes a distance measurement (405) from the given set of documents (420). The method includes using a greedy algorithm to build the set of terms incrementally, at each stage finding a single word that is closest to the document set (202, 420). The set of terms is evaluated to assess the ability to find the document set (202, 420). The set of terms is compared with expected terms to evaluate the ability to find the document set (202, 420). A measure of the ability to find a document set (202, 420) is provided by computing a distance measure (403) between a document set and an entire collection.

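    The greedy construction of the term set can be sketched in Python as below. The abstract does not name the distance measure, so Jensen-Shannon divergence between the document set's term distribution and a uniform distribution over the chosen terms is used here purely as an assumption. All function names are hypothetical.

```python
import math
from collections import Counter

def distribution(docs):
    """Term distribution of a document set (whitespace tokens, lower-cased)."""
    counts = Counter(w for d in docs for w in d.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence between two term distributions given as dicts."""
    def kl(a, m):
        return sum(a[w] * math.log(a[w] / m[w]) for w in a if a[w] > 0)
    vocab = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def greedy_terms(docs, k):
    """Build a term set one word at a time, each time adding the word that
    brings the term set closest to the document set."""
    p = distribution(docs)
    chosen = []
    for _ in range(min(k, len(p))):
        best = min(
            (w for w in sorted(p) if w not in chosen),
            key=lambda w: js_divergence(
                p, {t: 1.0 / (len(chosen) + 1) for t in chosen + [w]}
            ),
        )
        chosen.append(best)
    return chosen
```

    The resulting list can then be compared with the terms a site owner expects users to search for, which is the evaluation step the abstract describes.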

    Merging of results in distributed information retrieval
    38.
    Patent application
    Merging of results in distributed information retrieval (status: in force)

    Publication No.: US20070016574A1

    Publication date: 2007-01-18

    Application No.: US11183086

    Filing date: 2005-07-14

    IPC classification: G06F17/30

    CPC classification: G06F17/30864

    Abstract: A method and system are provided for merging results in distributed information retrieval. A search manager (104) is in communication with a plurality of components, wherein a component is a search engine (106-108) working on a document collection and returning results in the form of a list of documents in response to a search query. The search manager (104) submits a query (202) to the plurality of components; receives results (213) from each component in the form of a list of documents; estimates (208) the success of a component in handling the query to generate a merit score (210) for a component per query; applies (220) the merit score (210) to the results for the component; and merges (222) results from the plurality of components by ranking in order of the applied merit score.

    INDEXING AND SEARCHING ENTITY-RELATIONSHIP DATA
    39.
    Patent application
    INDEXING AND SEARCHING ENTITY-RELATIONSHIP DATA (status: in force)

    Publication No.: US20130238631A1

    Publication date: 2013-09-12

    Application No.: US13417248

    Filing date: 2012-03-11

    IPC classification: G06F17/30

    CPC classification: G06F17/30604

    Abstract: A method, system, and computer program product for indexing and searching entity-relationship data are provided. The method includes: defining a logical document model for entity-relationship data including: representing an entity as a document containing the entity's searchable content and metadata; dually representing the entity as a document and as a category; and representing each relationship instance for the entity as a category set that contains categories of all participating entities in the relationship. The method also includes: translating entity-relationship data into the logical document model; and indexing the entity-relationship data of the populated logical document model as an inverted index. The method may include searching indexed entity-relationship data using a faceted search, wherein the categories are all categories required for supporting faceted navigation.

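    The logical document model can be illustrated with the Python sketch below. It is a toy in-memory version under stated assumptions: entities are plain strings of searchable text, the `cat:` prefix for an entity's category form is invented for the example, and facet counts are computed by a simple scan rather than by a real search engine.

```python
from collections import defaultdict

def build_index(entities, relationships):
    """entities:      {entity_id: searchable text}
    relationships: list of tuples of participating entity ids
    Returns (inverted index: term -> entity ids, category sets per entity)."""
    inverted = defaultdict(set)
    for entity_id, text in entities.items():
        for term in text.lower().split():
            inverted[term].add(entity_id)
    category_sets = defaultdict(list)
    for participants in relationships:
        # Each entity doubles as a category; one relationship instance
        # becomes one category set holding all participants' categories.
        cat_set = frozenset(f"cat:{e}" for e in participants)
        for e in participants:
            category_sets[e].append(cat_set)
    return inverted, category_sets

def faceted_search(term, inverted, category_sets):
    """Return matching entities plus counts of the categories related to them."""
    hits = inverted.get(term.lower(), set())
    facet_counts = defaultdict(int)
    for e in hits:
        for cat_set in category_sets.get(e, []):
            for cat in cat_set:
                if cat != f"cat:{e}":
                    facet_counts[cat] += 1
    return sorted(hits), dict(facet_counts)
```

    Searching for a term returns the matching entities as documents, while the facet counts expose the entities they are related to as categories for navigation.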

    FACETED SEARCH WITH RELATIONSHIPS BETWEEN CATEGORIES
    40.
    Patent application
    FACETED SEARCH WITH RELATIONSHIPS BETWEEN CATEGORIES (status: in force)

    Publication No.: US20120310940A1

    Publication date: 2012-12-06

    Application No.: US13118477

    Filing date: 2011-05-30

    IPC classification: G06F17/30

    CPC classification: G06F17/30722

    Abstract: A method, system, and computer program product for faceted search with relationships between categories are provided. The method includes: having a document set of multiple documents, each document having associated categories to which it belongs; grouping multiple categories associated with a document into a category set based on a relationship between the multiple categories; associating the category set with the document; and indexing the category set for retrieval of documents from categories sharing a category set. Indexing the category set includes: having an index entry of a textual representation of a category, wherein the index entry includes a single occurrence for each document to which the category is attached; and adding to a document occurrence a payload that serializes the identifiers of the category sets, associated with the document, to which the category belongs. Indexing the category set further includes: adding an index entry for category set data, wherein the index entry includes a single occurrence for each document, wherein a document occurrence includes a payload of a serialization of an identifier of category sets associated with the document, and an identifier of the categories belonging to the category sets.

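    The category-set payload idea can be sketched in Python as below. This is a simplified in-memory model, not the patent's index layout: category-set identifiers are assumed to be per-document positions, the payload is assumed to be a comma-joined string, and the function names are hypothetical.

```python
from collections import defaultdict

def index_category_sets(docs):
    """docs: {doc_id: [category_set, ...]}, each category_set a tuple of categories.
    Returns postings: category -> {doc_id: payload}, where the payload is the
    comma-joined ids of the category sets the category belongs to in that document."""
    postings = defaultdict(dict)
    for doc_id, cat_sets in docs.items():
        for set_id, cat_set in enumerate(cat_sets):
            for cat in cat_set:
                # One occurrence per (category, document); set ids accumulate in it.
                postings[cat].setdefault(doc_id, []).append(set_id)
    return {
        cat: {d: ",".join(map(str, ids)) for d, ids in by_doc.items()}
        for cat, by_doc in postings.items()
    }

def docs_sharing_set(postings, cat_a, cat_b):
    """Documents in which cat_a and cat_b occur in the same category set."""
    result = []
    for doc_id, payload_a in postings.get(cat_a, {}).items():
        payload_b = postings.get(cat_b, {}).get(doc_id)
        if payload_b and set(payload_a.split(",")) & set(payload_b.split(",")):
            result.append(doc_id)
    return sorted(result)
```

    The payload lets a query tell a document where two categories are merely both present from one where they are related through a shared category set.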