Ranking search results using click-based data
    1.
    Type: Granted patent
    Status: In force

    Publication No.: US08370337B2

    Publication date: 2013-02-05

    Application No.: US12762929

    Filing date: 2010-04-19

    IPC classification: G06F7/00 G06F17/30

    CPC classification: G06N99/005 G06F17/30882

    Abstract: Methods and computer-storage media having computer-executable instructions embodied thereon that facilitate generating a machine-learned model for ranking search results using click-based data are provided. Data is referenced from user queries, which may include search results generated by general search engines and vertical search engines. A training set is generated from the search results and click-based judgments are associated with the search results in the training set. Based on click-based judgments, identifiable features are determined from the search results in a training set. Based on determining identifiable features in a training set, a rule set is generated for ranking subsequent search results.
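
    The pipeline in this abstract (click log, click-based judgments, features, learned ranking rules) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the click log, the feature names `title_match` and `is_vertical`, and the use of logistic regression as the learner. The patent does not specify any of these.

```python
import math

# Hypothetical click log: each impression lists results as (features, clicked?).
CLICK_LOG = [
    {"query": "pizza near me", "results": [
        ({"title_match": 1.0, "is_vertical": 1.0}, True),
        ({"title_match": 1.0, "is_vertical": 0.0}, False),
        ({"title_match": 0.0, "is_vertical": 0.0}, False),
    ]},
    {"query": "python docs", "results": [
        ({"title_match": 0.0, "is_vertical": 1.0}, False),
        ({"title_match": 1.0, "is_vertical": 0.0}, True),
    ]},
]

FEATURES = ["title_match", "is_vertical"]  # assumed feature names


def build_training_set(log):
    """Attach a click-based judgment (1.0 = clicked, 0.0 = not clicked) to each result."""
    return [(feats, 1.0 if clicked else 0.0)
            for impression in log
            for feats, clicked in impression["results"]]


def score(weights, feats):
    """Linear score: weighted sum of the feature values."""
    return sum(weights[name] * feats.get(name, 0.0) for name in FEATURES)


def train(training_set, epochs=300, lr=0.1):
    """Fit logistic-regression weights with per-example gradient steps."""
    weights = {name: 0.0 for name in FEATURES}
    for _ in range(epochs):
        for feats, label in training_set:
            pred = 1.0 / (1.0 + math.exp(-score(weights, feats)))
            for name in FEATURES:
                weights[name] += lr * (label - pred) * feats.get(name, 0.0)
    return weights


def rank(weights, candidates):
    """Order later search results by descending learned score."""
    return sorted(candidates, key=lambda feats: score(weights, feats), reverse=True)


training_set = build_training_set(CLICK_LOG)
weights = train(training_set)
ranked = rank(weights, [
    {"title_match": 0.0, "is_vertical": 0.0},
    {"title_match": 1.0, "is_vertical": 0.0},
])
```

    The learned `weights` dictionary plays the role of the "rule set" here; a production system would likely use richer features and a dedicated learning-to-rank model.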

    System, method, and service for using a focused random walk to produce samples on a topic from a collection of hyper-linked pages
    4.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US07640488B2

    Publication date: 2009-12-29

    Application No.: US11004412

    Filing date: 2004-12-04

    IPC classification: G06F17/00 G06F17/20

    CPC classification: G06F17/30864

    Abstract: A focused random walk system produces samples of on-topic pages from a collection of hyper-linked pages such as Web pages. The focused random walk system utilizes a focused random walk to produce a focused sample, which is a random sample of Web pages focused on a topic. The focused random walk system uniformly samples pages iteratively, where each iteration follows a random link from a union of the in-links and out-links of a page. The system then classifies this randomly selected link to determine whether the page is on-topic. The random walk sampling process could comprise a hard-focus method that selects only on-topic pages at each step of the focused random walk, or a soft-focus method that allows limited divergence to off-topic pages.
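
    A rough sketch of the hard-focus and soft-focus variants described above follows. The toy link graph, the `ON_TOPIC` set standing in for a topic classifier, and the `max_off_topic` parameter are all hypothetical. The sketch also does not correct for the degree bias of a plain random walk, so it should not be read as the patent's uniform-sampling procedure.

```python
import random

# Hypothetical toy link graph: page -> list of out-links.
OUT_LINKS = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["d"],
    "d": ["a"],
    "x": ["a"],
}
ON_TOPIC = {"a", "b", "d"}  # stand-in for a real topic classifier


def neighbors(page):
    """Union of a page's out-links and in-links."""
    out_links = set(OUT_LINKS.get(page, []))
    in_links = {src for src, dsts in OUT_LINKS.items() if page in dsts}
    return sorted(out_links | in_links)


def is_on_topic(page):
    return page in ON_TOPIC


def focused_walk(start, steps, max_off_topic=0, seed=0):
    """Walk the graph and collect on-topic pages.

    max_off_topic=0 gives hard focus (never move to an off-topic page);
    a positive value gives soft focus (tolerate that many consecutive
    off-topic moves before rejecting further ones).
    """
    rng = random.Random(seed)
    current, off_run, samples = start, 0, []
    for _ in range(steps):
        candidate = rng.choice(neighbors(current))
        if is_on_topic(candidate):
            current, off_run = candidate, 0
            samples.append(candidate)
        elif off_run < max_off_topic:
            current, off_run = candidate, off_run + 1
        # Otherwise the move is rejected and the walk stays where it is.
    return samples


hard_sample = focused_walk("a", steps=50)
soft_sample = focused_walk("a", steps=50, max_off_topic=1)
```

    In both modes only on-topic pages enter the sample; soft focus merely lets the walk pass through off-topic pages to reach on-topic regions it could not otherwise enter.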

    SYSTEM AND METHOD FOR AUTOMATICALLY RANKING LINES OF TEXT
    5.
    Type: Patent application
    Status: In force

    Publication No.: US20090292683A1

    Publication date: 2009-11-26

    Application No.: US12124086

    Filing date: 2008-05-20

    IPC classification: G06F17/30

    CPC classification: G06F17/30675

    Abstract: Disclosed are apparatus and methods for ranking lines of text. In one embodiment, an intent of a query is ascertained. A relevance of each one of a plurality of lines of text of a document is determined based upon the intent of the query, content of the query, and content of each of the plurality of lines of text. The plurality of lines of text may then be ranked according to the determined relevance of each of the plurality of lines of text.
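
    As an illustration only, the three inputs named in the abstract (query intent, query content, line content) can be combined into a simple score. The intent lexicon `INTENT_CUES`, the token-overlap scoring, and the weight of 2 on intent matches are assumptions for this sketch, not details taken from the application.

```python
import re

# Hypothetical intent lexicon: intent name -> cue words.
INTENT_CUES = {
    "contact": {"phone", "call", "email", "address"},
    "price": {"price", "cost", "fee"},
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def detect_intent(query):
    """Pick the intent whose cue words overlap the query most, or None."""
    tokens = tokenize(query)
    best = max(INTENT_CUES, key=lambda intent: len(tokens & INTENT_CUES[intent]))
    return best if tokens & INTENT_CUES[best] else None


def relevance(line, query, intent):
    """Score a line from its overlap with the query and with the intent cues."""
    line_tokens = tokenize(line)
    query_overlap = len(line_tokens & tokenize(query))
    intent_overlap = len(line_tokens & INTENT_CUES[intent]) if intent else 0
    return query_overlap + 2 * intent_overlap  # assumed weighting


def rank_lines(document, query):
    """Return the non-empty lines of the document, most relevant first."""
    intent = detect_intent(query)
    lines = [line for line in document.splitlines() if line.strip()]
    return sorted(lines, key=lambda line: relevance(line, query, intent), reverse=True)


DOCUMENT = """Joe's Pizza opened in 1990.
Call us: phone 555-0100 or email joe@example.com
Large pizza price: 12 dollars"""

ranked_lines = rank_lines(DOCUMENT, "joe's pizza phone")
```

    With the query above, the detected intent is `contact`, so the line carrying the phone number and email outranks the line that merely repeats more of the query words.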

    System and method for tolerating multiple storage device failures in a storage system with constrained parity in-degree
    6.
    Type: Granted patent
    Status: Lapsed

    Publication No.: US07519629B2

    Publication date: 2009-04-14

    Application No.: US10956466

    Filing date: 2004-09-30

    IPC classification: G06F17/00

    CPC classification: G06F11/1076 Y10S707/99953

    Abstract: A fault-tolerant system for storage arrays has constraints on the number of data from which each redundancy value is computed. The fault-tolerant system has embodiments that are supported on small array sizes to arbitrarily large array sizes, and can tolerate a large number T of failures. Certain embodiments can tolerate many instances of more than T failures. The fault-tolerant system has efficient XOR-based encoding, recovery, and updating algorithms and has simple redundancy formulas. The fault-tolerant system has improved IO seek costs for certain multiple-element sequential host updates.
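
    To make "constrained parity in-degree" concrete, here is a small sketch in which every parity value is the XOR of at most two data elements. The layout in `PARITY_INPUTS` is a made-up example, not the patent's construction; it only shows how XOR-based encoding, small-write updating, and recovery interact when each parity has few inputs.

```python
from functools import reduce
from operator import xor

# Hypothetical layout: each parity element lists the data indices it covers.
# The in-degree (inputs per parity) is capped at 2 in this toy example.
PARITY_INPUTS = {"p0": [0, 1], "p1": [2, 3], "p2": [0, 2], "p3": [1, 3]}


def encode(data):
    """Compute every parity as the XOR of its covered data elements."""
    return {name: reduce(xor, (data[i] for i in idxs))
            for name, idxs in PARITY_INPUTS.items()}


def update(data, parity, index, new_value):
    """Small write: patch only the parities that cover the changed element."""
    delta = data[index] ^ new_value
    data[index] = new_value
    for name, idxs in PARITY_INPUTS.items():
        if index in idxs:
            parity[name] ^= delta


def recover(data, parity, lost):
    """Rebuild data[lost] from a parity whose other inputs all survive.

    Lost elements are marked as None in `data`.
    """
    for name, idxs in PARITY_INPUTS.items():
        others = [i for i in idxs if i != lost]
        if lost in idxs and all(data[i] is not None for i in others):
            return reduce(xor, (data[i] for i in others), parity[name])
    raise ValueError("no parity equation with a single unknown")


data = [5, 9, 12, 7]
parity = encode(data)
update(data, parity, 1, 42)          # parity stays consistent after a small write

damaged = list(data)
damaged[0] = damaged[1] = None       # two data elements lost
rebuilt0 = recover(damaged, parity, 0)
damaged[0] = rebuilt0
rebuilt1 = recover(damaged, parity, 1)
```

    Because each parity touches only two data elements, both an update and a rebuild read or write a small, fixed number of elements regardless of array size.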

    System and method for enabling efficient recovery of data in a storage array
    9.
    Type: Patent application
    Status: Lapsed

    Publication No.: US20060074954A1

    Publication date: 2006-04-06

    Application No.: US10956468

    Filing date: 2004-09-30

    IPC classification: G06F17/30

    Abstract: A recovery enabling system for storage arrays is a high distance generalization of RAID-5 with optimal update complexity and near optimal storage efficiency. The recovery enabling system utilizes presets, data cells with known values that initialize the reconstruction process. The presets allow resolution of parity equations to reconstruct data when failures occur. In one embodiment, additional copies of the layout of the recovery enabling system are packed onto the same disks to minimize the effect of presets on storage efficiency without destroying the clean geometric construction of the recovery enabling system. The recovery enabling system has efficient XOR-based encoding, recovery, and updating algorithms for arbitrarily large distances, making the recovery enabling system an ideal candidate when storage-efficient reliable codes are required.
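
    The role of presets can be illustrated with a generic "peeling" solver: any parity equation with exactly one unknown cell is solved, which may unlock further equations. The cell names, the two-equation layout, and the solver itself are assumptions made for this sketch; the application's actual layout and algorithm are not reproduced here.

```python
def peel(cells, equations):
    """Resolve unknown cells from XOR parity equations.

    cells: dict mapping cell name -> int, or None if the value is lost.
    equations: lists of cell names whose values XOR to zero.
    Repeatedly solves any equation that has exactly one unknown cell.
    """
    cells = dict(cells)
    progress = True
    while progress:
        progress = False
        for equation in equations:
            unknown = [name for name in equation if cells[name] is None]
            if len(unknown) == 1:
                value = 0
                for name in equation:
                    if name != unknown[0]:
                        value ^= cells[name]
                cells[unknown[0]] = value
                progress = True
    return cells


# Hypothetical layout: p0 = d0 ^ d1 and p1 = d1 ^ d2, written as XOR-to-zero sets.
EQUATIONS = [["d0", "d1", "p0"], ["d1", "d2", "p1"]]

# d0 is a preset (known to be 0), so it survives the loss of d1 and d2.
with_preset = peel({"d0": 0, "d1": None, "d2": None, "p0": 7, "p1": 14}, EQUATIONS)

# Without a preset, d0 is lost as well and no equation has a single unknown.
without_preset = peel({"d0": None, "d1": None, "d2": None, "p0": 7, "p1": 14}, EQUATIONS)
```

    With the preset, the first equation yields `d1`, which in turn lets the second equation yield `d2`. Without it, the solver makes no progress, which is the sense in which known-value cells "initialize" reconstruction.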

    SURFACING ENTITY ATTRIBUTES WITH SEARCH RESULTS
    10.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20140067816A1

    Publication date: 2014-03-06

    Application No.: US13597596

    Filing date: 2012-08-29

    IPC classification: G06F17/30

    Abstract: In an effort to enhance computer user engagement with a search results page, systems and methods are presented which are configured to identify an entity as being the subject matter of a user's search query. If the entity is a known entity, i.e., entity information is stored in an entity store for the identified entity, a subset of entity attributes are identified and a representative entity attribute question is obtained for each of the attributes in the subset of entity attributes. The representative entity attribute questions are identified according to the probability that they are formed linguistically correct. The representative entity attribute questions are included in a search results page that is generated in response to the user's search query.
