Ranking results using multiple nested ranking
    1.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US07689615B2

    Publication date: 2010-03-30

    Application No.: US11294269

    Filing date: 2005-12-05

    IPC classification: G06F7/00 G06F17/30

    Abstract: A unique system and method that facilitates improving the ranking of items is provided. The system and method involve re-ranking decreasing subsets of high ranked items in separate stages. In particular, a basic ranking component can rank a set of items. A subset of the top or high ranking items can be taken and used as a new training set to train a component for improving the ranking among these high ranked documents. This process can be repeated on an arbitrary number of successive high ranked subsets. Thus, high ranked items can be reordered in separate stages by focusing on the higher ranked items to facilitate placing the most relevant items at the top of a search results list.
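
    The staged procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes every stage's scoring function has already been trained (the abstract trains each one on the preceding top subset, which is omitted here), and the names `nested_rank`, `scorers`, and `subset_sizes` are hypothetical.

```python
from typing import Callable, List, Sequence

Item = str
Scorer = Callable[[Item], float]


def nested_rank(
    items: Sequence[Item],
    scorers: Sequence[Scorer],
    subset_sizes: Sequence[int],
) -> List[Item]:
    """Rank all items once, then re-rank successively smaller top subsets.

    scorers[0] is the basic ranker (hypothetical, assumed pre-trained).
    scorers[1:] are the later-stage rankers, paired with subset_sizes,
    which should be decreasing so each stage focuses on fewer items.
    """
    # Stage 0: the basic ranking component orders the full set.
    ranked = sorted(items, key=scorers[0], reverse=True)

    # Later stages: reorder only the current top-k items, leave the tail as is.
    for scorer, k in zip(scorers[1:], subset_sizes):
        head = sorted(ranked[:k], key=scorer, reverse=True)
        ranked = head + ranked[k:]
    return ranked
```

    Only the head of the list is touched in later stages, so the extra cost per stage is bounded by the subset size rather than by the size of the full result set.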

    Question answering over structured content on the web
    2.
    Type: granted invention patent
    Legal status: lapsed

    Publication No.: US07873624B2

    Publication date: 2011-01-18

    Application No.: US11256503

    Filing date: 2005-10-21

    IPC classification: G06F17/30

    Abstract: Structured content and associated metadata from the Web are leveraged to provide specific answer string responses to user questions. The structured content can also be indexed at crawl-time to facilitate searching of the content at search-time. Ranking techniques can also be employed to facilitate in providing an optimum answer string and/or a top K list of answer strings for a query. Ranking can be based on trainable algorithms that utilize feature vectors for candidate answer strings. In one instance, at crawl-time, structured content is indexed and automatically associated with metadata relating to the structured content and the source web page. At search-time, candidate indexed structured content is then utilized to extract an appropriate answer string in response to a user query.

    Training a learning system with arbitrary cost functions
    3.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US07472096B2

    Publication date: 2008-12-30

    Application No.: US11305395

    Filing date: 2005-12-16

    IPC classification: G06F15/18 G06K9/00

    CPC classification: G06N3/08

    Abstract: The subject disclosure pertains to systems and methods for training machine learning systems. Many cost functions are not smooth or differentiable and cannot easily be used during training of a machine learning system. The machine learning system can include a set of estimated gradients based at least in part upon the ranked or sorted results generated by the learning system. The estimated gradients can be selected to reflect the requirements of a cost function and utilized instead of the cost function to determine or modify the parameters of the learning system during training of the learning system.
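
    One possible reading of "estimated gradients based on the ranked results" is sketched below. The abstract does not specify a formula, so both ingredients here are illustrative assumptions: a logistic pairwise push, scaled by the difference in a logarithmic rank discount so that swaps near the top of the list matter more. The function name `estimated_gradients` is hypothetical.

```python
import math
from typing import List, Sequence


def estimated_gradients(scores: Sequence[float], labels: Sequence[int]) -> List[float]:
    """Return one gradient estimate per item, derived from the sorted order.

    A positive value means the item's score should be increased.
    No differentiable cost function is evaluated; the rank-based weight
    stands in for the requirements of a non-smooth ranking cost.
    """
    # Rank position of each item after sorting by descending score (0 = top).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    rank = {item: pos for pos, item in enumerate(order)}

    grads = [0.0] * len(scores)
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:  # item i should be ranked above item j
                # Assumed logistic push: large when the pair is mis-ordered.
                push = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
                # Assumed weight: change in rank discount if the pair swapped.
                weight = abs(
                    1.0 / math.log2(rank[i] + 2) - 1.0 / math.log2(rank[j] + 2)
                )
                grads[i] += weight * push
                grads[j] -= weight * push
    return grads
```

    During training, these per-item values would replace the gradient of the cost with respect to each score and be propagated back to the model parameters in the usual way.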

    Efficiency of training for ranking systems based on pairwise training with aggregated gradients
    4.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US07617164B2

    Publication date: 2009-11-10

    Application No.: US11378086

    Filing date: 2006-03-17

    IPC classification: G06F15/18 G06G7/00 G06N3/02

    CPC classification: G06N99/005 Y10S707/99935

    Abstract: The subject disclosure pertains to systems and methods for facilitating training of machine learning systems utilizing pairwise training. The number of computations required during pairwise training is reduced by grouping the computations. First, a score is generated for each retrieved data item. During processing of the data item pairs, the scores of the data items in the pair are retrieved and used to generate a gradient for each data item. Once all of the pairs have been processed, the gradients for each data item are aggregated and the aggregated gradients are used to update the machine learning system.
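
    The three steps in this abstract (score each item once, accumulate a gradient per item over all pairs, update once) can be sketched for a linear scoring model. This is an illustration under assumptions, not the patented method: the linear model, the logistic pairwise loss, and the name `aggregated_pairwise_step` are all hypothetical choices.

```python
import math
from typing import List, Sequence


def aggregated_pairwise_step(
    weights: Sequence[float],
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.1,
) -> List[float]:
    """Perform one training step with per-item aggregated gradients."""
    # Step 1: compute the score of every item exactly once.
    scores = [sum(w * x for w, x in zip(weights, row)) for row in features]

    # Step 2: for each pair, reuse the stored scores and accumulate a
    # per-item gradient instead of updating the model pair by pair.
    item_grad = [0.0] * len(features)
    for i in range(len(features)):
        for j in range(len(features)):
            if labels[i] > labels[j]:  # item i should outrank item j
                # Negative derivative of log(1 + exp(-(s_i - s_j))) w.r.t. s_i.
                g = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
                item_grad[i] += g
                item_grad[j] -= g

    # Step 3: a single parameter update from the aggregated gradients.
    return [
        w + lr * sum(g * row[k] for g, row in zip(item_grad, features))
        for k, w in enumerate(weights)
    ]
```

    With n items, the model is evaluated n times and updated once, rather than once per pair, which is where the saving in computation comes from.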

    Alphanumeric image segmentation scheme
    5.
    Type: granted invention patent
    Legal status: lapsed

    Publication No.: US5519788A

    Publication date: 1996-05-21

    Application No.: US505039

    Filing date: 1995-07-21

    IPC classification: G06K9/34 G06K9/00

    CPC classification: G06K9/342 G06K2209/01

    Abstract: A process for creating segments out of an arbitrary string of handwritten alphanumeric script is described, in which the contours of the image are defined by the path a ball or pointer follows when allowed to roll from the top and bottom of an image, down or up either side. From the contours, the initial image cut points are determined. The pointer is provided with a capability to measure ink density in the nearby pixels. A grey scale threshold control is provided which operates in conjunction with the pointer as it rolls or moves, to define ink density above the threshold as a white pixel wherein no image content is present; and ink density below the threshold as a black pixel wherein image content is present.

    Graphical system for automated segmentation and recognition for image recognition systems
    6.
    Type: granted invention patent
    Legal status: lapsed

    Publication No.: US5487117A

    Publication date: 1996-01-23

    Application No.: US327339

    Filing date: 1994-10-11

    IPC classification: G06K9/34 G06K9/00

    CPC classification: G06K9/344 G06K2209/01

    Abstract: Apparatus and processes are described for the automatic recognition of alphanumeric images. A set of cuts are made to the image which include incorrect segmentations. The resulting "cells" comprising in their totality the created segments of the image are then analyzed to determine which cells are legal neighbors and which are not. All cells which are legal neighbors are then presented as connected nodes. A pruning of nodes which are related to certain predetermined image cuts is effected. Each set of remaining connected nodes is then presented to a recognizer which identifies the image and assigns a specified probability to the output. Many cells which are not legal neighbors are thereby not presented to the recognizer, thus saving substantially on computations per recognized image.
