Patent search: ap:("Fabrice Canel" OR "Junaid Ahmed" OR "Thomas Francis McElroy" OR "Walter Sun" OR "Kumar Chellapilla" OR "Abhishek Singh" OR "Vishnu Challam") AND inv:"Kumar Chellapilla" (page 1)

1.

Granted patent
Content signature notification (Status: in force)

Publication No.: US09043306B2

Publication date: 2015-05-26

Application No.: US12861788

Filing date: 2010-08-23

Applicants: Fabrice Canel, Junaid Ahmed, Thomas Francis McElroy, Walter Sun, Kumar Chellapilla, Abhishek Singh, Vishnu Challam

Inventors: Fabrice Canel, Junaid Ahmed, Thomas Francis McElroy, Walter Sun, Kumar Chellapilla, Abhishek Singh, Vishnu Challam

IPC classification: G06F17/30

CPC classification: G06F17/30864, G06F17/30109, G06F17/30336, G06F17/30867, G06F17/30899

Abstract: A client application installed on end user computers generates metadata from the content of web pages visited by end users and provides the metadata to a search engine. When an end user visits a web page, the end user's computer downloads and displays the web page to the end user. The client application may simultaneously access the web page content and generate this metadata in the form of a content signature of the web page from the web page content. The client application then provides the content signature to a search engine. The search engine may employ content signatures to identify new web pages to crawl and index. Additionally, the search engine may employ content signatures to identify changes to web pages and determine the crawl frequency of web pages.
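
The signature flow in this abstract can be illustrated with a minimal sketch. It assumes the content signature is a SHA-256 hash of the whitespace-normalized page text; the abstract does not say how the signature is computed, and the names `content_signature` and `classify_report` are hypothetical.

```python
import hashlib


def content_signature(page_text: str) -> str:
    """Return a hex digest of the whitespace-normalized page text.

    Assumption: a cryptographic hash stands in for the patent's
    unspecified "content signature".
    """
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify_report(index: dict, url: str, signature: str) -> str:
    """Search-engine side: compare a reported signature with the stored one.

    Returns "new" for an unknown URL, "changed" when the signature differs,
    and "unchanged" otherwise, then records the latest signature.
    """
    known = index.get(url)
    if known is None:
        status = "new"
    elif known != signature:
        status = "changed"
    else:
        status = "unchanged"
    index[url] = signature
    return status
```

A crawler could schedule "new" URLs for a first crawl and raise the crawl frequency of URLs that are often reported as "changed"; that policy is likewise an illustration, not something the record specifies.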

2.

Granted patent
Click prediction using bin counting (Status: in force)

Publication No.: US09104960B2

Publication date: 2015-08-11

Application No.: US13163857

Filing date: 2011-06-20

Applicants: Leon Bottou, Kumar Chellapilla, Patrice Y. Simard, David Max Chickering

Inventors: Leon Bottou, Kumar Chellapilla, Patrice Y. Simard, David Max Chickering

IPC classification: G06Q30/00, G06N7/00, G06Q30/02

CPC classification: G06N7/005, G06Q30/0242

Abstract: Methods, systems, and computer-storage media having computer-usable instructions embodied thereon for calculating event probabilities are provided. The event may be a click probability. Event probabilities are calculated using a system optimized for runtime model accuracy with an operable learning algorithm. Bin counting techniques are used to calculate event probabilities based on a count of event occurrences and non-event occurrences. Linear parameters, such as counts of clicks and non-clicks, may also be used in the system to allow for runtime adjustments.
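
As a rough illustration of bin counting, the sketch below estimates a click probability per bin from counts of clicks and non-clicks. The additive prior counts and the class name `BinCounter` are assumptions made for the example; the abstract does not describe the patent's actual estimator.

```python
from collections import defaultdict


class BinCounter:
    """Per-bin click and non-click counts with a smoothed probability estimate."""

    def __init__(self, prior_clicks: float = 1.0, prior_non_clicks: float = 1.0):
        # Assumed smoothing: pseudo-counts keep unseen bins away from 0 and 1.
        self.prior_clicks = prior_clicks
        self.prior_non_clicks = prior_non_clicks
        self.clicks = defaultdict(int)
        self.non_clicks = defaultdict(int)

    def observe(self, bin_key, clicked: bool) -> None:
        """Record one impression that falls into the bin `bin_key`."""
        if clicked:
            self.clicks[bin_key] += 1
        else:
            self.non_clicks[bin_key] += 1

    def click_probability(self, bin_key) -> float:
        """Estimate P(click) as smoothed clicks / (clicks + non-clicks)."""
        c = self.clicks.get(bin_key, 0) + self.prior_clicks
        n = self.non_clicks.get(bin_key, 0) + self.prior_non_clicks
        return c / (c + n)
```

Because the model state is just two counters per bin, it can be updated at runtime by incrementing a count, which is one way to read the abstract's remark about runtime adjustments.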

3.

Granted patent
Classifying search query traffic (Status: in force)

Publication No.: US08244752B2

Publication date: 2012-08-14

Application No.: US12106857

Filing date: 2008-04-21

Applicants: Greg Buehrer, Kumar Chellapilla, Jack W. Stokes

Inventors: Greg Buehrer, Kumar Chellapilla, Jack W. Stokes

IPC classification: G06F7/00, G06F17/30, G06F11/00, G06F12/14, G06F12/16, G08B23/00

CPC classification: H04L47/10

Abstract: A method for classifying search query traffic can involve receiving a plurality of labeled sample search query traffic and generating a feature set partitioned into human physical limit features and query stream behavioral features. A model can be generated using the plurality of labeled sample search query traffic and the feature set. Search query traffic can be received and the model can be utilized to classify the received search query traffic as generated by a human or automatically generated.

4.

Granted patent
Data partitioning via bucketing bloom filters (Status: lapsed)

Publication No.: US07743013B2

Publication date: 2010-06-22

Application No.: US11811619

Filing date: 2007-06-11

Applicants: Anton Mityagin, Kumar Chellapilla, Denis Charles

Inventors: Anton Mityagin, Kumar Chellapilla, Denis Charles

IPC classification: G06F17/30

CPC classification: G06F17/30011

Abstract: Multiple Bloom filters are generated to partition data between first and second disjoint data sets of elements. Each element in the first data set is assigned to a bucket of a first set of buckets, and each element in the second data set is assigned to a bucket of a second set of buckets. A Bloom filter is generated for each bucket of the first set of buckets. The Bloom filter generated for a bucket indicates that each element assigned to that bucket is part of the first data set, and that each element assigned to a corresponding bucket of the second set of buckets is not part of the first data set. Additionally, a Bloom filter corresponding to a subsequently received element can be determined and used to identify whether that subsequently received element is part of the first data set or the second data set.
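
One way to picture the bucketing scheme is the sketch below. It assumes elements are strings, that both sets are bucketed by the same hash, and that a bucket's filter is rebuilt with a new seed until it yields no false positives for the second-set elements of the corresponding bucket. The retry strategy and all names (`BloomFilter`, `build_partition`, `in_first_set`) are assumptions for illustration, not the patented procedure.

```python
import hashlib


def _hash(item: str, salt: str, modulus: int) -> int:
    """Deterministic salted hash of `item` into range(modulus)."""
    digest = hashlib.sha256(f"{salt}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class BloomFilter:
    def __init__(self, size: int, num_hashes: int, seed: int):
        self.size, self.num_hashes, self.seed = size, num_hashes, seed
        self.bits = [False] * size

    def _positions(self, item: str):
        return [_hash(item, f"{self.seed}-{i}", self.size)
                for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p] = True

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p] for p in self._positions(item))


def bucket_of(item: str, num_buckets: int) -> int:
    return _hash(item, "bucket", num_buckets)


def build_partition(first, second, num_buckets=4, size=64, num_hashes=3,
                    max_tries=1000):
    """Build one filter per bucket that accepts `first` and rejects `second`.

    `first` and `second` must be disjoint.
    """
    filters = []
    for b in range(num_buckets):
        members = [x for x in first if bucket_of(x, num_buckets) == b]
        rivals = [y for y in second if bucket_of(y, num_buckets) == b]
        for seed in range(max_tries):
            bf = BloomFilter(size, num_hashes, seed)
            for x in members:
                bf.add(x)
            if not any(y in bf for y in rivals):
                break  # no false positives among this bucket's rivals
        else:
            raise RuntimeError(f"no clean filter found for bucket {b}")
        filters.append(bf)
    return filters


def in_first_set(item: str, filters) -> bool:
    """Classify an element known to belong to one of the two sets."""
    return item in filters[bucket_of(item, len(filters))]
```

The classification is only meaningful for elements that belong to one of the two original sets; an element from neither set may still be accepted by a filter.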

5.

Granted patent
Speeding up analysis of compressed web graphs using virtual nodes (Status: in force)

Publication No.: US08200596B2

Publication date: 2012-06-12

Application No.: US12473428

Filing date: 2009-05-28

Applicants: Reid Andersen, Kumar Chellapilla, Chinmay Karande

Inventors: Reid Andersen, Kumar Chellapilla, Chinmay Karande

IPC classification: G06F17/00

CPC classification: G06F17/10

Abstract: Classes of web graph algorithms are extended to run directly on virtual node-type compressed web graphs where a reduction in runtime of the extended algorithms is realized which is approximately proportional to the compression ratio applied to the original (i.e., uncompressed) graph. In the virtual node compression technique, a succinct representation of a web graph is constructed by replacing dense subgraphs by sparse ones so that the resulting compressed graph has significantly fewer edges and a relatively small number of additional nodes.
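
The virtual node idea can be shown on the simplest dense subgraph, a complete bipartite block in which every source links to every target. The sketch below replaces those |S|·|T| edges with |S|+|T| edges through one added node. The function names and the edge-set representation are assumptions for the example; how dense subgraphs are discovered is outside its scope.

```python
def compress_biclique(edges: set, sources, targets, virtual) -> set:
    """Route all edges sources x targets through the node `virtual`.

    `edges` is a set of (u, v) pairs; `virtual` must not be an existing node.
    """
    biclique = {(s, t) for s in sources for t in targets}
    if not biclique <= edges:
        raise ValueError("sources x targets is not fully present in the graph")
    compressed = edges - biclique
    compressed |= {(s, virtual) for s in sources}
    compressed |= {(virtual, t) for t in targets}
    return compressed


def out_neighbors(edges: set, node, virtual_nodes: set) -> set:
    """Real out-neighbors of `node`, expanding one level of virtual nodes."""
    result = set()
    for u, v in edges:
        if u != node:
            continue
        if v in virtual_nodes:
            result |= {t for w, t in edges if w == v}
        else:
            result.add(v)
    return result
```

With 3 sources and 4 targets, the 12 biclique edges become 7, while the real out-neighborhood of every source is unchanged once the virtual node is expanded.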

6.

Patent application
Robust personalization through biased regularization (Status: in force)

Publication No.: US20070239450A1

Publication date: 2007-10-11

Application No.: US11278949

Filing date: 2006-04-06

Applicants: Wolf Kienzle, Kumar Chellapilla

Inventors: Wolf Kienzle, Kumar Chellapilla

IPC classification: G10L15/06

CPC classification: G10L15/07

Abstract: The subject disclosure pertains to systems and methods for personalization of a recognizer. In general, recognizers can be used to classify input data. During personalization, a recognizer is provided with samples specific to a user, entity or format to improve performance for the specific user, entity or format. Biased regularization can be utilized during personalization to maintain recognizer performance for non-user specific input. In one aspect, regularization can be biased to the original parameters of the recognizer, such that the recognizer is not modified excessively during personalization.

7.

Patent application
Logical structure and layout based offline character recognition (Status: in force)

Publication No.: US20070133883A1

Publication date: 2007-06-14

Application No.: US11299873

Filing date: 2005-12-12

Applicants: Kumar Chellapilla, Patrice Simard

Inventors: Kumar Chellapilla, Patrice Simard

IPC classification: G06K9/62

CPC classification: G06K9/80

Abstract: A method and system for implementing character recognition is described herein. An input character is received. The input character is composed of one or more logical structures in a particular layout. The layout of the one or more logical structures is identified. One or more of a plurality of classifiers are selected based on the layout of the one or more logical structures in the input character. The entire character is input into the selected classifiers. The selected classifiers classify the logical structures. The outputs from the selected classifiers are then combined to form an output character vector.

8.

Granted patent
Short paths in web graphs with small query time (Status: in force)

Publication No.: US08296327B2

Publication date: 2012-10-23

Application No.: US12473706

Filing date: 2009-05-28

Applicants: Reid Andersen, Kumar Chellapilla, Chinmay Karande

Inventors: Reid Andersen, Kumar Chellapilla, Chinmay Karande

IPC classification: G06F17/30

CPC classification: G06F17/30958, G06F17/30861

Abstract: Short paths are found with a small query time in scale-free directed graphs using a two-phase process by which data structures comprising shortest path trees are first pre-computed for a group of central vertices called “hubs” that have short paths to most other vertices in the graph. In a query time phase, a short path between two vertices of interest in the graph is found by looking up the path to the root in each of the shortest path trees.
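
The two-phase process can be sketched as follows, assuming an unweighted directed graph stored as an adjacency dict and a hub list supplied by the caller. Keeping both a forward and a reverse breadth-first tree per hub is an assumption of this example, as are all function names; the abstract does not say how hubs are chosen or how the trees are stored.

```python
from collections import deque


def bfs_tree(graph: dict, root):
    """Return {vertex: parent} for a breadth-first tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def reverse(graph: dict) -> dict:
    """Return the graph with every edge direction flipped."""
    rev = {}
    for u, nbrs in graph.items():
        for v in nbrs:
            rev.setdefault(v, []).append(u)
    return rev


def path_to_root(parent: dict, v) -> list:
    """Follow parent pointers from `v` up to the tree root."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def precompute(graph: dict, hubs) -> dict:
    """Phase 1: per hub, a tree of paths from the hub and one of paths to it."""
    rev = reverse(graph)
    return {h: (bfs_tree(graph, h), bfs_tree(rev, h)) for h in hubs}


def short_path(trees: dict, src, dst):
    """Phase 2: best src -> hub -> dst route over all hubs, or None."""
    best = None
    for out_tree, in_tree in trees.values():
        if src in in_tree and dst in out_tree:
            to_hub = path_to_root(in_tree, src)            # src ... hub
            from_hub = path_to_root(out_tree, dst)[::-1]   # hub ... dst
            candidate = to_hub + from_hub[1:]
            if best is None or len(candidate) < len(best):
                best = candidate
    return best
```

The query only walks parent pointers, so its cost depends on the number of hubs and the tree depth rather than on the graph size. The returned route is short but not guaranteed to be the shortest, since it is forced through a hub.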

9.

Patent application
Data partitioning via bucketing bloom filters (Status: lapsed)

Publication No.: US20080307189A1

Publication date: 2008-12-11

Application No.: US11811619

Filing date: 2007-06-11

Applicants: Anton Mityagin, Kumar Chellapilla, Denis Charles

Inventors: Anton Mityagin, Kumar Chellapilla, Denis Charles

IPC classification: G06F12/00

CPC classification: G06F17/30011

Abstract: Multiple Bloom filters are generated to partition data between first and second disjoint data sets of elements. Each element in the first data set is assigned to a bucket of a first set of buckets, and each element in the second data set is assigned to a bucket of a second set of buckets. A Bloom filter is generated for each bucket of the first set of buckets. The Bloom filter generated for a bucket indicates that each element assigned to that bucket is part of the first data set, and that each element assigned to a corresponding bucket of the second set of buckets is not part of the first data set. Additionally, a Bloom filter corresponding to a subsequently received element can be determined and used to identify whether that subsequently received element is part of the first data set or the second data set.

10.

Patent application
Extracting link spam using random walks and spam seeds (Status: under examination, published)

Publication No.: US20080270549A1

Publication date: 2008-10-30

Application No.: US11789997

Filing date: 2007-04-26

Applicants: Kumar Chellapilla, Baoning Wu

Inventors: Kumar Chellapilla, Baoning Wu

IPC classification: G06F17/30, G06F15/16

CPC classification: G06Q10/107, G06F16/951

Abstract: Architecture for extracting link spam communities when given one or more members of the community. A link spam extraction algorithm is provided that takes as input link spam seeds and extracts other nearby link spam through a biased local random walk around the seed(s). The seed set is provided by a user (or an automated algorithm scrubbed by a human) which the algorithm uses to simulate a random walk on a web graph. The random walk can be biased to explore a local neighborhood around the seed set through use of decay probabilities. Truncation can be used to retain only the most frequently visited nodes. After termination, the nodes are sorted in decreasing order of final probabilities and presented to the user. Human judges need only make decisions at the spam community level, thereby limiting involvement, and human input can be scaled by several orders of magnitude.
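
A minimal sketch of a decayed, truncated random walk from a seed set follows. It assumes the walk follows out-links, that each step multiplies the propagated probability mass by a fixed decay factor, and that truncation keeps the `keep` highest-mass nodes per step. These choices, the parameter values, and the function name are assumptions for the example; the abstract does not give the actual update rule.

```python
def extract_spam_candidates(graph: dict, seeds, steps: int = 10,
                            decay: float = 0.85, keep: int = 100):
    """Rank nodes by accumulated probability mass of a walk started at `seeds`.

    `graph` maps each node to a list of its out-neighbors.
    Returns (node, score) pairs in decreasing order of score.
    """
    current = {s: 1.0 / len(seeds) for s in seeds}
    score = dict(current)
    for _ in range(steps):
        spread = {}
        for node, mass in current.items():
            nbrs = graph.get(node, [])
            if not nbrs:
                continue  # mass at dead ends is dropped
            share = decay * mass / len(nbrs)  # decay biases the walk to stay local
            for n in nbrs:
                spread[n] = spread.get(n, 0.0) + share
        # Truncation: keep only the nodes currently holding the most mass.
        top = sorted(spread.items(), key=lambda kv: kv[1], reverse=True)[:keep]
        current = dict(top)
        for n, m in current.items():
            score[n] = score.get(n, 0.0) + m
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)
```

A reviewer would then inspect the top of the ranked list as one candidate community rather than judging pages one at a time.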
