Patent search: ap:("Amit Singhal" OR "Mehran Sahami" OR "John Lamping" OR "Marcin Kaszkiel" OR "Monika H. Henzinger") AND inv:"Monika H. Henzinger" (page 3)

21.

Patent application
DETECTING DUPLICATE AND NEAR-DUPLICATE FILES
Status: Under examination (published)

Publication number: US20120078871A1

Publication date: 2012-03-29

Application number: US13313913

Filing date: 2011-12-07

Applicant: William Pugh, Monika H. Henzinger

Inventor: William Pugh, Monika H. Henzinger

IPC classification: G06F17/30

CPC classification: G06F16/951, G06F16/355, Y10S707/99933, Y10S707/99935, Y10S707/99936, Y10S707/99943

Abstract: Improved duplicate and near-duplicate detection techniques may assign a number of fingerprints to a given document by (i) extracting parts from the document, (ii) assigning the extracted parts to one or more of a predetermined number of lists, and (iii) generating a fingerprint from each of the populated lists. Two documents may be considered to be near-duplicates if any one of their fingerprints match.

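The three steps in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: treating lowercase words as the "parts", assigning each part to exactly one list by hashing, using SHA-1, and the names `extract_parts`, `fingerprints`, and `near_duplicates` are all assumptions made for illustration.

```python
import hashlib


def _hash(value):
    # Stable 64-bit integer hash of a string (SHA-1 is an arbitrary choice here).
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:8], "big")


def extract_parts(text):
    # Step (i): extract parts. Assumption: parts are lowercase words.
    return text.lower().split()


def fingerprints(text, num_lists=4):
    # Step (ii): assign each part to one of a fixed number of lists.
    lists = [[] for _ in range(num_lists)]
    for part in extract_parts(text):
        lists[_hash(part) % num_lists].append(part)
    # Step (iii): one fingerprint per populated list; empty lists are skipped.
    return {_hash(" ".join(parts)) for parts in lists if parts}


def near_duplicates(doc_a, doc_b, num_lists=4):
    # Two documents count as near-duplicates if any fingerprint matches.
    return bool(fingerprints(doc_a, num_lists) & fingerprints(doc_b, num_lists))
```

Under these assumptions, replacing a single word changes at most two lists (the one that loses the old word and the one that gains the new word), so the fingerprints of every other populated list still match.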

22.

Granted patent
Algorithms for selecting subsequences
Status: Lapsed

Publication number: US08131751B1

Publication date: 2012-03-06

Application number: US12327368

Filing date: 2008-12-03

Applicant: Behshad Behzadi, Yaniv Bernstein, Stefan Burkhardt, Monika H. Henzinger, Benjamin Liebald, Richard Tucker

Inventor: Behshad Behzadi, Yaniv Bernstein, Stefan Burkhardt, Monika H. Henzinger, Benjamin Liebald, Richard Tucker

IPC classification: G06F17/30

CPC classification: G06F17/30675

Abstract: The present disclosure includes, among other things, systems, methods and program products for selecting subsequences (shingles or tuples) generated from sequences of tokens.

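The abstract above does not say how subsequences are selected, so the following is only a generic illustration of the vocabulary it uses. It generates k-token shingles and keeps those whose hash is divisible by a modulus, a common sampling heuristic that is swapped in here as an assumption and is not claimed to be the patent's algorithm. The names `shingles` and `select_shingles` are hypothetical.

```python
import hashlib


def shingles(tokens, k=3):
    # Every contiguous run of k tokens, in order of appearance.
    return [tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)]


def select_shingles(tokens, k=3, modulus=4):
    # Keep a shingle when its hash is 0 modulo `modulus` (assumed heuristic).
    selected = []
    for shingle in shingles(tokens, k):
        digest = hashlib.sha1(" ".join(shingle).encode("utf-8")).digest()
        if int.from_bytes(digest[:8], "big") % modulus == 0:
            selected.append(shingle)
    return selected
```

Because selection depends only on the content of each shingle, the same shingle is kept or dropped consistently across documents, which is what makes the retained subsets comparable.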

23.

Granted patent
Systems and methods for determining a quality of provided items
Status: In force

Publication number: US08065296B1

Publication date: 2011-11-22

Application number: US10952501

Filing date: 2004-09-29

Applicant: Alexander Mark Franz, Monika H. Henzinger

Inventor: Alexander Mark Franz, Monika H. Henzinger

IPC classification: G06F17/30

CPC classification: G06F17/30864

Abstract: A system may provide items during a time period and determine a quality of the items provided during the time period using a time series model.


24.

Granted patent
Systems and methods for using anchor text as parallel corpora for cross-language information retrieval
Status: In force

Publication number: US07996402B1

Publication date: 2011-08-09

Application number: US12872755

Filing date: 2010-08-31

Applicant: Luis Gravano, Monika H. Henzinger

Inventor: Luis Gravano, Monika H. Henzinger

IPC classification: G06F17/30

CPC classification: G06F17/30864, Y10S707/99934, Y10S707/99935

Abstract: A system performs cross-language query translations. The system receives a search query that includes terms in a first language and determines possible translations of the terms of the search query into a second language. The system also locates documents for use as parallel corpora to aid in the translation by: (1) locating documents in the first language that contain references that match the terms of the search query and identify documents in the second language; (2) locating documents in the first language that contain references that match the terms of the query and refer to other documents in the first language and identify documents in the second language that contain references to the other documents; or (3) locating documents in the first language that match the terms of the query and identify documents in the second language that contain references to the documents in the first language. The system may use the second language documents as parallel corpora to disambiguate among the possible translations of the terms of the search query and identify one of the possible translations as a likely translation of the search query into the second language.

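The final disambiguation step in the abstract above can be sketched as a simple vote. This assumes the second-language documents have already been located by one of the three routes, and that "disambiguation" means picking the candidate translation that appears in the most of those documents. The scoring rule and the name `pick_translation` are illustrative assumptions, not the patented method.

```python
from collections import Counter


def pick_translation(candidates, second_language_docs):
    # Count, for each candidate, how many located documents contain it.
    counts = Counter()
    for doc in second_language_docs:
        words = set(doc.lower().split())
        for candidate in candidates:
            if candidate.lower() in words:
                counts[candidate] += 1
    # The most frequently supported candidate wins; ties keep the earlier one.
    return max(candidates, key=lambda c: counts[c])


# Hypothetical data: English "bank" with two possible French translations.
docs = ["la banque centrale", "banque de france", "la rive gauche"]
likely = pick_translation(["rive", "banque"], docs)
```

A production system would presumably handle multi-word terms and weight documents by relevance; this sketch only shows how located documents can serve as evidence for one translation over another.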

25.

Granted patent
Detecting duplicate and near-duplicate files
Status: In force

Publication number: US07366718B1

Publication date: 2008-04-29

Application number: US10608468

Filing date: 2003-06-27

Applicant: William Pugh, Monika H. Henzinger

Inventor: William Pugh, Monika H. Henzinger

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F17/30864, G06F17/3071, Y10S707/99933, Y10S707/99935, Y10S707/99936, Y10S707/99943

Abstract: Improved duplicate and near-duplicate detection techniques may assign a number of fingerprints to a given document by (i) extracting parts from the document, (ii) assigning the extracted parts to one or more of a predetermined number of lists, and (iii) generating a fingerprint from each of the populated lists. Two documents may be considered to be near-duplicates if any one of their fingerprints match.


26.

Granted patent
Voice interface for a search engine
Status: In force

Publication number: US07027987B1

Publication date: 2006-04-11

Application number: US09777863

Filing date: 2001-02-07

Applicant: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch

Inventor: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch

IPC classification: G10L15/08, G06F17/20

CPC classification: G10L15/22, G10L2015/085, Y10S707/99933, Y10S707/99935

Abstract: A system provides search results from a voice search query. The system receives a voice search query from a user, derives one or more recognition hypotheses, each being associated with a weight, from the voice search query, and constructs a weighted boolean query using the recognition hypotheses. The system then provides the weighted boolean query to a search system and provides the results of the search system to a user.
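
One way to picture the weighted boolean query described above is to AND the words within each recognition hypothesis and OR the hypotheses together, each carrying its weight. The `^weight` suffix and the function name `build_weighted_query` are hypothetical; the abstract does not specify a query syntax, so this is a sketch under those assumptions.

```python
def build_weighted_query(hypotheses):
    # hypotheses: list of (recognized_text, weight) pairs from a recognizer.
    clauses = []
    for text, weight in hypotheses:
        # All words of a single hypothesis must match together.
        terms = " AND ".join(text.split())
        clauses.append(f"({terms})^{weight:.2f}")
    # Any hypothesis may match; the weight tells the search system how much to trust it.
    return " OR ".join(clauses)


query = build_weighted_query([("new york", 0.7), ("newark", 0.3)])
# query == "(new AND york)^0.70 OR (newark)^0.30"
```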

27.

Granted patent
Connectivity server for locating linkage information between Web pages
Status: Lapsed

Publication number: US6073135A

Publication date: 2000-06-06

Application number: US37350

Filing date: 1998-03-10

Applicant: Andrei Z. Broder, Michael Burrows, Monika H. Henzinger, Sanjay Ghemawat, Puneet Kumar, Suresh Venkatasubramanian

Inventor: Andrei Z. Broder, Michael Burrows, Monika H. Henzinger, Sanjay Ghemawat, Puneet Kumar, Suresh Venkatasubramanian

IPC classification: G06F17/30

CPC classification: G06F17/30882, G06F17/30873, Y10S707/99932, Y10S707/99933, Y10S707/99937

Abstract: A server computer is provided for representing and navigating the connectivity of Web pages. The Web pages include links to other Web pages. The links and Web pages have associated names (URLs). The names of the Web pages are sorted in a memory of the connectivity server. The sorted names are delta encoded while periodically storing full names as checkpoints in the memory. Each delta encoded name and checkpoint has a unique identification. A list of pairs of identifications representing existent links is sorted twice, first according to the first identification of each pair to produce an inlist, and second according to the second identification of each pair to produce an outlist. An array of elements is stored in the memory, there is one array element for each Web page. Each element includes a first pointer to one of the checkpoints, a second pointer to an associated inlist of the Web page, and a third pointer to an associated outlist of the Web page. The array is indexed by a particular identification to locate connected Web pages.

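Two ideas from the abstract above lend themselves to a short sketch: delta encoding of sorted names with periodic full-name checkpoints, and sorting the list of link pairs twice. The `(shared_prefix_length, suffix)` encoding, the checkpoint interval, and the function names are assumptions for illustration; the abstract does not specify the encoding, and the per-page pointer array is omitted.

```python
import os


def delta_encode(sorted_names, checkpoint_every=3):
    # Store each name as (shared_prefix_length, suffix) relative to the
    # previous name, with a full name (prefix length 0) at every checkpoint.
    encoded, prev = [], ""
    for i, name in enumerate(sorted_names):
        if i % checkpoint_every == 0:
            encoded.append((0, name))
        else:
            shared = len(os.path.commonprefix([prev, name]))
            encoded.append((shared, name[shared:]))
        prev = name
    return encoded


def decode(encoded, index, checkpoint_every=3):
    # Start at the nearest checkpoint at or before `index`, then replay deltas.
    start = index - index % checkpoint_every
    name = encoded[start][1]
    for shared, suffix in encoded[start + 1:index + 1]:
        name = name[:shared] + suffix
    return name


def sort_link_pairs(pairs):
    # Sort the same pairs twice: by first identification, then by second.
    # The abstract calls the two results the inlist and the outlist.
    by_first = sorted(pairs)
    by_second = sorted(pairs, key=lambda p: (p[1], p[0]))
    return by_first, by_second


names = ["http://a.com/", "http://a.com/x", "http://a.com/y", "http://b.org/"]
encoded = delta_encode(names)
```

Checkpoints bound the decoding cost: recovering any name replays at most `checkpoint_every - 1` deltas instead of walking the list from the start.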

28.

Granted patent
Detecting duplicate and near-duplicate files
Status: In force

Publication number: US09275143B2

Publication date: 2016-03-01

Application number: US12049278

Filing date: 2008-03-15

Applicant: William Pugh, Monika H. Henzinger

Inventor: William Pugh, Monika H. Henzinger

IPC classification: G06F7/00, G06F17/30

CPC classification: G06F17/30864, G06F17/3071, Y10S707/99933, Y10S707/99935, Y10S707/99936, Y10S707/99943

Abstract: Improved duplicate and near-duplicate detection techniques may assign a number of fingerprints to a given document by (i) extracting parts from the document, (ii) assigning the extracted parts to one or more of a predetermined number of lists, and (iii) generating a fingerprint from each of the populated lists. Two documents may be considered to be near-duplicates if any one of their fingerprints match.

29.

Granted patent
In-context searching
Status: In force

Publication number: US08868549B1

Publication date: 2014-10-21

Application number: US13154050

Filing date: 2011-06-06

Applicant: Urs Hoelzle, Monika H. Henzinger, David Desjardins

Inventor: Urs Hoelzle, Monika H. Henzinger, David Desjardins

IPC classification: G06F17/30

CPC classification: G06F17/30867, G06F17/30011, G06F17/30424, G06F17/30598, H04L67/02, Y10S707/99932, Y10S707/99933, Y10S707/99934

Abstract: A system limits search results based on context information. The system obtains the context information and a search query, and obtains a set of references to documents in response to the search query. The system then filters the set of references based on the context information and presents the filtered set of references to a user.

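The filtering step described above can be sketched in a few lines once a concrete notion of "context" is chosen. The abstract leaves the context information unspecified, so the sketch assumes it is the host name of the site the user is currently viewing. That choice and the name `filter_by_context` are illustrative assumptions.

```python
from urllib.parse import urlparse


def filter_by_context(references, context_host):
    # Keep only references (URLs) whose host matches the assumed context.
    return [url for url in references if urlparse(url).hostname == context_host]


# Hypothetical search results for some query.
results = [
    "https://example.com/docs/intro",
    "https://other.example.org/page",
    "https://example.com/blog/post",
]
filtered = filter_by_context(results, "example.com")
```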

30.

Granted patent
Finding web pages relevant to multimedia streams
Status: In force

Publication number: US08868543B1

Publication date: 2014-10-21

Application number: US10408784

Filing date: 2003-04-08

Applicant: Monika H. Henzinger, Bay-Wei Chang, Sergey Brin

Inventor: Monika H. Henzinger, Bay-Wei Chang, Sergey Brin

IPC classification: G06F7/00

CPC classification: G06F17/30864, G06F17/30867

Abstract: A media stream, such as a news broadcast, is supplemented with documents that are relevant to the media stream. The documents may be web pages returned from a search engine. A search query generation component generates search queries for the search engine based on the media stream. A post processing component may re-rank and/or filter the documents to enhance the viewing experience for the user.

