Patent search: ap:("Daniel Dulitz" OR "Alexandre A. Verstak" OR "Sanjay Ghemawat" OR "Jeffrey A. Dean") AND inv:"Sanjay Ghemawat" (page 4)

31.

Patent application
Organizing Data in a Distributed Storage System [in force]

Publication No.: US20130339295A1

Publication date: 2013-12-19

Application No.: US13898411

Filing date: 2013-05-20

Applicants: Jeffrey Adgate Dean, Michael James Boyer Epstein, Andrew Fikes, Sanjay Ghemawat, Wilson Cheng-Yi Hsieh, Alexander Lloyd, Yasushi Saito, Michal Piotr Szymaniak, Sebastian Kanthak, Chris Jorgen Taylor

Inventors: Jeffrey Adgate Dean, Michael James Boyer Epstein, Andrew Fikes, Sanjay Ghemawat, Wilson Cheng-Yi Hsieh, Alexander Lloyd, Yasushi Saito, Michal Piotr Szymaniak, Sebastian Kanthak, Chris Jorgen Taylor

IPC classification: G06F17/30

CPC classification: G06F17/30575, G06F3/0611, G06F3/0617, G06F3/065, G06F3/067

Abstract: A distributed storage system is provided. The distributed storage system includes multiple front-end servers and zones for managing data for clients. Data within the distributed storage system is associated with a plurality of accounts and divided into a plurality of groups, each group including a plurality of splits, each split being associated with a respective account, and each group having multiple tablets and each tablet managed by a respective tablet server of the distributed storage system. Data associated with different accounts may be replicated within the distributed storage system using different data replication policies. The amount of data for an account is not limited, because new splits can be added to the distributed storage system. In response to a client request for a particular account's data, a front-end server communicates such request to a particular zone that has the client-requested data and returns the client-requested data to the requesting client.
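
The account, split, group and zone relationships in this abstract can be sketched roughly as follows. This is an illustration only: the `Split` and `Directory` classes, their fields, and the zone names are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Split:
    split_id: str
    account: str   # each split belongs to exactly one account
    group: str     # each split is stored in one group


@dataclass
class Directory:
    """Hypothetical lookup table a front-end server might consult."""
    splits: List[Split] = field(default_factory=list)
    group_zones: Dict[str, List[str]] = field(default_factory=dict)

    def place_group(self, group: str, zones: List[str]) -> None:
        # The zones holding a group stand in for its replication policy.
        self.group_zones[group] = list(zones)

    def add_split(self, split: Split) -> None:
        # An account grows by adding splits, not by enlarging one unit.
        self.splits.append(split)

    def zones_for_account(self, account: str) -> List[str]:
        # Collect, in order and without repeats, every zone that holds
        # a group containing one of the account's splits.
        zones: List[str] = []
        for split in self.splits:
            if split.account != account:
                continue
            for zone in self.group_zones.get(split.group, []):
                if zone not in zones:
                    zones.append(zone)
        return zones
```

A front-end server in this sketch would forward a request for an account's data to one of the zones returned by `zones_for_account`.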

32.

Granted patent
System and method of accessing a document efficiently through multi-tier web caching [in force]

Publication No.: US08275790B2

Publication date: 2012-09-25

Application No.: US12251413

Filing date: 2008-10-14

Applicants: Eric Russell Fredricksen, Fritz John Schneider, Jeffrey Adgate Dean, Sanjay Ghemawat, Niels Provos, Georges Harik

Inventors: Eric Russell Fredricksen, Fritz John Schneider, Jeffrey Adgate Dean, Sanjay Ghemawat, Niels Provos, Georges Harik

IPC classification: G06F17/30

CPC classification: G06F17/30902, G06F17/30011, Y10S707/99931, Y10S707/99932

Abstract: Upon receipt of a document request, a client assistant examines its cache for the document. If not successful, a server searches for the requested document in its cache. If the server copy is still not fresh or not found, the server seeks the document from its host. If the host cannot provide the copy, the server seeks it from a document repository. Certain documents are identified from the document repository as being fresh or stable. Information about each of these identified documents is transmitted to the server, which inserts entries into an index if the index does not already contain an entry for the document. If and when this particular document is requested, the document will not be present in the server; however, the server will contain an entry directing the server to obtain the document from the document repository rather than the document's web host.
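
The lookup order described in this abstract can be sketched as a chain of fallbacks. The function name `fetch_document`, its parameters, and the tier labels are hypothetical, and freshness checks are left out for brevity (a cached copy is simply assumed fresh).

```python
from typing import Callable, Dict, Optional, Set, Tuple

Fetcher = Callable[[str], Optional[str]]


def fetch_document(
    url: str,
    client_cache: Dict[str, str],
    server_cache: Dict[str, str],
    repository_index: Set[str],
    fetch_from_host: Fetcher,
    fetch_from_repository: Fetcher,
) -> Tuple[Optional[str], str]:
    """Return (document, tier that served it)."""
    # Tier 1: the client assistant's own cache.
    if url in client_cache:
        return client_cache[url], "client cache"
    # Tier 2: the server's cache.
    if url in server_cache:
        return server_cache[url], "server cache"
    # An index entry tells the server to skip the web host and go
    # straight to the document repository.
    if url in repository_index:
        doc = fetch_from_repository(url)
        if doc is not None:
            return doc, "repository (via index)"
    # Tier 3: the document's web host.
    doc = fetch_from_host(url)
    if doc is not None:
        return doc, "host"
    # Tier 4: the repository as a last resort.
    doc = fetch_from_repository(url)
    if doc is not None:
        return doc, "repository"
    return None, "not found"
```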

33.

Granted patent
Systems and methods for replicating data [in force]

Publication No.: US08065268B1

Publication date: 2011-11-22

Application No.: US12727138

Filing date: 2010-03-18

Applicants: Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung

Inventors: Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung

IPC classification: G06F17/30

CPC classification: H04L67/1095, G06F17/30174, G06F17/30215

Abstract: A system facilitates the distribution and redistribution of chunks of data among multiple servers. The system may identify servers to store a replica of the data based on at least one of utilization of the servers, prior data distribution involving the servers, and failure correlation properties associated with the servers, and place the replicas of the data at the identified servers. The system may also monitor total numbers of replicas of the chunks available in the system, identify chunks that have a total number of replicas below one or more chunk thresholds, assign priorities to the identified chunks, and re-replicate the identified chunks based substantially on the assigned priorities. The system may further monitor utilization of the servers, determine whether to redistribute any of the replicas, select one or more of the replicas to redistribute based on the utilization of the servers, select one or more of the servers to which to move the one or more replicas, and move the one or more replicas to the selected one or more servers.
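
Two of the steps in this abstract, prioritizing under-replicated chunks and choosing servers for new replicas, might look like the sketch below. The abstract does not state a scoring rule, so the choices here are assumptions: chunks with the fewest replicas go first, servers are ranked by utilization alone, and "one replica per rack" stands in for failure correlation. All names and the threshold of 3 are hypothetical.

```python
from typing import Dict, List, Tuple


def rereplication_order(replica_counts: Dict[str, int], threshold: int = 3) -> List[str]:
    """Chunks below the threshold, most urgent (fewest replicas) first."""
    under = [(count, chunk) for chunk, count in replica_counts.items() if count < threshold]
    return [chunk for count, chunk in sorted(under)]


def choose_servers(servers: Dict[str, Tuple[float, str]], n: int) -> List[str]:
    """Pick up to n servers, least utilized first, at most one per rack.

    servers maps a server name to (utilization in [0, 1], rack id).
    """
    chosen: List[str] = []
    used_racks = set()
    ranked = sorted(servers.items(), key=lambda item: (item[1][0], item[0]))
    for name, (_utilization, rack) in ranked:
        if rack in used_racks:
            continue  # avoid placing two replicas in one failure domain
        chosen.append(name)
        used_racks.add(rack)
        if len(chosen) == n:
            break
    return chosen
```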

34.

Granted patent
System and method of accessing a document efficiently through multi-tier web caching [in force]

Publication No.: US07437364B1

Publication date: 2008-10-14

Application No.: US10882795

Filing date: 2004-06-30

Applicants: Eric Russell Fredricksen, Fritz John Schneider, Jeffrey Adgate Dean, Sanjay Ghemawat, Niels Provos, Georges Harik

Inventors: Eric Russell Fredricksen, Fritz John Schneider, Jeffrey Adgate Dean, Sanjay Ghemawat, Niels Provos, Georges Harik

IPC classification: G06F17/30

CPC classification: G06F17/30902, G06F17/30011, Y10S707/99931, Y10S707/99932

Abstract: Upon receipt of a document request, a client assistant examines its cache for the document. If not successful, a server searches for the requested document in its cache. If the server copy is still not fresh or not found, the server seeks the document from its host. If the host cannot provide the copy, the server seeks it from a document repository. Certain documents are identified from the document repository as being fresh or stable. Information about each of these identified documents is transmitted to the server, which inserts entries into an index if the index does not already contain an entry for the document. If and when this particular document is requested, the document will not be present in the server; however, the server will contain an entry directing the server to obtain the document from the document repository rather than the document's web host.

35.

Granted patent
Methods and apparatus for using a modified index to provide search results in response to an ambiguous search query [in force]

Publication No.: US06865575B1

Publication date: 2005-03-08

Application No.: US10351772

Filing date: 2003-01-27

Applicants: Benjamin Thomas Smith, Sergey Brin, Sanjay Ghemawat, Christopher D. Manning

Inventors: Benjamin Thomas Smith, Sergey Brin, Sanjay Ghemawat, Christopher D. Manning

IPC classification: G06F17/30

CPC classification: G06F17/3061, Y10S707/915, Y10S707/916, Y10S707/99943

Abstract: A system allows a user to submit an ambiguous search query and to receive potentially disambiguated search results. In one implementation, a search engine's conventional alphanumeric index is translated into a second index that is ambiguated in the same manner in which the user's input is ambiguated. The user's ambiguous search query is compared to this ambiguated index, and the corresponding documents are provided to the user as search results.
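
As a rough illustration of an "ambiguated" index, the sketch below assumes the ambiguity comes from a telephone keypad, where each digit stands for several letters. The abstract does not name the input device, so the keypad mapping and the function names are assumptions.

```python
from typing import Dict, Set

# Standard telephone keypad letter groups (assumed source of ambiguity).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def ambiguate(term: str) -> str:
    """Map a term to the digit string a keypad user would type for it."""
    return "".join(LETTER_TO_DIGIT.get(ch, ch) for ch in term.lower())


def build_ambiguated_index(index: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Translate a term -> documents index into a digits -> documents index."""
    ambiguated: Dict[str, Set[int]] = {}
    for term, docs in index.items():
        ambiguated.setdefault(ambiguate(term), set()).update(docs)
    return ambiguated
```

With this index, the digit query `227` is looked up directly and returns documents for every term that shares that key sequence, such as "car" and "bar".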

36.

Granted patent
Connectivity server for locating linkage information between Web pages [lapsed]

Publication No.: US6073135A

Publication date: 2000-06-06

Application No.: US37350

Filing date: 1998-03-10

Applicants: Andrei Z. Broder, Michael Burrows, Monika H. Henzinger, Sanjay Ghemawat, Puneet Kumar, Suresh Venkatasubramanian

Inventors: Andrei Z. Broder, Michael Burrows, Monika H. Henzinger, Sanjay Ghemawat, Puneet Kumar, Suresh Venkatasubramanian

IPC classification: G06F17/30

CPC classification: G06F17/30882, G06F17/30873, Y10S707/99932, Y10S707/99933, Y10S707/99937

Abstract: A server computer is provided for representing and navigating the connectivity of Web pages. The Web pages include links to other Web pages. The links and Web pages have associated names (URLs). The names of the Web pages are sorted in a memory of the connectivity server. The sorted names are delta encoded while periodically storing full names as checkpoints in the memory. Each delta encoded name and checkpoint has a unique identification. A list of pairs of identifications representing existent links is sorted twice, first according to the first identification of each pair to produce an inlist, and second according to the second identification of each pair to produce an outlist. An array of elements is stored in the memory; there is one array element for each Web page. Each element includes a first pointer to one of the checkpoints, a second pointer to an associated inlist of the Web page, and a third pointer to an associated outlist of the Web page. The array is indexed by a particular identification to locate connected Web pages.
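
Delta encoding of sorted names with periodic checkpoints can be sketched as below. The encoding format (shared-prefix length plus suffix), the checkpoint interval, and the function names are assumptions; the abstract does not specify them. The inlist and outlist structures are not covered here.

```python
import os
from typing import List, Tuple

Entry = Tuple[int, str]  # (length of prefix shared with previous name, suffix)


def encode(sorted_names: List[str], interval: int = 4) -> List[Entry]:
    """Delta-encode sorted names, storing a full name every `interval` entries."""
    entries: List[Entry] = []
    previous = ""
    for ident, name in enumerate(sorted_names):
        if ident % interval == 0:
            entries.append((0, name))  # checkpoint: the full name
        else:
            shared = len(os.path.commonprefix([previous, name]))
            entries.append((shared, name[shared:]))
        previous = name
    return entries


def decode(entries: List[Entry], ident: int, interval: int = 4) -> str:
    """Recover the name with the given id by replaying from its checkpoint."""
    start = ident - ident % interval
    name = ""
    for shared, suffix in entries[start:ident + 1]:
        name = name[:shared] + suffix
    return name
```

The position of a name in the sorted list serves as its identification, and decoding touches at most `interval` entries because it starts from the nearest preceding checkpoint.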

37.

Granted patent
Associating summaries with pointers in persistent data structures [in force]

Publication No.: US09002860B1

Publication date: 2015-04-07

Application No.: US13366934

Filing date: 2012-02-06

Applicant: Sanjay Ghemawat

Inventor: Sanjay Ghemawat

IPC classification: G06F17/30, G06F12/02

CPC classification: G06F12/0284, G06F3/0613, G06F3/064, G06F3/067, G06F17/30321

Abstract: Methods for organizing and retrieving data values in a persistent data structure are provided. Data values are grouped into data blocks and pointers are obtained for each data block. In addition, one or more summaries, related to properties of the data block, are created and associated with the data block's pointer. The summaries allow for a more efficient retrieval of data values from the data structure by preventing unnecessary retrieval calls to persistent storage when the summaries do not match query criteria.
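
One way to picture a summary attached to a block pointer is a (min, max) pair that lets a range query skip blocks without reading them. The abstract does not say what the summaries contain, so the min/max choice, the class `SummarizedStore`, and the in-memory list standing in for persistent storage are all assumptions.

```python
from typing import List, Tuple


class SummarizedStore:
    def __init__(self, values: List[int], block_size: int = 4) -> None:
        # Group values into fixed-size blocks ("persistent storage" here
        # is just a list; a block's index plays the role of its pointer).
        self._blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
        # Keep each pointer together with a summary of its block.
        self.pointers: List[Tuple[int, Tuple[int, int]]] = [
            (ptr, (min(block), max(block))) for ptr, block in enumerate(self._blocks)
        ]
        self.reads = 0  # counts simulated storage reads

    def _read_block(self, pointer: int) -> List[int]:
        self.reads += 1
        return self._blocks[pointer]

    def query(self, low: int, high: int) -> List[int]:
        """Return values in [low, high], reading only blocks that may match."""
        result: List[int] = []
        for pointer, (block_min, block_max) in self.pointers:
            if block_max < low or block_min > high:
                continue  # summary rules the block out: no storage read
            result.extend(v for v in self._read_block(pointer) if low <= v <= high)
        return result
```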

38.

Granted patent
Identification of semantic units from within a search query [in force]

Publication No.: US08719262B1

Publication date: 2014-05-06

Application No.: US13616094

Filing date: 2012-09-14

Applicants: Krishna Bharat, Sanjay Ghemawat, Urs Hoelzle

Inventors: Krishna Bharat, Sanjay Ghemawat, Urs Hoelzle

IPC classification: G06F17/30

CPC classification: G06F17/30867, G06F17/30663, Y10S707/99931, Y10S707/99933

Abstract: A search engine for searching a corpus improves the relevancy of the results by classifying multiple terms in a search query as a single semantic unit. A semantic unit locator of the search engine generates a subset of documents that are generally relevant to the query based on the individual terms within the query. Combinations of search terms that define potential semantic units from the query are then evaluated against the subset of documents to determine which combinations of search terms should be classified as a semantic unit. The resultant semantic units are used to refine the results of the search.
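
The two-stage idea in this abstract, first selecting documents by individual terms and then testing term combinations against that subset, can be sketched as follows. The scoring rule is an assumption: an adjacent pair of query terms counts as a semantic unit when it appears as a phrase in at least `min_fraction` of the subset. The function name and the threshold are hypothetical.

```python
from typing import List


def semantic_units(query: str, documents: List[str], min_fraction: float = 0.5) -> List[str]:
    terms = query.lower().split()
    # Normalize whitespace and pad so phrases match on whole tokens only.
    docs = [" " + " ".join(d.lower().split()) + " " for d in documents]
    # Stage 1: documents relevant to the individual terms
    # (here simply: documents containing every query term).
    subset = [d for d in docs if all(f" {t} " in d for t in terms)]
    if not subset:
        return []
    # Stage 2: evaluate each adjacent term pair against the subset.
    units: List[str] = []
    for first, second in zip(terms, terms[1:]):
        phrase = f" {first} {second} "
        hits = sum(1 for d in subset if phrase in d)
        if hits / len(subset) >= min_fraction:
            units.append(f"{first} {second}")
    return units
```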

39.

Granted patent
Systems and methods for searching using queries written in a different character-set and/or language from the target pages [in force]

Publication No.: US08706747B2

Publication date: 2014-04-22

Application No.: US10676724

Filing date: 2003-09-30

Applicants: Vibhu Mittal, Jay M. Ponte, Mehran Sahami, Sanjay Ghemawat, John A. Bauer

Inventors: Vibhu Mittal, Jay M. Ponte, Mehran Sahami, Sanjay Ghemawat, John A. Bauer

IPC classification: G06F7/00, G06F17/30, G06F17/21

CPC classification: G06F17/3043, G06F3/0237, G06F17/22, G06F17/27, G06F17/30427, G06F17/3066, G06F17/30893

Abstract: Methods and apparatus consistent with the invention allow a user to submit an ambiguous search query and to receive relevant search results. Queries can be expressed using character sets and/or languages that are different from the character set and/or language of at least some of the data that is to be searched. A translation between these character sets and/or languages can be performed by examining the use of terms in aligned text. Probabilities can be associated with each possible translation. Refinements can be made to these probabilities by examining user interactions with the search results.

40.

Granted patent
System and method for large-scale data processing using an application-independent framework [in force]

Publication No.: US08612510B2

Publication date: 2013-12-17

Application No.: US12686292

Filing date: 2010-01-12

Applicants: Jeffrey Dean, Sanjay Ghemawat

Inventors: Jeffrey Dean, Sanjay Ghemawat

IPC classification: G06F15/16

CPC classification: G06F17/30339, G06F9/4881, G06F9/54, G06F17/30377, G06F17/30445

Abstract: A large-scale data processing system and method for processing data in a distributed and parallel processing environment. The system includes an application-independent framework for processing data having a plurality of application-independent map modules and reduce modules. These application-independent modules use application-independent operators to automatically handle parallelization of computations across the distributed and parallel processing environment when performing user-specified data processing operations. The system also includes a plurality of user-specified, application-specific operators, for use with the application-independent framework to perform a user-specified data processing operation on a user-specified set of input files. The application-specific operators include: a map operator and a reduce operator. The map operator is applied by the application-independent map modules to input data in the user-specified set of input files to produce intermediate data values. The reduce operator is applied by the application-independent reduce modules to process the intermediate data values to produce final output data.
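
The split between an application-independent framework and user-supplied map and reduce operators can be illustrated with a single-process sketch. It runs sequentially and does not model the distribution or parallelization the abstract describes; `run_mapreduce` and the word-count operators are hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def run_mapreduce(
    inputs: Iterable[str],
    map_fn: Callable[[str], Iterable[Tuple[str, int]]],
    reduce_fn: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Application-independent part: apply map, group by key, apply reduce."""
    intermediate: Dict[str, List[int]] = defaultdict(list)
    for record in inputs:
        for key, value in map_fn(record):
            intermediate[key].append(value)
    return {key: reduce_fn(key, values) for key, values in sorted(intermediate.items())}


# Application-specific operators: a word count.
def word_count_map(line: str) -> Iterable[Tuple[str, int]]:
    for word in line.split():
        yield word.lower(), 1


def word_count_reduce(word: str, counts: List[int]) -> int:
    return sum(counts)
```

Swapping in a different pair of operators changes the computation without touching `run_mapreduce`, which is the separation the abstract draws.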
