-
Publication No.: US12072839B2
Publication date: 2024-08-27
Application No.: US17544705
Filing date: 2021-12-07
Applicant: GOOGLE LLC
Inventor: Weize Kong , Mingyang Zhang , Michael Bendersky , Marc Alexander Najork , Mike Colagrosso , Brandon Vargo , Remy Burger
CPC classification number: G06F16/122 , G06F16/18
Abstract: Techniques are described herein for enabling more computationally efficient organization of files within a cloud storage system. A method includes: receiving information identifying a document and a set of folders; for each folder in the set of folders, using a trained model to predict a similarity measure between the folder and the document; for each folder in the set of folders, determining a score for the folder based on the predicted similarity measure for the folder; selecting a candidate folder from the set of folders using the scores of the folders within the set of folders; and providing, on a user interface, a selectable option to associate the document with the candidate folder.
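The scoring and selection steps named in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: `jaccard_similarity` is a hypothetical stand-in for the trained model, the score is taken to be the similarity itself, and `min_score` is an invented cutoff for deciding whether to surface a suggestion at all.

```python
from typing import Callable, Dict, Optional


def jaccard_similarity(folder_text: str, document_text: str) -> float:
    """Hypothetical stand-in for the trained model: token-set overlap in [0, 1]."""
    a = set(folder_text.lower().split())
    b = set(document_text.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_candidate_folder(
    document_text: str,
    folders: Dict[str, str],
    similarity: Callable[[str, str], float] = jaccard_similarity,
    min_score: float = 0.0,
) -> Optional[str]:
    """Score every folder against the document and return the best one.

    Assumption: the score is the predicted similarity unchanged. Returns None
    when there are no folders or the best score does not exceed min_score.
    """
    scores = {name: similarity(text, document_text) for name, text in folders.items()}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > min_score else None
```

A caller would show the returned folder name as the selectable option in the user interface, or show nothing when the result is `None`.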
-
Publication No.: US12236322B2
Publication date: 2025-02-25
Application No.: US18074774
Filing date: 2022-12-05
Applicant: GOOGLE LLC
Inventor: Spurthi Amba Hombaiah , Vladimir Ofitserov , Mike Bendersky , Marc Alexander Najork
IPC: G06N20/00 , G06F16/9038 , G06N5/04
Abstract: Implementations relate to training a model that can be used to process values for defined features, where the values are specific to a user account, to generate a predicted user measure that reflects both popularity and quality of the user account. The model is trained based on losses that are each generated as a function of both a corresponding generated popularity measure and a corresponding generated quality measure of a corresponding training instance. Accordingly, the model can be trained to generate, based on values for a given user account, a single measure that reflects both quality and popularity of the given user account. Implementations are additionally or alternatively directed to utilizing such predicted user measures to restrict provisioning of content items that are from user accounts having respective predicted user measures that fail to satisfy a threshold.
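As a rough illustration of a loss that depends on both a popularity measure and a quality measure, the sketch below blends the two into one target and uses squared error. The blending weight `alpha`, the squared-error form, and the function names are assumptions made for this example; the abstract does not specify how the two measures are combined.

```python
from typing import Dict, List


def combined_target(popularity: float, quality: float, alpha: float = 0.5) -> float:
    """Assumed combination: weighted average of the two generated measures."""
    return alpha * popularity + (1.0 - alpha) * quality


def instance_loss(predicted: float, popularity: float, quality: float,
                  alpha: float = 0.5) -> float:
    """Squared error between the model output and the blended target."""
    return (predicted - combined_target(popularity, quality, alpha)) ** 2


def filter_content(items: List[dict], predicted_measure: Dict[str, float],
                   threshold: float) -> List[dict]:
    """Keep only items whose account's predicted measure satisfies the threshold."""
    return [item for item in items if predicted_measure[item["account"]] >= threshold]
```

The last function corresponds to the restriction step: content from accounts whose predicted measure falls below the threshold is not provisioned.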
-
Publication No.: US20230401382A1
Publication date: 2023-12-14
Application No.: US18249275
Filing date: 2021-10-19
Applicant: Google LLC
Inventor: Spurthi Amba Hombaiah , Mingyang Zhang , Michael Bendersky , Tao Chen , Marc Alexander Najork
IPC: G06F40/242 , G06F40/40 , G06F40/30 , G06F40/284
CPC classification number: G06F40/242 , G06F40/40 , G06F40/30 , G06F40/284
Abstract: Provided are systems and methods for incremental training of machine learning models to adapt to changes in an underlying data distribution. One example setting in which the techniques described herein may be beneficial is for incrementally training natural language models to enable the models to have or adapt to a dynamically changing vocabulary. Incremental training is provided as a feasible and inexpensive way of adapting machine learning models to evolving vocabulary without having to retrain them from scratch.
-
Publication No.: US11238058B2
Publication date: 2022-02-01
Application No.: US17086564
Filing date: 2020-11-02
Applicant: Google LLC
Inventor: Marc Alexander Najork , Sujith Ravi , Michael Bendersky , Peter Shao-sen Young , Timothy Youngjin Sohn , Mingyang Zhang , Thomas Nelson , Xuanhui Wang
IPC: G06F16/248 , G06F16/2455 , G06F16/951 , G06F16/38
Abstract: Methods, systems, apparatus, including computer programs encoded on computer storage medium, to facilitate identification of additional trigger-terms for a structured information card. In one aspect, the method includes actions of accessing data associated with a template for presenting structured information, wherein the accessed data references (i) a label term and (ii) a value. Other actions may include obtaining a candidate label term, identifying one or more entities that are associated with the label term, identifying one or more of the entities that are associated with the candidate label term, and for each particular entity of the one or more entities that are associated with the candidate label term, associating, with the candidate label term, (i) a label term that is associated with the particular entity, and (ii) the value associated with the label term.
-
Publication No.: US20230177004A1
Publication date: 2023-06-08
Application No.: US17544705
Filing date: 2021-12-07
Applicant: GOOGLE LLC
Inventor: Weize Kong , Mingyang Zhang , Michael Bendersky , Marc Alexander Najork , Mike Colagrosso , Brandon Vargo , Remy Burger
CPC classification number: G06F16/122 , G06F16/18
Abstract: Techniques are described herein for enabling more computationally efficient organization of files within a cloud storage system. A method includes: receiving information identifying a document and a set of folders; for each folder in the set of folders, using a trained model to predict a similarity measure between the folder and the document; for each folder in the set of folders, determining a score for the folder based on the predicted similarity measure for the folder; selecting a candidate folder from the set of folders using the scores of the folders within the set of folders; and providing, on a user interface, a selectable option to associate the document with the candidate folder.
-
Publication No.: US20230094198A1
Publication date: 2023-03-30
Application No.: US18074774
Filing date: 2022-12-05
Applicant: GOOGLE LLC
Inventor: Spurthi Amba Hombaiah , Vladimir Ofitserov , Mike Bendersky , Marc Alexander Najork
IPC: G06N20/00 , G06F16/9038 , G06N5/04
Abstract: Implementations relate to training a model that can be used to process values for defined features, where the values are specific to a user account, to generate a predicted user measure that reflects both popularity and quality of the user account. The model is trained based on losses that are each generated as a function of both a corresponding generated popularity measure and a corresponding generated quality measure of a corresponding training instance. Accordingly, the model can be trained to generate, based on values for a given user account, a single measure that reflects both quality and popularity of the given user account. Implementations are additionally or alternatively directed to utilizing such predicted user measures to restrict provisioning of content items that are from user accounts having respective predicted user measures that fail to satisfy a threshold.
-
Publication No.: US11551150B2
Publication date: 2023-01-10
Application No.: US16946779
Filing date: 2020-07-06
Applicant: Google LLC
Inventor: Spurthi Amba Hombaiah , Vladimir Ofitserov , Mike Bendersky , Marc Alexander Najork
IPC: G06N20/00 , G06F16/9038 , G06N5/04
Abstract: Implementations relate to training a model that can be used to process values for defined features, where the values are specific to a user account, to generate a predicted user measure that reflects both popularity and quality of the user account. The model is trained based on losses that are each generated as a function of both a corresponding generated popularity measure and a corresponding generated quality measure of a corresponding training instance. Accordingly, the model can be trained to generate, based on values for a given user account, a single measure that reflects both quality and popularity of the given user account. Implementations are additionally or alternatively directed to utilizing such predicted user measures to restrict provisioning of content items that are from user accounts having respective predicted user measures that fail to satisfy a threshold.
-
Publication No.: US20210374345A1
Publication date: 2021-12-02
Application No.: US17336093
Filing date: 2021-06-01
Applicant: Google LLC
Inventor: Karthik Raman , Liu Yang , Mike Bendersky , Jiecao Chen , Marc Alexander Najork
IPC: G06F40/284 , G06N3/08 , G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a tuple of respective input sequences to generate an output. In one aspect, one of the systems includes a neural network comprising a plurality of encoder neural networks and a head neural network, each encoder neural network configured to: receive a respective input sequence from the tuple; process the respective input sequence using one or more encoder network layers to generate an encoded representation comprising a sequence of tokens; and process each of some or all of the tokens in the sequence of tokens using a projection layer to generate a lower-dimensional representation, and the head neural network configured to: receive lower-dimensional representations of a respective proper subset of the sequence of tokens generated by the encoder neural network; and process the lower-dimensional representations to generate the output.
-
Publication No.: US20210125108A1
Publication date: 2021-04-29
Application No.: US15333086
Filing date: 2016-10-24
Applicant: Google LLC
Inventor: Donald Arthur Metzler, JR. , Xuanhui Wang , Marc Alexander Najork , Michael Bendersky
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a ranking machine learning model. In one aspect, a method includes the actions of receiving training data for a ranking machine learning model, the training data including training examples, and each training example including data identifying: a search query, result documents from a result list for the search query, and a result document that was selected by a user from the result list, receiving position data for each training example in the training data, the position data identifying a respective position of the selected result document in the result list for the search query in the training example; determining, for each training example in the training data, a respective selection bias value; and determining a respective importance value for each training example from the selection bias value for the training example, the importance value.
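The abstract stops before saying how the importance value is derived from the selection bias value. One common choice, used here purely as an assumption, is inverse propensity weighting: a click at a position that users rarely examine counts for more. The bias table below is hypothetical data, not taken from the source.

```python
from typing import Dict, List


def importance_weights(clicked_positions: List[int],
                       position_bias: Dict[int, float]) -> List[float]:
    """Return one weight per training example as 1 / selection bias.

    clicked_positions[i] is the rank of the selected result in example i;
    position_bias maps a rank to an assumed selection bias value in (0, 1].
    """
    return [1.0 / position_bias[pos] for pos in clicked_positions]


# Hypothetical bias values: lower ranks are selected less often.
bias = {1: 1.0, 2: 0.5, 3: 0.25}
weights = importance_weights([1, 3, 2], bias)
```

The resulting weights would typically multiply each example's loss term when the ranking model is trained.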
-
Publication No.: US20250068679A1
Publication date: 2025-02-27
Application No.: US18936579
Filing date: 2024-11-04
Applicant: GOOGLE LLC
Inventor: Michael Bendersky , Przemyslaw Gajda , Sergey Novikov , Marc Alexander Najork , Shuguang Han
IPC: G06F16/951 , G06F18/214 , G06Q30/0207 , G06Q30/0601
Abstract: Techniques of generating recrawl policies for commercial offer pages include generating a multiple strategy approach using a number of different strategies. In some implementations, each strategy is an arm of a K-armed adversarial bandits algorithm with reinforcement learning. Moreover, in some implementations, the multiple strategy approach also uses a machine learning algorithm to estimate parameters such as a click rate, impression rate, and likelihood of price change, i.e., change rate, which was assumed known in the conventional approaches.
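To make the K-armed adversarial bandit idea concrete, here is a minimal sketch of Exp3, a standard algorithm for that setting. Treating each arm as one recrawl strategy follows the abstract; the choice of Exp3 specifically, the exploration rate `gamma`, and the reward scale of [0, 1] are assumptions, since the abstract does not name the algorithm variant.

```python
import math
import random
from typing import List, Optional


class Exp3:
    """Exp3 for adversarial bandits; rewards are assumed to lie in [0, 1]."""

    def __init__(self, n_arms: int, gamma: float = 0.1,
                 seed: Optional[int] = None) -> None:
        self.n_arms = n_arms
        self.gamma = gamma                # exploration rate in (0, 1]
        self.weights = [1.0] * n_arms     # one weight per strategy (arm)
        self.rng = random.Random(seed)

    def probabilities(self) -> List[float]:
        """Mix the weight-proportional distribution with uniform exploration."""
        total = sum(self.weights)
        return [(1.0 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in self.weights]

    def select_arm(self) -> int:
        """Sample an arm index according to the current probabilities."""
        return self.rng.choices(range(self.n_arms), weights=self.probabilities())[0]

    def update(self, arm: int, reward: float) -> None:
        """Reweight the played arm using an importance-weighted reward estimate."""
        estimated = reward / self.probabilities()[arm]
        self.weights[arm] *= math.exp(self.gamma * estimated / self.n_arms)
```

In use, each round would call `select_arm()` to pick a strategy, apply it to choose pages to recrawl, and feed the observed reward back through `update()`. For long runs the weights should be renormalized periodically to avoid overflow, which this sketch omits.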
-